Coherency directory entry allocation based on eviction costs

    公开(公告)号:US10705958B2

    公开(公告)日:2020-07-07

    申请号:US16108696

    申请日:2018-08-22

    Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.

    Setting operating points for circuits in an integrated circuit chip

    公开(公告)号:US10389251B2

    公开(公告)日:2019-08-20

    申请号:US16130136

    申请日:2018-09-13

    Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.

    DYNAMIC MEMORY POWER CAPPING WITH CRITICALITY AWARENESS

    公开(公告)号:US20190065243A1

    公开(公告)日:2019-02-28

    申请号:US15269341

    申请日:2016-09-19

    Inventor: Yasuko Eckert

    Abstract: Systems, apparatuses, and methods for reducing memory power consumption without substantial performance impact by selectively delaying non-critical memory requests are disclosed. A system management unit transfers an amount of power allocated from a memory subsystem to other component(s) responsive to detecting a first condition. In one embodiment, the first condition is detecting one or more processors having tasks to execute. In response to the system management unit transferring the amount of power from the memory subsystem to one or more processors, a memory controller delays non-critical memory requests while performing critical memory requests to memory.

    Proactive cache coherence
    95.
    发明授权

    公开(公告)号:US10162757B2

    公开(公告)日:2018-12-25

    申请号:US15370734

    申请日:2016-12-06

    Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node. Predictions from the predictors are used to initiate coherence operations to speculatively request promotion or demotion of coherence permissions.

    HIERARCHICAL REGISTER FILE AT A GRAPHICS PROCESSING UNIT

    公开(公告)号:US20170278213A1

    公开(公告)日:2017-09-28

    申请号:US15079543

    申请日:2016-03-24

    Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to be become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.

    Virtual memory mapping for improved DRAM page locality

    公开(公告)号:US09710392B2

    公开(公告)日:2017-07-18

    申请号:US14460550

    申请日:2014-08-15

    Abstract: Embodiments are described for methods and systems for mapping virtual memory pages to physical memory pages by analyzing a sequence of memory-bound accesses to the virtual memory pages, determining a degree of contiguity between the accessed virtual memory pages, and mapping sets of the accessed virtual memory pages to respective single physical memory pages. Embodiments are also described for a method for increasing locality of memory accesses to DRAM in virtual memory systems by analyzing a pattern of virtual memory accesses to identify contiguity of accessed virtual memory pages, predicting contiguity of the accessed virtual memory pages based on the pattern, and mapping the identified and predicted contiguous virtual memory pages to respective single physical memory pages.

Patent Agency Ranking