-
Publication No.: US20200257623A1
Publication Date: 2020-08-13
Application No.: US16274146
Application Date: 2019-02-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Jieming Yin , Yasuko Eckert , Matthew R. Poremba , Steven E. Raasch , Doug Hunt
IPC: G06F12/0802
Abstract: An electronic device handles memory access requests for data in a memory. The electronic device includes a memory controller for the memory, a last-level cache memory, a request generator, and a predictor. The predictor determines a likelihood that a cache memory access request for data at a given address will hit in the last-level cache memory. Based on the likelihood, the predictor determines: whether a memory access request is to be sent by the request generator to the memory controller for the data in parallel with the cache memory access request being resolved in the last-level cache memory, and, when the memory access request is to be sent, a type of memory access request that is to be sent. When the memory access request is to be sent, the predictor causes the request generator to send a memory request of the type to the memory controller.
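The decision flow in this abstract can be illustrated with a minimal Python sketch. It assumes a table of 2-bit saturating counters as the likelihood estimate and two hypothetical request types (LIGHT and FULL); the abstract does not specify the predictor structure, the thresholds, or the request types, so all of those names and values are illustrative only.

```python
from enum import Enum


class Action(Enum):
    NONE = "no parallel memory request"
    LIGHT = "hypothetical low-cost request type"
    FULL = "hypothetical full data request type"


class HitPredictor:
    """Assumed design: one 2-bit saturating counter per table slot."""

    def __init__(self, table_size=1024, line_bytes=64):
        self.table_size = table_size
        self.line_bytes = line_bytes
        self.counters = [2] * table_size  # range 0..3, start weakly "hit"

    def _index(self, address):
        return (address // self.line_bytes) % self.table_size

    def hit_likelihood(self, address):
        # Scale the counter to a value between 0.0 and 1.0.
        return self.counters[self._index(address)] / 3

    def decide(self, address, low=0.3, high=0.9):
        # Thresholds are assumptions, not values from the source.
        p = self.hit_likelihood(address)
        if p >= high:
            return Action.NONE   # likely LLC hit: skip the memory request
        if p >= low:
            return Action.LIGHT  # uncertain: send the cheaper request type
        return Action.FULL       # likely LLC miss: fetch the data in parallel

    def update(self, address, was_hit):
        # Train on the actual outcome of the LLC lookup.
        i = self._index(address)
        if was_hit:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

In this sketch a request generator would call `decide(address)` when the cache access is issued and `update(address, was_hit)` once the last-level cache lookup resolves.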
-
Publication No.: US10705958B2
Publication Date: 2020-07-07
Application No.: US16108696
Application Date: 2018-08-22
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Michael W. Boyer , Gabriel H. Loh , Yasuko Eckert , William L. Walker
IPC: G06F12/12 , G06F12/0817 , G06F12/0842 , G06F11/30
Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.
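One way to picture the region sizing is a proportional split of directory entries by monitored recall cost. The sketch below is an assumption made for illustration: the abstract does not state the allocation formula, and the function name `rebalance`, the `min_entries` floor, and the proportional rule are all hypothetical.

```python
def rebalance(total_entries, recall_costs, min_entries=1):
    """Split directory entries among cores in proportion to recall cost.

    recall_costs maps a core name to a monitored cost (for example, cache
    evictions caused by reallocating that core's directory entries).
    """
    cores = list(recall_costs)
    spare = total_entries - min_entries * len(cores)
    if spare < 0:
        raise ValueError("not enough entries for the per-core minimum")

    total_cost = sum(recall_costs.values())
    if total_cost == 0:
        shares = {c: spare // len(cores) for c in cores}
    else:
        shares = {c: spare * recall_costs[c] // total_cost for c in cores}

    alloc = {c: min_entries + shares[c] for c in cores}

    # Integer division can leave a few entries over; give them to the
    # cores with the highest recall cost.
    leftover = total_entries - sum(alloc.values())
    by_cost = sorted(cores, key=lambda c: recall_costs[c], reverse=True)
    for c in by_cost[:leftover]:
        alloc[c] += 1
    return alloc
```

For example, `rebalance(100, {"core0": 30, "core1": 10}, min_entries=10)` assigns 70 entries to `core0` and 30 to `core1`.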
-
Publication No.: US10389251B2
Publication Date: 2019-08-20
Application No.: US16130136
Application Date: 2018-09-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Wei Huang , Yasuko Eckert , Xudong An , Muhammad Shoaib Bin Altaf , Jieming Yin
Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
-
Publication No.: US20190065243A1
Publication Date: 2019-02-28
Application No.: US15269341
Application Date: 2016-09-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert
Abstract: Systems, apparatuses, and methods for reducing memory power consumption without substantial performance impact by selectively delaying non-critical memory requests are disclosed. A system management unit transfers an amount of power allocated from a memory subsystem to other component(s) responsive to detecting a first condition. In one embodiment, the first condition is detecting one or more processors having tasks to execute. In response to the system management unit transferring the amount of power from the memory subsystem to one or more processors, a memory controller delays non-critical memory requests while performing critical memory requests to memory.
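The selective delay described here can be sketched as a two-queue scheduler. This is a minimal illustration under assumptions: the class and field names are hypothetical, and the abstract does not say how requests are classified as critical or how long a non-critical request may be held.

```python
from collections import deque


class MemoryController:
    """Toy scheduler: critical requests always issue, others may wait."""

    def __init__(self):
        self.critical = deque()
        self.non_critical = deque()
        # Set when power has been shifted from memory to the processors.
        self.power_reduced = False

    def enqueue(self, request, critical):
        queue = self.critical if critical else self.non_critical
        queue.append(request)

    def next_request(self):
        # Critical requests are performed regardless of the power state.
        if self.critical:
            return self.critical.popleft()
        # Non-critical requests are held back while memory power is reduced.
        if self.non_critical and not self.power_reduced:
            return self.non_critical.popleft()
        return None
```

A fuller model would probably bound how long a non-critical request can be deferred; that detail is omitted because the abstract does not describe it.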
-
Publication No.: US10162757B2
Publication Date: 2018-12-25
Application No.: US15370734
Application Date: 2016-12-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan Jayasena , Yasuko Eckert
IPC: G06F12/08 , G06F12/0815 , G06F12/084
Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node. Predictions from the predictors are used to initiate coherence operations to speculatively request promotion or demotion of coherence permissions.
-
Publication No.: US20180113815A1
Publication Date: 2018-04-26
Application No.: US15331099
Application Date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert , Bo Wu , Nuwan Jayasena , Dong Ping Zhang
IPC: G06F12/126 , G06F12/0808 , G06F12/0891
CPC classification number: G06F12/126 , G06F1/3275 , G06F12/127 , G06F2212/1016 , Y02D10/13 , Y02D10/14
Abstract: A processing system selects data for eviction at a cache based at least in part on a penalty associated with accessing the data at the memory location from which the data was transferred to the cache. The penalty reflects the amount of time and resources expended in copying the data from memory to the cache. By assigning priorities to the data stored at a cache based on the penalty incurred in accessing the data at the memory location from which it was transferred to the cache and selecting data for eviction from the cache based in part on the assigned priority, the processing system can preferentially select for eviction from the cache data that was transferred from a local memory to the cache rather than data that was transferred from a remote memory to the cache.
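The penalty-based victim choice can be sketched as follows. The relative penalty values, the `Line` record, and the least-recently-used tie-break are assumptions for illustration; the abstract only says that priority depends in part on the penalty of re-accessing the data's source memory.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: int
    source: str     # "local" or "remote" memory the data came from
    last_used: int  # smaller value means less recently used


# Hypothetical relative refill penalties, not values from the source.
PENALTY = {"local": 1, "remote": 4}


def choose_victim(cache_set):
    """Pick the line that is cheapest to refetch; break ties by LRU."""
    return min(cache_set, key=lambda line: (PENALTY[line.source], line.last_used))
```

With these assumed penalties, a line filled from local memory is evicted before an older line filled from remote memory.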
-
Publication No.: US09916265B2
Publication Date: 2018-03-13
Application No.: US14569825
Application Date: 2014-12-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov , Gabriel H. Loh , Yasuko Eckert
CPC classification number: G06F13/1694 , G06F11/3414 , G06F12/023 , G06F13/161 , G06F2212/1044 , Y02D10/14
Abstract: A system includes a plurality of memory classes and a set of one or more processing units coupled to the plurality of memory classes. The system further includes a data migration controller to select a traffic rate as a maximum traffic rate for transferring data between the plurality of memory classes based on a net benefit metric associated with the traffic rate, and to enforce the maximum traffic rate for transferring data between the plurality of memory classes.
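The two steps named in this abstract (selecting a maximum traffic rate by net benefit, then enforcing it) might look like the sketch below. It assumes net benefit is simply benefit minus cost and that enforcement is a per-interval byte budget; both are illustrative choices, and the function and class names are hypothetical.

```python
def select_max_rate(samples):
    """Return the candidate rate with the highest net benefit.

    samples maps a candidate rate to a (benefit, cost) pair measured or
    estimated for that rate; net benefit is assumed to be benefit - cost.
    """
    return max(samples, key=lambda rate: samples[rate][0] - samples[rate][1])


class RateLimiter:
    """Enforce a byte budget per interval for migration traffic."""

    def __init__(self, max_bytes_per_interval):
        self.max_bytes = max_bytes_per_interval
        self.used = 0

    def try_migrate(self, nbytes):
        # Allow the transfer only if it fits in the remaining budget.
        if self.used + nbytes <= self.max_bytes:
            self.used += nbytes
            return True
        return False

    def new_interval(self):
        self.used = 0
```

For example, with samples `{100: (50, 10), 200: (90, 30), 400: (100, 80)}` the net benefits are 40, 60, and 20, so the selected maximum rate is 200.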
-
Publication No.: US20170278213A1
Publication Date: 2017-09-28
Application No.: US15079543
Application Date: 2016-03-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert , Nuwan Jayasena
CPC classification number: G06T1/20 , G06F9/30123 , G06F9/30138 , G06F9/3851 , G06F9/3887 , G06T1/60
Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.
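The placement rule in this abstract reduces to a small mapping from wavefront status to register file level. The sketch below is illustrative only: the state strings, level labels, and function name are hypothetical, and it collapses all lower levels into one.

```python
TOP = "top level (local memory)"
LOWER = "lower level (remote memory)"


def place_wavefronts(states):
    """Map each in-flight wavefront to a register file level by status."""
    placement = {}
    for wavefront, state in states.items():
        if state in ("active", "predicted_active"):
            placement[wavefront] = TOP
        elif state == "inactive":
            placement[wavefront] = LOWER
        else:
            raise ValueError(f"unknown wavefront state: {state}")
    return placement
```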
-
Publication No.: US09710392B2
Publication Date: 2017-07-18
Application No.: US14460550
Application Date: 2014-08-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Syed Ali Jafri , Yasuko Eckert , Srilatha Manne , Mithuna S Thottethodi
IPC: G06F12/10 , G06F12/1009
CPC classification number: G06F12/1009 , G06F2212/1024 , G06F2212/654 , G06F2212/655 , G06F2212/656
Abstract: Embodiments are described for methods and systems for mapping virtual memory pages to physical memory pages by analyzing a sequence of memory-bound accesses to the virtual memory pages, determining a degree of contiguity between the accessed virtual memory pages, and mapping sets of the accessed virtual memory pages to respective single physical memory pages. Embodiments are also described for a method for increasing locality of memory accesses to DRAM in virtual memory systems by analyzing a pattern of virtual memory accesses to identify contiguity of accessed virtual memory pages, predicting contiguity of the accessed virtual memory pages based on the pattern, and mapping the identified and predicted contiguous virtual memory pages to respective single physical memory pages.
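The contiguity analysis can be sketched by grouping accessed virtual page numbers into contiguous runs and assigning each run to one physical frame. This is a simplified illustration under assumptions: it treats only exactly adjacent page numbers as contiguous, ignores frame capacity, and leaves out the prediction step; the function names are hypothetical.

```python
def contiguous_runs(vpns):
    """Group accessed virtual page numbers into runs of adjacent pages."""
    runs = []
    for vpn in sorted(set(vpns)):
        if runs and vpn == runs[-1][-1] + 1:
            runs[-1].append(vpn)  # extends the current run
        else:
            runs.append([vpn])    # starts a new run
    return runs


def map_runs(vpns):
    """Map each run to one physical frame as (frame, offset within frame)."""
    mapping = {}
    for frame, run in enumerate(contiguous_runs(vpns)):
        for offset, vpn in enumerate(run):
            mapping[vpn] = (frame, offset)
    return mapping
```

For the access sequence `[7, 3, 4, 5, 9, 3]`, pages 3, 4, and 5 form one run and share frame 0, while pages 7 and 9 each get their own frame.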
-
Publication No.: US20170139635A1
Publication Date: 2017-05-18
Application No.: US14944099
Application Date: 2015-11-17
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan S. Jayasena , Yasuko Eckert
CPC classification number: G11C5/025 , G06F12/0811 , G06F12/0815 , G06F12/0888 , G06F2212/1016 , G06F2212/251 , G06F2212/254 , G06F2212/502 , G11C5/063
Abstract: A processing system includes a plurality of processor cores formed in a first layer of an integrated circuit device and a plurality of partitions of memory formed in one or more second layers of the integrated circuit device. The one or more second layers are deployed in a stacked configuration with the first layer. Each of the partitions is associated with a subset of the processor cores that have overlapping footprints with the partitions. The processing system also includes first memory paths between the processor cores and their corresponding subsets of partitions. The processing system further includes second memory paths between the processor cores and the partitions.
-