Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Kevin M. Lepak"

21.

发明授权
Region based split-directory scheme to adapt to large cache sizes 有权

公开(公告)号：US10705959B2

公开(公告)日：2020-07-07

申请号：US16119438

申请日：2018-08-31

Applicant: Advanced Micro Devices, Inc.

Inventor： Vydhyanathan Kalyanasundharam , Kevin M. Lepak , Amit P. Apte , Ganesh Balakrishnan

IPC: G06F12/0817

Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.

22.

发明授权
Reducing chiplet wakeup latency 有权

公开(公告)号：US10656696B1

公开(公告)日：2020-05-19

申请号：US15907719

申请日：2018-02-28

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor： Benjamin Tsien , Michael J. Tresidder , Ivan Yanfeng Wang , Kevin M. Lepak , Ann Ling , Richard M. Born , John P. Petry , Bryan P. Broussard , Eric Christopher Morton

IPC: G06F1/32 , G06F1/3206 , G06F1/3287 , G06F1/3234

Abstract: Systems, apparatuses, and methods for reducing chiplet interrupt latency are disclosed. A system includes one or more processing nodes, one or more memory devices, a communication fabric coupled to the processing unit(s) and memory device(s) via link interfaces, and a power management unit. The power management unit manages the power states of the various components and the link interfaces of the system. If the power management unit detects a request to wake up a given component, and the link interface to the given component is powered down, then the power management unit sends an out-of-band signal to wake up the given component in parallel with powering up the link interface. Also, when multiple link interfaces need to be powered up, the power management unit powers up the multiple link interfaces in an order which complies with voltage regulator load-step requirements while minimizing the latency of pending operations.

23.

发明授权
Tag and data organization in large memory caches 有权

公开(公告)号：US10366008B2

公开(公告)日：2019-07-30

申请号：US15376275

申请日：2016-12-12

Applicant: Advanced Micro Devices, Inc.

Inventor： Ganesh Balakrishnan , Vydhyanathan Kalyanasundharam , Kevin M. Lepak

IPC: G06F12/08 , G06F12/0853 , G06F12/0811 , G06F12/084

Abstract: A data processing system includes a processor and a cache controller coupled to the processor, and adapted to be coupled to a memory. The cache controller uses the memory to form a pseudo direct mapped cache having a plurality of groups of pages. The memory forms a first number of selected pages, including a first page for storing a plurality of sets of tags and a plurality of remaining pages for storing data. Each tag, of the plurality of sets of tags, stores tags for respective entries in a corresponding one of the plurality of remaining pages.

24.

发明申请
TAG ACCELERATOR FOR LOW LATENCY DRAM CACHE 审中-公开

公开(公告)号：US20190196974A1

公开(公告)日：2019-06-27

申请号：US15855838

申请日：2017-12-27

Applicant: Advanced Micro Devices, Inc.

Inventor： Vydhyanathan Kalyanasundharam , Kevin M. Lepak , Ganesh Balakrishnan , Ravindra N. Bhargava

IPC: G06F12/0897 , G06F12/121

Abstract: Systems, apparatuses, and methods for implementing a tag accelerator cache are disclosed. A system includes at least a data cache and a control unit coupled to the data cache via a memory controller. The control unit includes a tag accelerator cache (TAC) for caching tag blocks fetched from the data cache. The data cache is organized such that multiple tags are retrieved in a single access. This allows hiding the tag latency penalty for future accesses to neighboring tags and improves cache bandwidth. When a tag block is fetched from the data cache, the tag block is cached in the TAC. Memory requests received by the control unit first lookup the TAC before being forwarded to the data cache. Due to the presence of spatial locality in applications, the TAC can filter out a large percentage of tag accesses to the data cache, resulting in latency and bandwidth savings.

25.

发明申请
REGION BASED DIRECTORY SCHEME TO ADAPT TO LARGE CACHE SIZES 审中-公开

公开(公告)号：US20190188137A1

公开(公告)日：2019-06-20

申请号：US15846008

申请日：2017-12-18

Applicant: Advanced Micro Devices, Inc.

Inventor： Vydhyanathan Kalyanasundharam , Kevin M. Lepak , Amit P. Apte , Ganesh Balakrishnan , Eric Christopher Morton , Elizabeth M. Cooper , Ravindra N. Bhargava

IPC: G06F12/0817 , G06F12/128 , G06F12/0811 , G06F12/0831 , G06F12/0871

CPC classification number: G06F12/0817 , G06F12/0811 , G06F12/0831 , G06F12/0871 , G06F12/128 , G06F2212/283 , G06F2212/604 , G06F2212/621

Abstract: Systems, apparatuses, and methods for maintaining a region-based cache directory are disclosed. A system includes multiple processing nodes, with each processing node including a cache subsystem. The system also includes a cache directory to help manage cache coherency among the different cache subsystems of the system. In order to reduce the number of entries in the cache directory, the cache directory tracks coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Accordingly, the system includes a region-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the system. The cache directory includes a reference count in each entry to track the aggregate number of cache lines that are cached per region. If a reference count of a given entry goes to zero, the cache directory reclaims the given entry.

26.

发明授权
Speculative retirement of post-lock instructions 有权

公开(公告)号：US10095637B2

公开(公告)日：2018-10-09

申请号：US15267094

申请日：2016-09-15

Applicant: Advanced Micro Devices, Inc.

Inventor： Gregory W. Smaus , John M. King , Michael D. Achenbach , Kevin M. Lepak , Matthew A. Rafacz , Noah Bamford

IPC: G06F12/00 , G06F13/00 , G06F13/28 , G06F12/14 , G06F9/52 , G06F9/30

Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.

27.

发明申请
SPECULATIVE RETIREMENT OF POST-LOCK INSTRUCTIONS 审中-公开

公开(公告)号：US20180074977A1

公开(公告)日：2018-03-15

申请号：US15267094

申请日：2016-09-15

Applicant: Advanced Micro Devices, Inc.

Inventor： Gregory W. Smaus , John M. King , Michael D. Achenbach , Kevin M. Lepak , Matthew A. Rafacz , Noah Bamford

IPC: G06F12/14 , G06F9/52 , G06F9/30

CPC classification number: G06F12/1466 , G06F9/3004 , G06F9/30043 , G06F9/30087 , G06F9/3834 , G06F9/3859 , G06F9/3863 , G06F9/528 , G06F2212/1052

Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.

28.

发明申请
METHOD AND APPARATUS FOR ACCELERATED SHARED DATA MIGRATION 审中-公开
Title translation: 用于加速共享数据迁移的方法和装置

公开(公告)号：US20140229678A1

公开(公告)日：2014-08-14

申请号：US14258289

申请日：2014-04-22

Applicant: Advanced Micro Devices, Inc.

Inventor： Kevin M. Lepak , Vydhyanathan Kalyanasundharam , William A. Hughes , Benjamin Tsien , Greggory D. Donley

IPC: G06F12/08

CPC classification number: G06F12/084 , G06F12/0822 , G06F12/0833 , Y02D10/13

Abstract: A method and apparatus for accelerated shared data migration between cores, Using an Always Migrate protocol, when a migratory probe hits a directory entry in either modified or owned state, the entry is transitioned to an owned state, and a source done command is sent without sending cache block ownership or state information to the directory.

Abstract translation: 一种用于在内核之间加速共享数据迁移的方法和装置，使用始终迁移协议，当迁移探测器以修改或拥有状态访问目录条目时，该条目将转换为所有状态，并且发送源完成命令而不发送源完成命令将缓存块所有权或状态信息发送到目录。

29.

发明授权
Methods and apparatus for offloading tiered memories management 有权

公开(公告)号：US12131063B2

公开(公告)日：2024-10-29

申请号：US17219138

申请日：2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor： Kevin M. Lepak

IPC: G06F3/00 , G06F3/06 , G06F12/0882 , G06F12/1009

CPC classification number: G06F3/0659 , G06F3/0619 , G06F3/0647 , G06F3/0679 , G06F12/0882 , G06F12/1009

Abstract: Methods and apparatus offload tiered memories management. The method includes obtaining a pointer to a stored memory management structure associated with tiered memories, where the memory management structure includes a plurality of memory management entries and each memory management entry of the plurality of memory management entries includes information for a memory section in one of the tiered memories. In some instances, the method includes scanning at least a part of the plurality of memory management entries. In certain instances, the method includes generating a memory profile list, where the memory profile list includes a plurality of profile entries and each profile entry of the plurality of profile entries corresponding to a scanned memory management entry in the memory management structure.

30.

发明公开
TIERED MEMORY CACHING 审中-公开

公开(公告)号：US20240220415A1

公开(公告)日：2024-07-04

申请号：US18091140

申请日：2022-12-29

Applicant: Advanced Micro Devices, Inc.

Inventor： Vydhyanathan Kalyanasundharam , Ganesh Balakrishnan , Kevin M. Lepak , Amit P. Apte

IPC: G06F12/0897

CPC classification number: G06F12/0897 , G06F2212/1016

Abstract: The disclosed computer-implemented method includes locating, from a processor storage, a partial tag corresponding to a memory request for a line stored in a memory having a tiered memory cache and in response to a partial tag hit for the memory request, locating, from a partition of the tiered memory cache indicated by the partial tag, a full tag for the line. The method also includes fetching, in response to a full tag hit, the requested line from the partition of the tiered memory cache. Various other methods, systems, and computer-readable media are also disclosed.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification