-
Publication No.: US20220269620A1
Publication Date: 2022-08-25
Application No.: US17666974
Filing Date: 2022-02-08
Applicant: ADVANCED MICRO DEVICES, INC. , ATI Technologies ULC
Inventor: Benjamin T. SANDER , Mark Fowler , Anthony Asaro , Gongxian Jeffrey Cheng , Michael Mantor
IPC: G06F12/1027 , G06F12/0893
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
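The two logs in this abstract can be joined by software to find which virtual pages miss most often. The following is a minimal sketch of that idea, not the patented implementation; the class name `MissLogs`, the 4 KiB page size, and the page-granular mapping are assumptions made for illustration.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size; the abstract does not specify one


class MissLogs:
    """Hypothetical software model of the access log and translation log."""

    def __init__(self):
        self.access_log = []       # physical addresses of logged cache misses
        self.translation_log = {}  # physical page number -> virtual page number

    def record_translation(self, vaddr, paddr):
        # Called after a page walk: store the physical-to-virtual mapping.
        self.translation_log[paddr // PAGE_SIZE] = vaddr // PAGE_SIZE

    def record_miss(self, paddr):
        # Called on a cache miss: log the physical address of the request.
        self.access_log.append(paddr)

    def hot_virtual_pages(self, top_n=1):
        # Join the two logs: count misses per virtual page, skipping
        # physical pages that have no logged translation.
        counts = Counter()
        for paddr in self.access_log:
            vpage = self.translation_log.get(paddr // PAGE_SIZE)
            if vpage is not None:
                counts[vpage] += 1
        return counts.most_common(top_n)
```

Memory-management software could use a result like `hot_virtual_pages()` to decide which pages to migrate; that policy is outside what the abstract states.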
-
Publication No.: US20210406196A1
Publication Date: 2021-12-30
Application No.: US17471552
Filing Date: 2021-09-10
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Anthony Asaro , Kevin Normoyle , Mark Hummel
IPC: G06F12/1036 , G06F12/08 , G06F12/06 , G06F12/02 , G06F12/109
Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a plurality of processors. The method includes receiving a memory operation from a processor that references an address in a shared memory, mapping the received memory operation to at least one virtual memory pool to produce a mapping result, and providing the mapping result to the processor.
-
Publication No.: US20210011760A1
Publication Date: 2021-01-14
Application No.: US16938381
Filing Date: 2020-07-24
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Anirudh R. Acharya , Michael J. Mantor , Rex Eldon McCrary , Anthony Asaro , Jeffrey Gongxian Cheng , Mark Fowler
Abstract: Systems, apparatuses, and methods for abstracting tasks in virtual memory identifier (VMID) containers are disclosed. A processor coupled to a memory executes a plurality of concurrent tasks including a first task. Responsive to detecting one or more instructions of the first task which correspond to a first operation, the processor retrieves a first identifier (ID) which is used to uniquely identify the first task, wherein the first ID is transparent to the first task. Then, the processor maps the first ID to a second ID and/or a third ID. The processor completes the first operation by using the second ID and/or the third ID to identify the first task to at least a first data structure. In one implementation, the first operation is a memory access operation and the first data structure is a set of page tables. Also, in one implementation, the second ID identifies a first application of the first task and the third ID identifies a first operating system (OS) of the first task.
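The ID indirection in this abstract can be pictured as a small lookup step in front of the page tables. This is a rough sketch under assumed data layouts: the function name `lookup`, the tuple ordering, and the dictionary-based page tables are hypothetical and do not come from the patent.

```python
def lookup(container_id, vpage, id_map, page_tables):
    """Translate a virtual page for a task identified only by its container ID.

    id_map:      container ID -> (application ID, OS ID)   (assumed layout)
    page_tables: (OS ID, application ID) -> {virtual page: physical page}
    """
    # Map the first ID (hidden from the task) to the second and third IDs.
    app_id, os_id = id_map[container_id]
    # Use those IDs to select the page-table set, then translate.
    return page_tables[(os_id, app_id)][vpage]
```

The task itself never sees `app_id` or `os_id`, which mirrors the abstract's statement that the first ID is transparent to the task.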
-
Publication No.: US10423354B2
Publication Date: 2019-09-24
Application No.: US14863026
Filing Date: 2015-09-23
Applicant: Advanced Micro Devices, Inc. , ATI TECHNOLOGIES ULC
Inventor: Philip Rogers , Benjamin T. Sander , Anthony Asaro , Gongxian Jeffrey Cheng
IPC: G06F3/06 , G06F13/28 , G06F12/1009
Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.
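Copying only the modified portions of a block can be sketched with a per-portion dirty flag. The portion size, the function name `evict_modified`, and the use of a list of booleans as the dirty map are assumptions for illustration; the abstract does not say how modified portions are tracked.

```python
PORTION = 4  # bytes per tracked portion; an assumed granularity


def evict_modified(src, dst, dirty):
    """Copy only the dirty portions of src into dst; return bytes copied.

    src, dst: bytearrays of equal length (block in first / second memory module)
    dirty:    one boolean per PORTION-sized slice of the block
    """
    copied = 0
    for i, is_dirty in enumerate(dirty):
        if is_dirty:
            start = i * PORTION
            dst[start:start + PORTION] = src[start:start + PORTION]
            copied += PORTION
    return copied
```

With one dirty portion out of three, only 4 of 12 bytes move, which is the reduction in copied data the abstract describes.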
-
Publication No.: US10339068B2
Publication Date: 2019-07-02
Application No.: US15495707
Filing Date: 2017-04-24
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Wade K. Smith , Anthony Asaro
IPC: G06F12/1045 , G06F12/1027 , G06F12/1009
Abstract: Systems, apparatuses, and methods for implementing a virtualized translation lookaside buffer (TLB) are disclosed herein. In one embodiment, a system includes at least an execution unit and a first TLB. The system supports the execution of a plurality of virtual machines in a virtualization environment. The system detects a translation request generated by a first virtual machine with a first virtual memory identifier (VMID). The translation request is conveyed from the execution unit to the first TLB. The first TLB performs a lookup of its cache using at least a portion of a first virtual address and the first VMID. If the lookup misses in the cache, the first TLB allocates an entry which is addressable by the first virtual address and the first VMID, and the first TLB sends the translation request with the first VMID to a second TLB.
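The lookup-and-forward flow can be modeled as a two-level cache keyed by the pair (VMID, virtual page). This is a simplified sketch: the class name `Tlb`, the dictionary storage, and the step that fills the entry from the second TLB's response are assumptions, since the abstract ends at forwarding the request.

```python
class Tlb:
    """Hypothetical TLB model whose entries are tagged with a VMID."""

    def __init__(self, next_level=None, page_tables=None):
        self.entries = {}  # (vmid, virtual page) -> physical page, None if pending
        self.next_level = next_level
        self.page_tables = page_tables if page_tables is not None else {}
        self.misses = 0

    def translate(self, vmid, vpage):
        key = (vmid, vpage)
        ppage = self.entries.get(key)
        if ppage is not None:
            return ppage                  # hit on virtual address + VMID
        self.misses += 1
        self.entries[key] = None          # allocate an entry for this key
        if self.next_level is not None:
            # Forward the request, including the VMID, to the second TLB.
            ppage = self.next_level.translate(vmid, vpage)
        else:
            ppage = self.page_tables[key]  # last level: assumed backing store
        self.entries[key] = ppage          # assumed fill on response
        return ppage
```

Because the VMID is part of the key, two virtual machines can use the same virtual address without their translations colliding.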
-
Publication No.: US10223280B2
Publication Date: 2019-03-05
Application No.: US16025449
Filing Date: 2018-07-02
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Vydhyanathan Kalyanasundharam , Yaniv Adiri , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien
IPC: G06F3/14 , G06F13/38 , G06F12/1009 , G06F12/12 , G06F12/1045
Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
-
Publication No.: US10209991B2
Publication Date: 2019-02-19
Application No.: US15353161
Filing Date: 2016-11-16
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Meenakshi Sundaram Bhaskaran , Elliot H. Mednick , David A. Roberts , Anthony Asaro , Amin Farmahini-Farahani
IPC: G06F9/30 , G06F9/38 , G06F12/084 , G06F12/0862 , G06F12/0875 , G06F12/1027
Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.
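The control flow of a non-blocking load can be approximated in software as a deferred callback. The sketch below is only an analogy: `NonBlockingLoadUnit`, `nbld`, and `complete_all` are hypothetical names, and the address-translation check and cache coherence mentioned in the abstract are not modeled.

```python
class NonBlockingLoadUnit:
    """Hypothetical model: queue loads, run each subroutine when data returns."""

    def __init__(self, memory):
        self.memory = memory  # address -> value, standing in for lower-level memory
        self.pending = []     # (address, subroutine) pairs still in flight

    def nbld(self, addr, subroutine):
        # Issue the load and return immediately; the caller does not stall.
        self.pending.append((addr, subroutine))

    def complete_all(self):
        # Simulate data returning: jump to each subroutine with its data.
        pending, self.pending = self.pending, []
        for addr, subroutine in pending:
            subroutine(self.memory[addr])
```

The instructions that depend on the loaded value live in the subroutine, so the main instruction stream keeps going between `nbld` and the data's return.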
-
Publication No.: US10152434B2
Publication Date: 2018-12-11
Application No.: US15385566
Filing Date: 2016-12-20
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Rostyslav Kyrychynskyi , Anthony Asaro , Kostantinos Danny Christidis , Mark Fowler , Michael J. Mantor , Robert Scott Hartog
Abstract: A system and method for efficient arbitration of memory access requests are described. One or more functional units generate memory access requests for a partitioned memory. An arbitration unit stores the generated requests and selects a given one of the stored requests. The arbitration unit identifies a given partition of the memory which stores a memory location targeted by the selected request. The arbitration unit determines whether one or more other stored requests access memory locations in the given partition. The arbitration unit sends each of the selected memory access request and the identified one or more other memory access requests to the memory to be serviced out of order.
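The arbitration step can be sketched as grouping stored requests by the partition they target. The partition size, the partition count, the oldest-first selection policy, and the names `partition_of` and `arbitrate` are all assumptions; the abstract specifies none of them.

```python
PARTITION_SIZE = 0x1000  # assumed bytes per partition stripe
NUM_PARTITIONS = 4       # assumed number of memory partitions


def partition_of(addr):
    # Assumed address-to-partition mapping (simple interleaving).
    return (addr // PARTITION_SIZE) % NUM_PARTITIONS


def arbitrate(pending):
    """Return (batch, remaining) for a list of pending request addresses."""
    if not pending:
        return [], []
    selected = pending[0]  # assumed policy: pick the oldest stored request
    target = partition_of(selected)
    # Gather every other stored request that hits the same partition.
    batch = [r for r in pending if partition_of(r) == target]
    remaining = [r for r in pending if partition_of(r) != target]
    return batch, remaining
```

Requests in `batch` are sent together even though younger requests overtake older ones left in `remaining`, which is the out-of-order servicing the abstract mentions.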
-
Publication No.: US20180307619A1
Publication Date: 2018-10-25
Application No.: US16025449
Filing Date: 2018-07-02
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Vydhyanathan Kalyanasundharam , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien , Yaniv Adiri
IPC: G06F12/1009 , G06F12/1045 , G06F12/12
CPC classification number: G06F12/1009 , G06F12/1045 , G06F12/12 , G06F2212/684
Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
-
Publication No.: US20180300253A1
Publication Date: 2018-10-18
Application No.: US15486745
Filing Date: 2017-04-13
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Wade K. Smith , Anthony Asaro , Dhirendra Partap Singh Rana
IPC: G06F12/1009 , G06F12/1027
Abstract: Systems, apparatuses, and methods for implementing a translate further mechanism are disclosed herein. In one embodiment, a processor detects a hit to a first entry of a page table structure during a first lookup to the page table structure. The processor retrieves a page table entry address from the first entry and uses this address to perform a second lookup to the page table structure responsive to detecting a first indication in the first entry. The processor retrieves a physical address from the first entry and uses the physical address to access the memory subsystem responsive to not detecting the first indication in the first entry. In one embodiment, the first indication is a translate further bit being set. In another embodiment, the first indication is a page directory entry as page table entry field not being activated.
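The translate-further decision can be sketched as one conditional second lookup. This is an illustrative model only: the `Entry` dataclass, the dictionary standing in for the page table structure, and the function name `resolve` are assumptions, and only the "translate further bit" embodiment is shown.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    address: int                     # physical address, or address of another entry
    translate_further: bool = False  # the "first indication" from the abstract


def resolve(table, entry_addr):
    """Return the physical address reached from the entry at entry_addr.

    table: entry address -> Entry (a stand-in for the page table structure)
    """
    entry = table[entry_addr]          # first lookup hits this entry
    if entry.translate_further:
        # Bit set: the stored address names another entry; look it up.
        entry = table[entry.address]
    # Bit clear (or after the second lookup): the address is physical.
    return entry.address
```

An entry with the bit clear short-circuits the walk, so the same structure can hold both final translations and pointers to deeper entries.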
-