-
Publication No.: US10509666B2
Publication date: 2019-12-17
Application No.: US15637810
Filing date: 2017-06-29
Applicant: ATI Technologies ULC
Inventor: Anthony Asaro , Yinan Jiang , Kelly Donald Clark Zytaruk
Abstract: A register protection mechanism for a virtualized accelerated processing device (“APD”) is disclosed. The mechanism protects registers of the accelerated processing device designated as physical-function-or-virtual-function registers (“PF-or-VF* registers”), which are single architectural instance registers that are shared among different functions that share the APD in a virtualization scheme whereby each function can maintain a different value in these registers. The protection mechanism for these registers comprises comparing the function associated with the memory address specified by a particular register access request to the “currently active” function for the APD and disallowing the register access request if a match does not occur.
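The access check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `PfOrVfRegisterGuard`, the address-range layout, and the function IDs are all hypothetical, chosen only to show the comparison between the function that owns the requested address and the currently active function.

```python
from dataclasses import dataclass


@dataclass
class PfOrVfRegisterGuard:
    """Hypothetical model of a single shared register guarded per function."""

    # Assumed layout: function ID -> (start, end) address range, end exclusive.
    ranges: dict
    # Function that currently owns the device (the "currently active" function).
    active_function: int
    # One architectural register instance shared by all functions.
    value: int = 0

    def function_for_address(self, address):
        """Return the function whose range contains the address, or None."""
        for function_id, (start, end) in self.ranges.items():
            if start <= address < end:
                return function_id
        return None

    def write(self, address, value):
        """Apply the write only if the address belongs to the active function."""
        if self.function_for_address(address) != self.active_function:
            return False  # mismatch: the access is disallowed
        self.value = value
        return True


guard = PfOrVfRegisterGuard(
    ranges={0: (0x0000, 0x1000), 1: (0x1000, 0x2000)},
    active_function=1,
)
allowed = guard.write(0x1004, 7)   # address owned by the active function
blocked = guard.write(0x0004, 9)   # address owned by an inactive function
```

In this sketch a rejected write simply returns `False`; how real hardware reports a disallowed access is not stated in the abstract.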
-
Publication No.: US10474490B2
Publication date: 2019-11-12
Application No.: US15637800
Filing date: 2017-06-29
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Gongxian Jeffrey Cheng , Louis Regniere , Anthony Asaro
Abstract: A technique for efficient time-division of resources in a virtualized accelerated processing device (“APD”) is provided. In a virtualization scheme implemented on the APD, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD performs a virtualization context switch by stopping operations for a current virtual machine (“VM”) and starting operations for another VM. Typically, each VM is assigned a fixed length of time, after which a virtualization context switch is performed. This fixed length of time can lead to inefficiencies. Therefore, in some situations, in response to a VM having no more work to perform on the APD and the APD being idle, a virtualization context switch is performed “early.” This virtualization context switch is “early” in the sense that the virtualization context switch is performed before the fixed length of time for the time-slice expires.
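The "early" switch described in this abstract can be sketched as a small simulation. The function name `run_time_slices`, the tick units, and the idea of expressing pending work as a tick count are assumptions made for illustration; the abstract does not specify how idleness is detected.

```python
def run_time_slices(work_per_vm, slice_length):
    """Simulate one scheduling round and return (vm, ticks_used) per VM.

    work_per_vm: dict mapping a VM name to its pending work in ticks
                 (hypothetical units).
    slice_length: fixed time-slice length in ticks.
    """
    schedule = []
    for vm, pending in work_per_vm.items():
        # A VM with no more work leaves the device idle, so the switch
        # happens early instead of waiting for the fixed slice to expire.
        ticks_used = min(pending, slice_length)
        schedule.append((vm, ticks_used))
    return schedule


round_result = run_time_slices({"vm0": 3, "vm1": 10}, slice_length=5)
ticks_saved = sum(5 - used for _, used in round_result)
```

Here `vm0` gives up its slice after 3 ticks, while `vm1` still has work and uses the full 5 ticks.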
-
Publication No.: US10324860B2
Publication date: 2019-06-18
Application No.: US15695683
Filing date: 2017-09-05
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Anthony Asaro , Kevin Normoyle , Mark Hummel
IPC: G06F12/10 , G06F12/1036 , G06F12/08 , G06F12/06 , G06F12/02 , G06F12/109
Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
-
Publication No.: US20190018699A1
Publication date: 2019-01-17
Application No.: US15663499
Filing date: 2017-07-28
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Anthony Asaro , Yinan Jiang , Andy Sung , Ahmed M. Abdelkhalek , Xiaowei Wang , Sidney D. Fortes
Abstract: A technique for recovering from a hang in a virtualized accelerated processing device (“APD”) is provided. In the virtualization scheme, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD stops operations for a current VM and starts operations for another VM. To stop operations on the APD, a virtualization scheduler sends a request to idle the APD. The APD responds by completing work and idling. If one or more portions of the APD do not complete this idling process before a timeout expires, then a hang occurs. In response to the hang, the virtualization scheduler informs the hypervisor that a hang has occurred. The hypervisor performs a function level reset on the APD and informs the VM that the hang has occurred. The VM responds by stopping command issue to the APD and re-initializing the APD for the function.
-
Publication No.: US20170212760A1
Publication date: 2017-07-27
Application No.: US15353161
Filing date: 2016-11-16
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Meenakshi Sundaram Bhaskaran , Elliot H. Mednick , David A. Roberts , Anthony Asaro , Amin Farmahini-Farahani
IPC: G06F9/30 , G06F12/0817 , G06F12/0875 , G06F9/38
CPC classification number: G06F9/30043 , G06F9/3005 , G06F9/3016 , G06F9/3802 , G06F12/084 , G06F12/0862 , G06F12/0875 , G06F12/1027 , G06F2212/1024 , G06F2212/452 , G06F2212/6028
Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.
-
Publication No.: US20170083455A1
Publication date: 2017-03-23
Application No.: US14861055
Filing date: 2015-09-22
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Philip J. Rogers , Benjamin T. Sander , Anthony Asaro
CPC classification number: G06F12/121 , G06F12/0891 , G06F12/0895 , G06F12/1081 , G06F12/12 , G06F12/127 , G06F13/28 , G06F2212/656
Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.
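A rough software model of the per-quantum counting described in this abstract might look as follows. The class `CacheAccessMonitor`, the list standing in for system memory, and the reset of counters after each transfer are assumptions; the abstract only states that counter values are transferred after each time quantum.

```python
class CacheAccessMonitor:
    """Hypothetical model: one access counter per cache block."""

    def __init__(self, num_blocks):
        self.counters = [0] * num_blocks
        # Stand-in for system memory: one snapshot per completed quantum.
        self.system_memory = []

    def record_access(self, block):
        """Increment the counter for the block that was accessed."""
        self.counters[block] += 1

    def end_quantum(self):
        """Transfer the counter values for this quantum, then start fresh."""
        self.system_memory.append(list(self.counters))
        # Assumption: counters restart at zero for the next quantum.
        self.counters = [0] * len(self.counters)


monitor = CacheAccessMonitor(num_blocks=4)
for block in (0, 0, 2):
    monitor.record_access(block)
monitor.end_quantum()
monitor.record_access(3)
monitor.end_quantum()
```

Each entry in `monitor.system_memory` then holds the access counts for one quantum, in quantum order.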
-
Publication No.: US20160378674A1
Publication date: 2016-12-29
Application No.: US14747944
Filing date: 2015-06-23
Applicant: ATI Technologies ULC , Advanced Micro Devices, Inc.
Inventor: Gongxian Jeffrey Cheng , Mark Fowler , Philip J. Rogers , Benjamin T. Sander , Anthony Asaro , Mike Mantor , Raja Koduri
IPC: G06F12/10
CPC classification number: G06F12/1009 , G06F12/1072 , G06F12/1081 , G06F15/163 , G06F2212/1016 , G06F2212/151 , G06F2212/152 , G06F2212/251
Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
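The scheme in this abstract, one virtual address space with separate page tables per processing-unit type, can be sketched as below. The class `SharedVirtualAddressSpace`, the unit names, the module names, and the page/frame numbers are hypothetical; the sketch only shows that a migration updates the physical location recorded in each unit's table.

```python
class SharedVirtualAddressSpace:
    """Hypothetical model: same virtual pages, one page table per unit type."""

    def __init__(self, units=("cpu", "gpu")):
        # Each unit has its own table: virtual page -> (memory module, frame).
        self.page_tables = {unit: {} for unit in units}

    def map_page(self, unit, virtual_page, module, frame):
        self.page_tables[unit][virtual_page] = (module, frame)

    def translate(self, unit, virtual_page):
        """Return (module, frame) for the unit, or None if unmapped."""
        return self.page_tables[unit].get(virtual_page)

    def migrate(self, virtual_page, module, frame):
        """After data moves, update every table that maps the page."""
        for table in self.page_tables.values():
            if virtual_page in table:
                table[virtual_page] = (module, frame)


space = SharedVirtualAddressSpace()
space.map_page("cpu", 0x10, "system_dram", 5)
space.map_page("gpu", 0x10, "system_dram", 5)
space.migrate(0x10, "gpu_local", 42)
```

After the migration, both the CPU and GPU tables translate virtual page `0x10` to the new location.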
-
Publication No.: US20140380028A1
Publication date: 2014-12-25
Application No.: US13923513
Filing date: 2013-06-21
Applicant: ATI Technologies ULC
Inventor: Gongxian Jeffrey Cheng , Anthony Asaro , Yinan Jiang
CPC classification number: G06F9/45558 , G06F1/24 , G06F9/45533 , G06F11/1441 , G06F2009/4557 , G06F2009/45575 , G06F2009/45591
Abstract: In a hardware-based virtualization system, a hypervisor switches out of a first function into a second function. The first function is one of a physical function and a virtual function and the second function is one of a physical function and a virtual function. During the switching a malfunction of the first function is detected. The first function is reset without resetting the second function. The switching, detecting, and resetting operations are performed by a hypervisor of the hardware-based virtualization system. Embodiments further include a communication mechanism for the hypervisor to notify a driver of the function that was reset to enable the driver to restore the function without delay.
-
Publication No.: US20250156336A1
Publication date: 2025-05-15
Application No.: US18389021
Filing date: 2023-11-13
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Christopher J. Brennan , Mark Fowler , Vydhyanathan Kalyanasundharam , Anthony Asaro
IPC: G06F12/1009
Abstract: Systems and techniques enable intermingled use of disparate addressing modes for memory access requests directed to system memory resources. Within a processing system, a memory access request indicating a multi-bit physical memory address is received. Based on a bit pattern indicated by a first subset of bits of the multi-bit physical memory address, an addressing mode to be used for fulfilling the memory access request is determined, such as by selecting an addressing mode table entry that is keyed to the bit pattern. The memory access request is fulfilled in accordance with the determined addressing mode.
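Selecting an addressing mode from a subset of physical address bits, as this abstract describes, can be illustrated with a short sketch. The address width, the choice of the top two bits as the selecting subset, and the mode names in `MODE_TABLE` are all assumptions made for the example.

```python
ADDRESS_BITS = 48  # assumed physical address width
MODE_BITS = 2      # assumed: the two most significant bits select the mode

# Hypothetical addressing mode table keyed by bit pattern.
MODE_TABLE = {
    0b00: "linear",
    0b01: "interleaved",
    0b10: "tiled",
}


def select_addressing_mode(physical_address):
    """Return the mode keyed to the address's top bits, or None if no entry."""
    pattern = (physical_address >> (ADDRESS_BITS - MODE_BITS)) & ((1 << MODE_BITS) - 1)
    return MODE_TABLE.get(pattern)


mode_a = select_addressing_mode(0x0000_0000_1234)
mode_b = select_addressing_mode((0b01 << 46) | 0x1234)
```

Because the mode is carried in the address itself, requests using different modes can be intermingled without any separate per-request mode field.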
-
Publication No.: US20250110893A1
Publication date: 2025-04-03
Application No.: US18478757
Filing date: 2023-09-29
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: John Szeto , Anthony Asaro , Kostantinos Danny Christidis , Wade K. Smith
IPC: G06F12/1027 , G06F12/0873
Abstract: An apparatus and method for efficiently performing address translation requests. An integrated circuit includes a system memory that stores address mappings, and the circuitry of one or more clients processes one or more applications and generates address translation requests. A translation lookaside buffer (TLB) stores, in multiple entries, address mappings retrieved from the system memory. The entries of the TLB store address mappings corresponding to different address mapping types and different virtual functions to avoid searches of multiple other lower-level TLBs that are significantly larger and have larger access latencies. In addition, the TLB is implemented with a relatively small number of entries and uses a fully associative data storage arrangement to further reduce access latencies.
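A small fully associative TLB whose entries cover several mapping types and virtual functions can be modelled roughly as follows. The class `SmallTlb`, the key layout `(virtual function, mapping type, virtual page)`, the mapping-type labels, and the least-recently-used replacement policy are assumptions; the abstract does not name a replacement policy.

```python
from collections import OrderedDict


class SmallTlb:
    """Hypothetical model of a small, fully associative TLB."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        # Any entry can hold any mapping, so one ordered map models the array.
        self.entries = OrderedDict()

    def lookup(self, virtual_function, mapping_type, virtual_page):
        """Return the cached physical page, or None on a miss."""
        key = (virtual_function, mapping_type, virtual_page)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        return None

    def fill(self, virtual_function, mapping_type, virtual_page, physical_page):
        """Insert a mapping, evicting the least recently used entry if full."""
        key = (virtual_function, mapping_type, virtual_page)
        self.entries[key] = physical_page
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # assumed LRU replacement


tlb = SmallTlb(capacity=2)
tlb.fill(0, "guest", 1, 100)
tlb.fill(1, "guest", 1, 200)
hit_vf0 = tlb.lookup(0, "guest", 1)
hit_vf1 = tlb.lookup(1, "guest", 1)
miss = tlb.lookup(0, "system", 1)
```

Keying on the virtual function and mapping type lets the same virtual page number resolve differently per function, which is what allows one small structure to stand in for several larger lower-level TLBs.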