-
Publication No.: US11481250B2
Publication Date: 2022-10-25
Application No.: US16024244
Application Date: 2018-06-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu , Matthew David Sinclair , Bradford Beckmann , David A. Wood
Abstract: A first workgroup is preempted in response to threads in the first workgroup executing a first wait instruction including a first value of a signal and a first hint indicating a type of modification for the signal. The first workgroup is scheduled for execution on a processor core based on a first context after preemption in response to the signal having the first value. A second workgroup is scheduled for execution on the processor core based on a second context in response to preempting the first workgroup and in response to the signal having a second value. A third context is prefetched into registers of the processor core based on the first hint and the second value. The first context is stored in a first portion of the registers and the second context is prefetched into a second portion of the registers prior to preempting the first workgroup.
-
Publication No.: US20170371784A1
Publication Date: 2017-12-28
Application No.: US15192542
Application Date: 2016-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan R. Alsop , Bradford Beckmann
IPC: G06F12/0804 , G06F12/0811 , G06F12/084 , G06F12/0808 , G06F12/0842 , G06F12/0891 , G06F12/0897
CPC classification number: G06F12/0897 , G06F12/0808 , G06F12/0811 , G06F12/0842 , G06F12/0891 , G06F2212/6042
Abstract: A processing system includes one or more first caches and one or more first lock tables associated with the one or more first caches. The processing system also includes one or more processing units that each include a plurality of compute units for concurrently executing work-groups of work items, a plurality of second caches associated with the plurality of compute units and configured in a hierarchy with the one or more first caches, and a plurality of second lock tables associated with the plurality of second caches. The first and second lock tables indicate locking states of addresses of cache lines in the corresponding first and second caches on a per-line basis.
-
Publication No.: US20240095180A1
Publication Date: 2024-03-21
Application No.: US18088170
Application Date: 2022-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh , Michael Estlick , Jay Fleischman , Michael J. Schulte , Bradford Beckmann , Yasuko Eckert
IPC: G06F12/1009
CPC classification number: G06F12/1009 , G06F2212/1008
Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.
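A minimal sketch of the lookup-and-interpolate idea in the abstract above. The abstract does not say how the table is encoded or which interpolation is used, so this assumes evenly spaced samples and linear interpolation; the Python list stands in for the set of registers, and `interpolate_lut` and `SQUARES` are hypothetical names.

```python
def interpolate_lut(table, x_min, x_max, x):
    """Linearly interpolate a value from evenly spaced samples in `table`.

    `table` stands in for a lookup table held in registers (assumption).
    """
    if len(table) < 2 or x_max <= x_min:
        raise ValueError("need at least two samples and x_max > x_min")
    # Clamp the query to the sampled range.
    x = min(max(x, x_min), x_max)
    step = (x_max - x_min) / (len(table) - 1)
    pos = (x - x_min) / step
    # Index of the sample at or below x, capped so i + 1 stays in range.
    i = min(int(pos), len(table) - 2)
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


# Hypothetical 5-entry table sampling f(x) = x * x on [0, 4].
SQUARES = [0.0, 1.0, 4.0, 9.0, 16.0]
```

A query between two samples returns a blend of its neighbours, so the table can stay small at the cost of approximation error between sample points.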
-
Publication No.: US11868809B2
Publication Date: 2024-01-09
Application No.: US18095704
Application Date: 2023-01-11
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Muhammad Amber Hassaan , Anirudh Mohan Kaushik , Sooraj Puthoor , Gokul Subramanian Ravi , Bradford Beckmann , Ashwin Aji
IPC: G06F9/46 , G06F9/48 , G06F9/52 , G06F16/901
CPC classification number: G06F9/4881 , G06F9/52 , G06F16/9024 , G06F2209/486
Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.
-
Publication No.: US11526449B2
Publication Date: 2022-12-13
Application No.: US17007133
Application Date: 2020-08-31
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Johnathan Alsop , Pouya Fotouhi , Bradford Beckmann , Sergey Blagodurov
IPC: G06F12/08 , G06F12/0891 , G06F9/30 , G06F12/0882 , G06F12/0811
Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
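The behaviour described in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the `Cache` class, its `mark_dead` method, and the dictionary used as the next memory level are all hypothetical and do not come from the patent.

```python
class Cache:
    """Toy write-back cache that skips write-back of lines marked dead."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the next memory level
        self.lines = {}         # addr -> [value, dirty, dead]

    def write(self, addr, value):
        # A write makes the line dirty and live again.
        self.lines[addr] = [value, True, False]

    def mark_dead(self, addr):
        # Software hint: this value will not be read before it is overwritten.
        if addr in self.lines:
            self.lines[addr][2] = True

    def evict(self, addr):
        value, dirty, dead = self.lines.pop(addr)
        # Only dirty lines that are still live are propagated.
        if dirty and not dead:
            self.backing[addr] = value

    def flush(self):
        for addr in list(self.lines):
            self.evict(addr)
```

After a flush, the backing store holds only the lines that were never marked dead, which is the traffic saving the abstract refers to.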
-
Publication No.: US12131199B2
Publication Date: 2024-10-29
Application No.: US17029935
Application Date: 2020-09-23
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu , Matthew David Sinclair , Bradford Beckmann , David A. Wood
CPC classification number: G06F9/522 , G06F9/3005 , G06F9/461 , G06F11/3024 , G06F11/3476 , G06F11/3495 , G06N20/00
Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
-
Publication No.: US20230393855A1
Publication Date: 2023-12-07
Application No.: US17833504
Application Date: 2022-06-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh , Yasuko Eckert , Bradford Beckmann , Michael Estlick , Jay Fleischman
CPC classification number: G06F9/3887 , G06F9/3877 , G06F9/30098 , G06F9/3555
Abstract: An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.
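As a rough software analogy for the abstract above, a truth table can be packed into one integer (standing in for a function register) and applied to every lane of a source vector. The function name `simd_lut`, the one-bit outputs, and the bit layout are assumptions made for illustration, not the instruction encoding from the patent.

```python
def simd_lut(function_register, lanes, index_bits):
    """Look up each lane's value in a truth table packed into an integer.

    Bit k of `function_register` is the table output for input k (assumed layout).
    """
    mask = (1 << index_bits) - 1
    # Each lane is processed independently, as a SIMD unit would do in parallel.
    return [(function_register >> (lane & mask)) & 1 for lane in lanes]


# Two-input XOR: inputs 0b00, 0b01, 0b10, 0b11 map to 0, 1, 1, 0.
XOR_TABLE = 0b0110
```

Changing only the table value changes the function applied to all lanes, which is what makes a register-held table flexible.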
-
Publication No.: US11740791B2
Publication Date: 2023-08-29
Application No.: US17497286
Application Date: 2021-10-08
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Seyed Mohammad Seyedzadehdelcheh , Xianwei Zhang , Bradford Beckmann , Shomit N. Das
IPC: G06F3/06 , G06F12/0875 , G06T1/20
CPC classification number: G06F3/0608 , G06F3/064 , G06F3/0659 , G06F3/0673 , G06F12/0875 , G06F2212/1044 , G06T1/20
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
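The base-plus-delta compression named in the abstract above can be sketched in a few lines. The threshold policy, the byte-width selection, and the names `compress_block` and `decompress_block` are assumptions for illustration; the abstract does not specify them.

```python
def compress_block(words, base, threshold_bytes):
    """Encode `words` as signed deltas against `base`.

    Returns (deltas, width_bytes), or None when the compressed size would
    exceed `threshold_bytes` (assumed policy).
    """
    deltas = [w - base for w in words]
    # Find the smallest signed width, in bytes, that holds every delta.
    width = 1
    while any(not -(1 << (8 * width - 1)) <= d < (1 << (8 * width - 1))
              for d in deltas):
        width += 1
    size = width * len(deltas)
    return (deltas, width) if size <= threshold_bytes else None


def decompress_block(deltas, base):
    """Rebuild the original words from the base value and the deltas."""
    return [base + d for d in deltas]
```

When values in a block cluster around the base, only the small deltas need to travel to memory, while the base can stay in a separate structure such as the base value cache the abstract mentions.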
-
Publication No.: US11734059B2
Publication Date: 2023-08-22
Application No.: US16824601
Application Date: 2020-03-19
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Muhammad Amber Hassaan , Anirudh Mohan Kaushik , Sooraj Puthoor , Gokul Subramanian Ravi , Bradford Beckmann , Ashwin Aji
IPC: G06F9/46 , G06F9/48 , G06F9/52 , G06F16/901
CPC classification number: G06F9/4881 , G06F9/52 , G06F16/9024 , G06F2209/486
Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.
-
Publication No.: US20250103395A1
Publication Date: 2025-03-27
Application No.: US18476071
Application Date: 2023-09-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Bradford Beckmann , Matthew David Sinclair , Vinay Bharadwaj Ramakrishnaiah , William Peter Ehrett
IPC: G06F9/50
Abstract: A computer-implemented method for dynamic resource management can include evaluating, by at least one processor, whether a priority of one or more processes associated with a request for one or more shared resources meets a threshold condition. The method can additionally include determining, by the at least one processor and in response to an evaluation that the priority meets the threshold condition, whether the one or more shared resources is available to meet the request. The method can further include completing, by the at least one processor and in response to a determination that the one or more shared resources is available, execution of the one or more processes. Various other methods, systems, and computer-readable media are also disclosed.
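The three steps in the abstract above (priority check, availability check, execution) can be sketched as a single decision function. The abstract does not say what happens when a check fails, so the "deferred" and "waiting" outcomes, the reading of "meets the threshold" as greater than or equal, and the name `handle_request` are all assumptions.

```python
def handle_request(priority, requested, available, threshold):
    """Decide what to do with a request for shared resources."""
    # Step 1: does the requesting process's priority meet the threshold?
    if priority < threshold:
        return "deferred"   # assumed outcome, not stated in the abstract
    # Step 2: are enough shared resources available for the request?
    if requested > available:
        return "waiting"    # assumed outcome, not stated in the abstract
    # Step 3: both checks passed, so the process can run to completion.
    return "run"
```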