-
Publication number: US11854139B2
Publication date: 2023-12-26
Application number: US17564160
Application date: 2021-12-28
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Konstantin Igorevich Shkurko , Michael Mantor
CPC classification number: G06T15/06 , G06F9/4881 , G06T1/20 , G06T15/005 , G06T15/08 , G06T17/10
Abstract: A processing unit employs a hardware traversal engine to traverse an acceleration structure such as a ray tracing structure. The hardware traversal engine includes one or more memory modules to store state information and other data used for the structure traversal, and control logic to execute a traversal process based on the stored data and based on received information indicating a source node of the acceleration structure to be used for the traversal process. By employing a hardware traversal engine, the processing unit is able to execute the traversal process more quickly and efficiently, conserving processing resources and improving overall processing efficiency.
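As a rough software illustration of the traversal process described in this abstract (the patent concerns dedicated hardware, which this does not model), the sketch below walks an acceleration structure starting from a caller-supplied source node and keeps its traversal state on an explicit stack. The node layout, the `traverse` function, and the `hits_box` callback are hypothetical names chosen for the example, not taken from the patent.

```python
def traverse(nodes, source, hits_box):
    """Return the leaf node ids reached from `source` whose boxes pass `hits_box`.

    nodes:    dict mapping node id -> {"box": ..., "children": [ids]} (leaves omit "children")
    source:   id of the node where traversal starts (assumed layout, for illustration)
    hits_box: callable deciding whether the query intersects a node's bounding box
    """
    stack = [source]          # traversal state, kept explicitly rather than via recursion
    leaves = []
    while stack:
        node_id = stack.pop()
        node = nodes[node_id]
        if not hits_box(node["box"]):
            continue          # prune the whole subtree
        children = node.get("children")
        if children:
            # Push in reverse so the first child is visited first.
            stack.extend(reversed(children))
        else:
            leaves.append(node_id)
    return leaves


# Toy 1-D "scene": each box is an interval (lo, hi); the query is the point 6.
NODES = {
    0: {"box": (0, 10), "children": [1, 2]},
    1: {"box": (0, 4)},
    2: {"box": (5, 10), "children": [3, 4]},
    3: {"box": (5, 7)},
    4: {"box": (8, 10)},
}


def contains_6(box):
    return box[0] <= 6 <= box[1]
```

Calling `traverse(NODES, 0, contains_6)` starts at the root, while `traverse(NODES, 2, contains_6)` starts at an interior node, which mirrors the abstract's idea of a traversal driven by a received source node.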
-
Publication number: US11768664B2
Publication date: 2023-09-26
Application number: US16591031
Application date: 2019-10-02
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Bin He , Michael Mantor , Jiasheng Chen
CPC classification number: G06F7/57 , G06F7/483 , G06F7/5443 , G06F9/3818 , G06F2207/3824
Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designated op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
-
Publication number: US11762658B2
Publication date: 2023-09-19
Application number: US16581252
Application date: 2019-09-24
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Bin He , Michael Mantor , Jiasheng Chen , Jian Huang
CPC classification number: G06F9/30036 , G06F9/30101 , G06F9/3877 , G06F9/544 , G06F17/16
Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.
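The round/iteration scheme in this abstract can be sketched in plain software, with the caveat that this is only one reading of it and does not model VSPs or hardware registers. In the sketch, each round "fetches" one portion of each matrix into local lists standing in for registers, each iteration multiplies and accumulates one combination of subsets of those portions, and the accumulated block is written to the output buffer only after all iterations of the round finish. The function name `blocked_matmul` and the `portion` and `subset` parameters are assumptions made for the example.

```python
def blocked_matmul(a, b, portion=2, subset=1):
    """Multiply a (n x k) by b (k x m) using rounds of register-resident portions."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]             # output buffer
    for r0 in range(0, n, portion):
        for c0 in range(0, m, portion):
            # Round start: fetch one portion of each matrix into "registers".
            a_regs = a[r0:r0 + portion]                        # rows of a
            b_regs = [row[c0:c0 + portion] for row in b]       # columns of b
            rows, cols = len(a_regs), len(b_regs[0])
            acc = [[0] * cols for _ in range(rows)]            # accumulators
            # Iterations: every combination of a row subset and a column subset.
            for rs in range(0, rows, subset):
                for cs in range(0, cols, subset):
                    for i in range(rs, min(rs + subset, rows)):
                        for j in range(cs, min(cs + subset, cols)):
                            for t in range(k):
                                acc[i][j] += a_regs[i][t] * b_regs[t][j]
            # Round end: write accumulated results before fetching the next portions.
            for i in range(rows):
                for j in range(cols):
                    out[r0 + i][c0 + j] = acc[i][j]
    return out
```

The point of the structure is data reuse: each fetched portion participates in several iterations before anything new is loaded, so the result does not depend on the chosen `portion` or `subset` sizes.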
-
Publication number: US20230289191A1
Publication date: 2023-09-14
Application number: US18128642
Application date: 2023-03-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sateesh LAGUDU , Allen H. Rush , Michael Mantor , Arun Vaidyanathan Ananthanarayan , Prasad Nagabhushanamgari , Maxim V. Kazakov
CPC classification number: G06F9/3887 , G06F13/28 , G06F13/4027
Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.
-
Publication number: US11635967B2
Publication date: 2023-04-25
Application number: US17032307
Application date: 2020-09-25
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sateesh Lagudu , Allen H. Rush , Michael Mantor , Arun Vaidyanathan Ananthanarayan , Prasad Nagabhushanamgari , Maxim V. Kazakov
Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.
-
Publication number: US11630667B2
Publication date: 2023-04-18
Application number: US16697660
Application date: 2019-11-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Jiasheng Chen , Bin He , Jian Huang , Michael Mantor
Abstract: A processor includes a plurality of vector sub-processors (VSPs) and a plurality of memory banks dedicated to respective VSPs. A first memory bank corresponding to a first VSP includes a first plurality of high vector general purpose register (VGPR) banks and a first plurality of low VGPR banks corresponding to the first plurality of high VGPR banks. The first memory bank further includes a plurality of operand gathering components that store operands from respective high VGPR banks and low VGPR banks. The operand gathering components are assigned to individual threads while the threads are executed by the first VSP.
-
Publication number: US11494192B2
Publication date: 2022-11-08
Application number: US16860842
Application date: 2020-04-28
Inventor: Jiasheng Chen , YunXiao Zou , Bin He , Angel E. Socarras , QingCheng Wang , Wei Yuan , Michael Mantor
Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
-
Publication number: US20220277508A1
Publication date: 2022-09-01
Application number: US17745410
Application date: 2022-05-16
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Michael Mantor , Laurent Lefebvre , Mark Fowler , Timothy Kelley , Mikko Alho , Mika Tuomi , Kiia Kallio , Patrick Klas Rudolf Buss , Jari Antero Komppa , Kaj Tuomi
IPC: G06T15/00
Abstract: A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.
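The three steps named in this abstract (form a batch, compute bin intercepts per primitive, then process bin by bin until none remain) can be sketched as follows. This is a simplified illustration that assumes each primitive is represented by an integer screen-space bounding box `(x0, y0, x1, y1)` and that bins are square tiles; the function names and the bounding-box approximation are assumptions of the example, not details from the patent.

```python
def bin_intercepts(bbox, bin_size):
    """Return the set of (bin_x, bin_y) tiles touched by a bounding box."""
    x0, y0, x1, y1 = bbox
    return {
        (bx, by)
        for bx in range(x0 // bin_size, x1 // bin_size + 1)
        for by in range(y0 // bin_size, y1 // bin_size + 1)
    }


def process_batch(primitives, bin_size):
    """Group a batch of primitives by bin, then visit the bins one at a time.

    Returns a list of (bin, [primitive indices]) in the order the bins are processed.
    """
    per_bin = {}
    for index, bbox in enumerate(primitives):
        for b in bin_intercepts(bbox, bin_size):
            per_bin.setdefault(b, []).append(index)
    # "Shading" is stood in for by recording which primitives each bin would shade.
    return [(b, per_bin[b]) for b in sorted(per_bin)]
```

For example, `process_batch([(0, 0, 3, 3), (2, 2, 9, 3)], 8)` visits bin `(0, 0)` with both primitives and then bin `(1, 0)` with only the second, so all work for one bin is done before moving to the next.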
-
Publication number: US11386518B2
Publication date: 2022-07-12
Application number: US16580654
Application date: 2019-09-24
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Michael Mantor , Alexander Fuad Ashkar , Randy Ramsey , Mangesh P. Nijasure , Brian Emberling
Abstract: The address of the draw or dispatch packet responsible for creating an exception is tied to a shader/wavefront so that the exception can be traced back to the draw command from which it originated. In various embodiments, a method of operating a graphics pipeline and exception handling includes receiving, at a command processor of a graphics processing unit (GPU), an exception signal indicating an occurrence of a pipeline exception at a shader stage of a graphics pipeline. The shader stage generates an exception signal in response to a pipeline exception and transmits the exception signal to the command processor. The command processor determines, based on the exception signal, an address of a command packet responsible for the occurrence of the pipeline exception.
-
Publication number: US11169811B2
Publication date: 2021-11-09
Application number: US16426613
Application date: 2019-05-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Rex Eldon McCrary , Yi Luo , Harry J. Wise , Alexander Fuad Ashkar , Michael Mantor
IPC: G06F9/38 , G06T1/60 , G06T1/20 , G06F16/245
Abstract: A method of context bouncing includes receiving, at a command processor of a graphics processing unit (GPU), a conditional execute packet providing a hash identifier corresponding to an encapsulated state. The encapsulated state includes one or more context state packets following the conditional execute packet. A command packet following the encapsulated state is executed based at least in part on determining whether the hash identifier of the encapsulated state matches one of a plurality of hash identifiers of active context states currently stored at the GPU.
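A minimal sketch of the hash-matching decision described in this abstract is given below. It assumes one plausible behaviour that the abstract does not spell out: when the hash identifier matches an active context state, the encapsulated state packets are skipped and the stored state is reused; otherwise they are loaded. The tuple-based packet format and the `run_stream` function are hypothetical and exist only for this illustration.

```python
def run_stream(stream, active):
    """Walk a command stream and report what a command processor might do.

    stream: list of packets, each one of
            ("cond_exec", hash_id, state_packet_count)
            ("state", key, value)
            ("draw", name)
    active: dict mapping hash_id -> state dict for contexts already stored
    """
    log = []
    current = None
    i = 0
    while i < len(stream):
        packet = stream[i]
        if packet[0] == "cond_exec":
            _, hash_id, count = packet
            state_packets = stream[i + 1:i + 1 + count]   # the encapsulated state
            if hash_id in active:
                log.append(("reuse", hash_id))            # match: skip the state packets
            else:
                active[hash_id] = {key: value for _, key, value in state_packets}
                log.append(("load", hash_id))             # no match: load the new state
            current = hash_id
            i += 1 + count
        else:
            # Command packet following the encapsulated state.
            log.append(("draw", packet[1], current))
            i += 1
    return log
```

In a stream that sets state `0xA`, draws, sets state `0xB`, draws, and then returns to `0xA`, the third conditional execute packet finds `0xA` already active and reuses it rather than reloading its state packets.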