-
Publication No.: US11880926B2
Publication date: 2024-01-23
Application No.: US17745410
Application date: 2022-05-16
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Michael Mantor , Laurent Lefebvre , Mark Fowler , Timothy Kelley , Mikko Alho , Mika Tuomi , Kiia Kallio , Patrick Klas Rudolf Buss , Jari Antero Komppa , Kaj Tuomi
IPC: G06T15/00
CPC classification number: G06T15/005
Abstract: A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.
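The three steps named in this abstract (form a batch, compute bin intercepts per primitive, process bin by bin) can be illustrated with a minimal sketch. It is not the patented implementation: primitives are reduced to inclusive pixel bounding boxes, and `BIN_SIZE`, `bin_intercepts`, `process_batch`, and the sorted walk order are all assumptions made for illustration.

```python
from collections import defaultdict

BIN_SIZE = 32  # hypothetical bin edge length in pixels


def bin_intercepts(bbox, bin_size=BIN_SIZE):
    """Return the (bx, by) bins overlapped by an inclusive pixel bounding box."""
    x0, y0, x1, y1 = bbox
    return [(bx, by)
            for by in range(y0 // bin_size, y1 // bin_size + 1)
            for bx in range(x0 // bin_size, x1 // bin_size + 1)]


def process_batch(batch, bin_size=BIN_SIZE):
    """Group primitive ids per bin, then visit every bin that has intercepts."""
    per_bin = defaultdict(list)
    for prim_id, bbox in batch:
        for b in bin_intercepts(bbox, bin_size):
            per_bin[b].append(prim_id)
    # Sorted order is an assumed walk order; recording the pair stands in for shading.
    return [(b, per_bin[b]) for b in sorted(per_bin)]


batch = [("tri0", (0, 0, 40, 10)), ("tri1", (33, 5, 50, 20))]
result = process_batch(batch)
```

Here `tri0` spans two bins and `tri1` only one, so the second bin is processed with both primitives while the first sees only `tri0`.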
-
Publication No.: US11726868B2
Publication date: 2023-08-15
Application No.: US17113815
Application date: 2020-12-07
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Michael Mantor , Sudhanva Gurumurthi
IPC: G06F11/10 , G06F11/16 , G06F12/0866 , G06F11/00 , H03M13/00
CPC classification number: G06F11/1064 , G06F11/1629 , G06F11/1641 , G06F11/1654 , G06F12/0866 , G06F2212/1032 , G06F2212/281 , G06F2212/403
Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying a memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests into the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
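The compare step in this abstract can be pictured as a redundant-request check. The sketch below is a software analogy only, assuming requests are simple `(op, address)` tuples; `check_requests` and its data layout are hypothetical and do not reflect the hardware described in the patent.

```python
from collections import deque


def check_requests(master_requests, slave_requests):
    """Return the indices where a master request differs from its slave copy."""
    fifo = deque()      # stands in for the GM/LM FIFO holding master requests
    mismatches = []
    for i, (master, slave) in enumerate(zip(master_requests, slave_requests)):
        fifo.append(master)   # only the master request would go to memory
        slave_reg = slave     # the slave copy is kept purely for checking
        if fifo[-1] != slave_reg:
            mismatches.append(i)
    return mismatches


master = [("load", 0x100), ("store", 0x104)]
slave = [("load", 0x100), ("store", 0x10C)]  # second address corrupted by a fault
faulty = check_requests(master, slave)
```

A non-empty result signals that the two redundant copies diverged, which is the condition an error handler would act on.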
-
Publication No.: US20230097279A1
Publication date: 2023-03-30
Application No.: US17489734
Application date: 2021-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Brian Emberling , Michael Mantor , Michael Y. Chow , Bin He
Abstract: Methods and systems are disclosed for executing operations on single-instruction-multiple-data (SIMD) units. Techniques disclosed perform a dot product operation on input data during one computer cycle, including convolving the input data, generating intermediate data, and applying one or more transitional operations to the intermediate data to generate output data. In some aspects, the input data is an input to a layer of a convolutional neural network and the generated output data is the output of the layer.
-
Publication No.: US20220269620A1
Publication date: 2022-08-25
Application No.: US17666974
Application date: 2022-02-08
Applicant: ADVANCED MICRO DEVICES, INC. , ATI Technologies ULC
Inventor: Benjamin T. SANDER , Mark Fowler , Anthony Asaro , Gongxian Jeffrey Cheng , Michael Mantor
IPC: G06F12/1027 , G06F12/0893
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
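How software might combine the two logs can be sketched as follows. This is an illustrative model only: the `MissLogs` class, the 4 KiB page size (`PAGE_SHIFT = 12`), and the dictionary-based translation log are assumptions, not details taken from the patent.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class MissLogs:
    """Toy model of an access log plus a physical-to-virtual translation log."""

    def __init__(self):
        self.access_log = []       # physical addresses of recorded cache misses
        self.translation_log = {}  # physical page number -> virtual page number

    def record_miss(self, phys_addr):
        self.access_log.append(phys_addr)

    def record_translation(self, virt_addr, phys_addr):
        # Stored in the reverse direction of the page walk: physical -> virtual.
        self.translation_log[phys_addr >> PAGE_SHIFT] = virt_addr >> PAGE_SHIFT

    def virtual_miss_stream(self):
        """Map each logged miss back to a virtual address, or None if unknown."""
        out = []
        for pa in self.access_log:
            vpage = self.translation_log.get(pa >> PAGE_SHIFT)
            out.append(None if vpage is None
                       else (vpage << PAGE_SHIFT) | (pa & PAGE_MASK))
        return out


logs = MissLogs()
logs.record_translation(virt_addr=0x70001000, phys_addr=0x2000)
logs.record_miss(0x2010)
logs.record_miss(0x9000)  # no translation recorded for this page
stream = logs.virtual_miss_stream()
```

Misses without a logged translation come back as `None`, so a memory manager using this sketch would have to tolerate gaps in the virtual miss stream.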
-
Publication No.: US11409536B2
Publication date: 2022-08-09
Application No.: US15342809
Application date: 2016-11-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Bin He , YunXiao Zou , Jiasheng Chen , Michael Mantor
Abstract: A method and apparatus for performing a multi-precision computation in a plurality of arithmetic logic units (ALUs) includes pairing a first Single Instruction/Multiple Data (SIMD) block channel device with a second SIMD block channel device to create a first block pair having one-level staggering between the first and second channel devices. A third SIMD block channel device is paired with a fourth SIMD block channel device to create a second block pair having one-level staggering between the third and fourth channel devices. A plurality of source inputs are received at the first block pair and the second block pair. The first block pair computes a first result, and the second block pair computes a second result.
-
Publication No.: US20220197655A1
Publication date: 2022-06-23
Application No.: US17548105
Application date: 2021-12-10
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sateesh LAGUDU , Arun Vaidyanathan ANANTHANARAYAN , Michael Mantor , Allen H. Rush
Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
-
Publication No.: US11335052B2
Publication date: 2022-05-17
Application No.: US16179376
Application date: 2018-11-02
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Michael Mantor , Laurent Lefebvre , Mark Fowler , Timothy Kelley , Mikko Alho , Mika Tuomi , Kiia Kallio , Patrick Klas Rudolf Buss , Jari Antero Komppa , Kaj Tuomi
IPC: G06T15/00
Abstract: A system, method and a non-transitory computer readable storage medium are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from one or more primitives. A bin is identified for processing the primitive batch. At least a portion of each primitive intersecting the identified bin is processed and a next bin for processing the primitive batch is identified based on an intercept walk order. The processing is iteratively repeated for the one or more primitives in the primitive batch for successive bins until all primitives of the primitive batch are completely processed. Then, the one or more primitives in the primitive batch are further processed.
-
Publication No.: US10922868B2
Publication date: 2021-02-16
Application No.: US16452831
Application date: 2019-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Mangesh P. Nijasure , Todd Martin , Michael Mantor
Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.
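The final distribution step, sending each triangle to the screen-space pipelines that own the surface portions it overlaps, could look roughly like the sketch below. The tile size, the checkerboard-style ownership rule `(tx + ty) % num_pipelines`, and the bounding-box overlap test are all assumptions chosen for brevity; the bounding box is conservative and may name a pipeline whose tile the triangle does not actually touch.

```python
TILE = 64  # hypothetical tile edge length in pixels


def pipelines_for_triangle(vertices, num_pipelines, tile=TILE):
    """Return the sorted pipeline ids whose tiles the triangle's bounding box overlaps."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    owners = set()
    for ty in range(min(ys) // tile, max(ys) // tile + 1):
        for tx in range(min(xs) // tile, max(xs) // tile + 1):
            owners.add((tx + ty) % num_pipelines)  # assumed ownership pattern
    return sorted(owners)


small = pipelines_for_triangle([(10, 10), (20, 50), (50, 20)], num_pipelines=2)
wide = pipelines_for_triangle([(10, 10), (100, 10), (10, 20)], num_pipelines=2)
```

The small triangle stays inside one tile and goes to a single pipeline, while the wide one crosses a tile boundary and is sent to both.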
-
Publication No.: US10585801B2
Publication date: 2020-03-10
Application No.: US13685133
Application date: 2012-11-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Nuwan S. Jayasena , James Michael O'Connor , Michael Mantor
IPC: G06F12/08 , G06F12/0862 , G06F9/52 , G06F8/41
Abstract: Embodiments include methods, systems and computer readable media configured to execute a first kernel (e.g. compute or graphics kernel) with reduced intermediate state storage resource requirements. These include executing a first and second (e.g. prefetch) kernel on a data-parallel processor, such that the second kernel begins executing before the first kernel. The second kernel performs memory operations that are based upon at least a subset of memory operations in the first kernel.
-
Publication No.: US10372522B2
Publication date: 2019-08-06
Application No.: US15582443
Application date: 2017-04-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Carlos Sampayo , Michael Mantor
Abstract: Techniques for handling memory errors are disclosed. Various memory units of an accelerated processing device (“APD”) include error units for detecting errors in data stored in the memory (e.g., using parity protection or error correcting code). Upon detecting an error considered to be an “initial uncorrectable error,” the error unit triggers transmission of an initial uncorrectable error interrupt (“IUE interrupt”) to a processor. This IUE interrupt includes information identifying the specific memory unit in which the error occurred (and possibly other information about the error). A halt interrupt is generated and transmitted to the processor in response to the data having the error being consumed (i.e., used by an operation such as an instruction or command), which causes the APD to halt operations. If the data having the error is not consumed, then the halt interrupt is never generated (that the error occurred may remain logged, however).
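The two-stage behavior described here (an IUE interrupt on detection, a halt interrupt only on consumption) can be modeled in a few lines. The `MemoryUnit` class, the tuple format of the interrupts, and the unit name `"L2"` are hypothetical and serve only to make the ordering of events concrete.

```python
class MemoryUnit:
    """Toy memory unit that reports errors through a caller-supplied callback."""

    def __init__(self, name, send_interrupt):
        self.name = name
        self.send_interrupt = send_interrupt
        self.errored = set()  # addresses holding data with a detected error

    def detect_error(self, addr):
        """Error unit found an uncorrectable error: log it and raise the IUE interrupt."""
        self.errored.add(addr)
        self.send_interrupt(("IUE", self.name, addr))

    def consume(self, addr):
        """An operation uses the data; only bad data triggers the halt interrupt."""
        if addr in self.errored:
            self.send_interrupt(("HALT", self.name, addr))
            return False
        return True


interrupts = []
unit = MemoryUnit("L2", interrupts.append)
unit.detect_error(0x40)       # IUE interrupt is sent immediately
ok_clean = unit.consume(0x80)  # clean data: no interrupt
ok_bad = unit.consume(0x40)    # erroneous data consumed: halt interrupt
```

If `consume(0x40)` were never called, the interrupt list would contain only the IUE entry, matching the case where the error stays logged but no halt occurs.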