Abstract:
A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.
Abstract:
A Streaming Wave Coalescer (SWC) circuit stores a first set of state values associated with a first subset of threads of a first wave in a bin based on each of the first subset of threads including a first set of instructions to be executed. A second set of state values associated with a second subset of threads of a second wave is stored in the bin based on each of the second subset of threads including the first set of instructions to be executed and based on the first wave and the second wave both being associated with a hard key. A third wave is formed from the threads of the first subset and the second subset and is emitted for execution. As a result of reorganizing the threads and reconstituting a different wave, thread divergence of waves sent for execution is reduced.
Abstract:
A technique for performing ray tracing operations is provided. The technique includes determining error bounds for a rotation operation for a ray; selecting a technique for determining whether the ray intersects a bounding box based on the error bounds; and determining whether the ray hits the bounding box based on the selected technique.
Abstract:
A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.
Abstract:
Apparatuses, computer readable mediums, and methods of building a k-dimensional tree (kd-tree) are disclosed. The method may include a first processor, for example a graphics processing unit (GPU), selecting a node to split in a depth first manner. The method may include the GPU splitting based on a split plane a node into a left node and a right node. The GPU may assign the left (right) node to the GPU when a number of polygons associated with the left (right) node is above a threshold and otherwise assign the left node to a second processor, for example a central processing unit (CPU). The CPU may build the kd-tree in a depth first manner. The GPU (CPU) may select a next node to split based on a last node assigned to the GPU (CPU) or by selecting a node that is currently in a local memory of the GPU (CPU).