Abstract:
Example techniques are described for generating graphics content. The techniques include obtaining texture operation instructions corresponding to a texture operation; in response to determining that insufficient general purpose register space, insufficient wave slots, or both are available for the texture operation, generating an indication that the texture operation corresponds to a deferred wave; executing the texture operation; sending, to a texture processor, initial texture sample instructions corresponding to the executed texture operation; and receiving texture mapped data corresponding to the initial texture sample instructions.
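The resource check that leads to a deferred-wave indication can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `WaveResources` and `classify_texture_op`, the resource counts, and the rule that one wave slot is needed per operation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WaveResources:
    """Hypothetical snapshot of free shader-processor resources."""
    free_gprs: int
    free_wave_slots: int


def classify_texture_op(res: WaveResources, gprs_needed: int) -> str:
    """Return 'deferred' when GPR space or wave slots are insufficient.

    Assumption: a texture operation needs `gprs_needed` registers and
    exactly one wave slot; real hardware rules may differ.
    """
    insufficient_gprs = res.free_gprs < gprs_needed
    insufficient_slots = res.free_wave_slots < 1
    if insufficient_gprs or insufficient_slots:
        return "deferred"
    return "normal"
```

In this sketch, either shortage alone is enough to mark the operation as belonging to a deferred wave, which matches the "at least one of" wording of the abstract.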
Abstract:
This disclosure describes examples of using two vertex shaders, each during a different graphics processing pass, in a binning architecture for graphics processing. A first vertex shader processes a subset of attributes of a vertex in a binning pass, where the subset includes attributes that contribute to visibility determination and attributes that may benefit from being processed with a vertex shader that provides functional flexibility. A second, different vertex shader processes another subset of attributes of the vertex in the rendering pass.
Abstract:
A method for processing data in a graphics processing unit (GPU) includes receiving an instance identifier for an instance and a shader program comprising a preamble code block and a main shader code block; assigning the instance identifier to a general purpose register at wave creation; allocating address space within the constant memory for instance uniforms; determining that the preamble code block has not been executed and that the wave is a first wave of the instance to be executed; based on that determination, executing the preamble code block to store the plurality of instance uniforms in the constant memory; and, based at least in part on executing the preamble code block, executing the wave of the plurality of waves using at least one of the plurality of instance constants stored in constant memory.
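The run-once behavior of the preamble can be sketched as follows. This is a simplified illustration, not the claimed method: `InstanceContext`, `execute_wave`, and the dictionary standing in for constant memory are hypothetical, and a single flag stands in for both the "preamble not yet executed" and "first wave of the instance" checks.

```python
from typing import Callable, Dict


class InstanceContext:
    """Hypothetical per-instance state shared by all waves of the instance."""

    def __init__(self) -> None:
        self.preamble_executed = False
        # Stand-in for the constant-memory range allocated to this instance.
        self.constant_memory: Dict[str, float] = {}


def execute_wave(
    ctx: InstanceContext,
    instance_id: int,
    preamble: Callable[[int], Dict[str, float]],
    main: Callable[[Dict[str, float]], float],
) -> float:
    """Run the preamble only for the first wave, then run the main block."""
    if not ctx.preamble_executed:
        # First wave: compute the instance uniforms and store them once.
        ctx.constant_memory.update(preamble(instance_id))
        ctx.preamble_executed = True
    # Every wave, including the first, reads the stored uniforms.
    return main(ctx.constant_memory)
```

Later waves of the same instance skip the preamble and read the values already stored, which is the saving the abstract describes.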
Abstract:
A method for processing data in a graphics processing unit includes receiving a code block of instructions common to a plurality of groups of threads of a shader; executing, by a group of threads of the plurality of groups of threads, the code block to create a result; storing the result of the code block in on-chip random access memory (RAM) accessible by each of the plurality of groups of threads; and, upon a determination that storing the result has completed, returning the result of the code block from on-chip RAM.
Abstract:
This disclosure describes an apparatus configured to process graphics data. The apparatus may include a fixed hardware pipeline configured to execute one or more functions on a current set of graphics data. The fixed hardware pipeline may include a plurality of stages including a bypassable portion of the plurality of stages. The apparatus may further include a shortcut circuit configured to route the current set of graphics data around the bypassable portion of the plurality of stages, and a controller positioned before the bypassable portion of the plurality of stages, the controller configured to selectively route the current set of graphics data to one of the shortcut circuit or the bypassable portion of the plurality of stages.
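The routing decision made by the controller can be sketched as follows. This is a software analogy only, under the assumption that each hardware stage can be modeled as a function; `run_pipeline`, the stage lists, and the `use_shortcut` flag are hypothetical names.

```python
from typing import Callable, Sequence

Stage = Callable[[int], int]


def run_pipeline(
    data: int,
    pre_stages: Sequence[Stage],
    bypassable_stages: Sequence[Stage],
    post_stages: Sequence[Stage],
    use_shortcut: bool,
) -> int:
    """Model a fixed pipeline whose middle portion can be bypassed."""
    for stage in pre_stages:
        data = stage(data)
    # The controller sits before the bypassable portion and picks one route.
    if not use_shortcut:
        for stage in bypassable_stages:
            data = stage(data)
    # With the shortcut, data arrives here unchanged by the middle stages.
    for stage in post_stages:
        data = stage(data)
    return data
```

For example, with one stage in each portion, `use_shortcut=True` applies only the first and last stages to the data.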
Abstract:
Techniques are described for determining whether the data of a variable is the same for each of a plurality of graphics items. If the data is determined to be the same, the techniques store the data in a storage location of a specialized shared general purpose register that is associated with the variable.
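The uniformity check can be sketched as follows. This is a minimal illustration, assuming the shared general purpose register can be modeled as a dictionary keyed by variable name; `store_if_uniform` and its parameters are hypothetical.

```python
from typing import Dict, Hashable, Sequence


def store_if_uniform(
    values: Sequence[Hashable],
    shared_gpr: Dict[str, Hashable],
    variable: str,
) -> bool:
    """Store one copy in the shared register if all items hold the same data.

    `values` holds the variable's data for each graphics item. Returns True
    when the data was uniform and a single copy was stored.
    """
    if not values:
        return False
    first = values[0]
    if all(v == first for v in values):
        shared_gpr[variable] = first  # one location serves every item
        return True
    return False  # data differs, so per-item storage is still needed
```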
Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for configurable aprons for expanded binning. Aspects of the present disclosure include identifying one or more pixel tiles in at least one bin and determining edge information for each pixel tile of the one or more pixel tiles. The edge information may be associated with one or more pixels adjacent to each pixel tile. The present disclosure further describes determining whether at least one adjacent bin is visible based on the edge information for each pixel tile, where the at least one adjacent bin may be adjacent to the at least one bin.
Abstract:
The present disclosure relates to methods and apparatus for graphics processing. An example method generally includes receiving, at a graphics processing unit (GPU), a plurality of commands corresponding to a plurality of draws across a frame, each of the plurality of commands indicating a depth test direction with respect to a low-resolution depth (LRZ) buffer for the corresponding draw. The method generally includes maintaining, at the GPU, an LRZ status buffer to store a corresponding depth test direction for a first command in time of the plurality of commands processed by the GPU. The method generally includes disabling, at the GPU, use of the LRZ buffer for depth testing for any of the plurality of commands remaining unprocessed after processing a command of the plurality of commands having a different depth test direction than the corresponding depth test direction stored in the LRZ status buffer.
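The direction tracking can be sketched as follows. This is an illustrative model, not the claimed method: `LrzStatus` and the string direction values are hypothetical, and the sketch assumes that the command whose direction differs also does not use the LRZ buffer, which the abstract does not state explicitly.

```python
from typing import Optional


class LrzStatus:
    """Hypothetical model of an LRZ status buffer for one frame."""

    def __init__(self) -> None:
        self.direction: Optional[str] = None  # direction of the first command
        self.lrz_enabled = True

    def process(self, direction: str) -> bool:
        """Return True if this command may use the LRZ buffer."""
        if not self.lrz_enabled:
            return False
        if self.direction is None:
            # First command in time: record its depth test direction.
            self.direction = direction
            return True
        if direction != self.direction:
            # Direction changed: disable LRZ for this and remaining commands.
            self.lrz_enabled = False
            return False
        return True
```

Once disabled, LRZ stays off for the rest of the frame in this sketch, even if a later command returns to the original direction.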
Abstract:
The present disclosure relates to methods and apparatus for hybrid rendering of video/graphics content by a graphics processing unit (GPU). The apparatus can configure the GPU of a display apparatus to perform multiple rendering passes for a frame of a scene to be displayed on a display device. Moreover, the apparatus can control the GPU to perform a first rendering pass of the multiple rendering passes to generate a first render target that is stored in either an on-chip graphics memory of the GPU or a system memory of the display apparatus. The apparatus can also control the GPU to perform a second rendering pass to generate a second render target that is alternatively stored in the system memory of the display apparatus or the on-chip graphics memory of the GPU.
Abstract:
A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than a specified threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
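The choice between early and late depth testing can be sketched as follows. This is a simplified software model under stated assumptions: `render`, the `(position, depth, payload)` fragment tuples, the "smaller depth is closer" convention, and the numeric workload value are all hypothetical and not taken from the abstract.

```python
import math
from typing import Callable, Dict, Iterable, Tuple

Fragment = Tuple[Tuple[int, int], float, str]  # (position, depth, payload)


def render(
    fragments: Iterable[Fragment],
    workload: float,
    threshold: float,
    shader: Callable[[str], str],
) -> Tuple[Dict[Tuple[int, int], str], int]:
    """Return the frame buffer and the number of shader invocations."""
    depth_buffer: Dict[Tuple[int, int], float] = {}
    frame_buffer: Dict[Tuple[int, int], str] = {}
    invocations = 0
    skip_early_test = workload < threshold  # light shader: test only late
    for pos, depth, payload in fragments:
        if not skip_early_test and depth >= depth_buffer.get(pos, math.inf):
            continue  # early depth test rejects an occluded fragment
        color = shader(payload)
        invocations += 1
        # Depth test after shading; on the early path the fragment has
        # already passed, so this only records the depth and the color.
        if depth < depth_buffer.get(pos, math.inf):
            depth_buffer[pos] = depth
            frame_buffer[pos] = color
    return frame_buffer, invocations
```

Both paths write the same final pixel values; they differ only in how many fragments are shaded before occluded ones are discarded.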