Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for fast incremental shared constants. In aspects, a CPU may determine or update shared constant data for a first draw call of a plurality of draw calls. The shared constant data, which may correspond to at least one shader, may be updated based on a draw call update for the first draw call. The CPU may communicate the updated shared constant data for the first draw call to a GPU. The GPU may receive, in at least one register, the updated shared constant data from the CPU and configure the at least one register based on the updated shared constant data corresponding to the draw call update of the first draw call of the plurality of draw calls.
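As a rough illustration of the flow described in this abstract, the sketch below models a CPU-side step that records only the shared constants changed by a draw call and a GPU-side step that writes them into a register file before the draw executes. The type and function names (RegisterFile, ConstantUpdate, record_draw, configure_registers) are hypothetical stand-ins, not the disclosed implementation.
    // Minimal sketch of per-draw incremental shared-constant updates.
    // All names here are illustrative assumptions, not a real driver API.
    #include <array>
    #include <cstdint>
    #include <vector>

    struct RegisterFile {
        std::array<uint32_t, 8> shared_constants{};  // registers visible to the shaders
    };

    struct ConstantUpdate {
        uint32_t slot;   // which shared-constant register changes for this draw
        uint32_t value;  // new value supplied by the CPU
    };

    // CPU side: record only the constants that change for a given draw call.
    std::vector<ConstantUpdate> record_draw(const std::vector<ConstantUpdate>& changed) {
        return changed;  // in a real driver this would be appended to a command buffer
    }

    // GPU side: apply the incremental update to the register file before the draw.
    void configure_registers(RegisterFile& regs, const std::vector<ConstantUpdate>& updates) {
        for (const auto& u : updates) {
            regs.shared_constants[u.slot] = u.value;
        }
    }

    int main() {
        RegisterFile regs;
        // Draw call 1 updates only slot 2; untouched slots keep their prior values.
        auto updates = record_draw({{2, 0x3f800000u}});
        configure_registers(regs, updates);
        return 0;
    }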
Abstract:
The present disclosure relates to methods and apparatus for graphics processing. Aspects of the present disclosure can determine a portion of a display area, where the portion of the display area is determined based on display content of the display area. Further, aspects of the present disclosure can communicate display information corresponding to the determined portion of the display area. Additionally, aspects of the present disclosure can update the display information corresponding to the determined portion of the display area. Aspects of the present disclosure can also communicate the updated display information corresponding to the determined portion of the display area. Aspects of the present disclosure can also render at least some display content of the display area corresponding to the determined portion of the display area. In some aspects, the updated display information can be based on the rendered display content of the display area.
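A minimal sketch of one way the portion determination could work, assuming the portion is a dirty rectangle derived by comparing successive frames; Rect, dirty_region, and send_region are illustrative names only and are not part of the disclosure.
    // Sketch: determine a portion of the display area from its content and
    // communicate display information only for that portion.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct Rect { int x0, y0, x1, y1; };

    // Determine the bounding box of pixels that differ between two frames.
    Rect dirty_region(const std::vector<uint8_t>& prev,
                      const std::vector<uint8_t>& cur, int width, int height) {
        Rect r{width, height, -1, -1};
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
                if (prev[y * width + x] != cur[y * width + x]) {
                    if (x < r.x0) r.x0 = x;
                    if (y < r.y0) r.y0 = y;
                    if (x > r.x1) r.x1 = x;
                    if (y > r.y1) r.y1 = y;
                }
        return r;
    }

    // Communicate display information only for the determined portion.
    void send_region(const Rect& r) {
        std::printf("update region (%d,%d)-(%d,%d)\n", r.x0, r.y0, r.x1, r.y1);
    }

    int main() {
        std::vector<uint8_t> prev(4 * 4, 0), cur(4 * 4, 0);
        cur[5] = 1;                              // one changed pixel at (1, 1)
        send_region(dirty_region(prev, cur, 4, 4));
        return 0;
    }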
Abstract:
The techniques of this disclosure include deferred batching of incremental constant loads. Graphics APIs provide lightweight constants for use by shaders. A graphics processing unit (GPU) driver allocates a buffer that contains a snapshot of the current lightweight constants, providing a complete set of state to serve as a starting point. From then on, updates to the lightweight constants may be appended to this buffer incrementally, with a command processor on the GPU inserting each update and increasing the size of the buffer. The incremental nature of the updates is captured, but they need not be issued on every draw call; instead, the incremental updates may be batch processed when a live draw call is encountered.
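The sketch below illustrates the batching idea under assumed names (ConstantBuffer, append_update, flush_on_live_draw): updates are appended to a buffer that starts from a snapshot, and the pending tail is applied in one batch when a live draw call arrives. It is a simplified model, not the disclosed command-processor behavior.
    // Sketch of deferred batching of incremental constant loads.
    #include <array>
    #include <cstdint>
    #include <vector>

    struct ConstantWrite { uint32_t slot; uint32_t value; };

    struct ConstantBuffer {
        std::array<uint32_t, 16> snapshot{};   // complete starting state
        std::vector<ConstantWrite> pending;    // incremental updates appended later
    };

    // Driver side: append an update and grow the buffer instead of issuing it now.
    void append_update(ConstantBuffer& buf, uint32_t slot, uint32_t value) {
        buf.pending.push_back({slot, value});
    }

    // Command-processor side: on a live draw, batch-process all pending updates.
    void flush_on_live_draw(ConstantBuffer& buf, std::array<uint32_t, 16>& hw_constants) {
        hw_constants = buf.snapshot;
        for (const auto& w : buf.pending) hw_constants[w.slot] = w.value;
        buf.snapshot = hw_constants;   // the applied state becomes the next snapshot
        buf.pending.clear();
    }

    int main() {
        ConstantBuffer buf;
        std::array<uint32_t, 16> hw{};
        append_update(buf, 3, 7u);       // deferred: not issued with this draw
        append_update(buf, 5, 9u);
        flush_on_live_draw(buf, hw);     // batch-applied when a live draw arrives
        return 0;
    }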
Abstract:
A graphics processing unit (GPU) is configured to access a first memory unit according to one of an unsecure mode and a secure mode. The GPU may include a memory access controller configured to allow the GPU to read data from only an unsecure portion of the first memory unit when the GPU is in the unsecure mode, and configured to allow the GPU to write data only to a secure portion of the first memory unit when the GPU is in the secure mode.
Abstract:
This disclosure proposes techniques for graphics processing. In one example, a graphics processing unit (GPU) is configured to access a first memory unit according to one of an unsecure mode and a secure mode. The GPU comprises a memory access controller configured to allow the GPU to read data from only an unsecure portion of the first memory unit when the GPU is in the unsecure mode, and configured to allow the GPU to write data only to a secure portion of the first memory unit when the GPU is in the secure mode.
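The access rules stated in these two abstracts can be summarized with the small sketch below; MemoryAccessController, Mode, and Portion are hypothetical names used only to illustrate the read and write checks, not the actual controller design.
    // Sketch: in the unsecure mode, reads are permitted only from the unsecure
    // portion; in the secure mode, writes are permitted only to the secure portion.
    enum class Mode { Unsecure, Secure };
    enum class Portion { Unsecure, Secure };

    struct MemoryAccessController {
        Mode mode = Mode::Unsecure;

        bool read_allowed(Portion target) const {
            // Unsecure mode: reads must come from the unsecure portion only.
            return mode != Mode::Unsecure || target == Portion::Unsecure;
        }

        bool write_allowed(Portion target) const {
            // Secure mode: writes must go to the secure portion only.
            return mode != Mode::Secure || target == Portion::Secure;
        }
    };

    int main() {
        MemoryAccessController mac;
        mac.mode = Mode::Unsecure;
        bool read_secure = mac.read_allowed(Portion::Secure);     // false in unsecure mode
        mac.mode = Mode::Secure;
        bool write_unsecure = mac.write_allowed(Portion::Unsecure); // false in secure mode
        return (read_secure || write_unsecure) ? 1 : 0;
    }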
Abstract:
The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may process a first workload of a plurality of workloads at each of multiple clusters in a GPU pipeline. The apparatus may also increment a plurality of performance counters during the processing of the first workload at each of the multiple clusters. Further, the apparatus may determine, at each of the multiple clusters, whether the first workload is finished processing. The apparatus may also read, upon determining that the first workload is finished processing, a value of each of the multiple clusters for each of the plurality of performance counters. Additionally, the apparatus may transmit an indication of the read value of each of the multiple clusters for all of the plurality of performance counters.
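A rough sketch of the counter flow follows, assuming a Cluster structure that increments its counters while processing a workload and a report step that reads the values only after that workload finishes; all names are illustrative assumptions.
    // Sketch of per-cluster performance counters read after workload completion.
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct Cluster {
        std::vector<uint64_t> counters;   // one slot per performance counter
        bool workload_done = false;

        explicit Cluster(std::size_t num_counters) : counters(num_counters, 0) {}

        void process_workload() {
            // Stand-in for real work: counters are incremented while the
            // workload moves through this cluster.
            for (auto& c : counters) ++c;
            workload_done = true;
        }
    };

    void report_counters(const std::vector<Cluster>& clusters) {
        for (std::size_t i = 0; i < clusters.size(); ++i) {
            if (!clusters[i].workload_done) continue;  // read only after completion
            for (std::size_t j = 0; j < clusters[i].counters.size(); ++j)
                std::printf("cluster %zu counter %zu = %llu\n", i, j,
                            (unsigned long long)clusters[i].counters[j]);
        }
    }

    int main() {
        std::vector<Cluster> clusters(3, Cluster(4));   // 3 clusters, 4 counters each
        for (auto& c : clusters) c.process_workload();  // first workload runs in each cluster
        report_counters(clusters);                      // read and transmit after completion
        return 0;
    }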
Abstract:
Methods, systems, and devices for graphics processing unit (GPU) operations are described. A device may monitor one or more states of a GPU during a duration. Based on monitoring the one or more GPU states, the device may determine an execution of a GPU command that is common to at least two GPU operations for clearing a GPU buffer. The device may determine whether the GPU clear command has previously been executed during a duration or a GPU cycle in which the device monitored the GPU states. The device may process the GPU clear command based on the determination of whether the GPU clear command has previously been executed. For example, the device may drop the GPU clear command based on the determination, or modify a portion of the GPU clear command and execute at least the modified portion of the GPU clear command.
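A minimal sketch of the clear-command handling, assuming a tracker keyed by buffer identifier; ClearCommand and ClearTracker are hypothetical names, and the drop-or-modify decision is reduced here to dropping a repeated clear within the monitored cycle.
    // Sketch: drop a GPU clear command if an equivalent clear already
    // executed during the monitored duration or cycle.
    #include <cstdint>
    #include <optional>
    #include <set>

    struct ClearCommand {
        uint32_t buffer_id;   // which GPU buffer the clear targets
        uint32_t clear_value;
    };

    struct ClearTracker {
        std::set<uint32_t> cleared_this_cycle;  // buffers already cleared in this duration

        // Returns the command to execute, or nothing when the clear is dropped.
        std::optional<ClearCommand> process(const ClearCommand& cmd) {
            if (cleared_this_cycle.count(cmd.buffer_id)) {
                return std::nullopt;                 // drop: already executed
            }
            cleared_this_cycle.insert(cmd.buffer_id);
            return cmd;                              // execute (possibly modified upstream)
        }

        void end_of_cycle() { cleared_this_cycle.clear(); }
    };

    int main() {
        ClearTracker tracker;
        ClearCommand clear_cmd{1, 0};
        auto first = tracker.process(clear_cmd);    // executed
        auto second = tracker.process(clear_cmd);   // dropped: already cleared this cycle
        return (first.has_value() && !second.has_value()) ? 0 : 1;
    }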
Abstract:
The present disclosure provides systems and methods for processing a non-resident page. The processing may include attempting to access the non-resident page, where an address for the non-resident page points to a memory page containing default values; determining, based on an indicator indicating that a particular non-resident page should not generate a page fault, that the non-resident page should not cause a page fault; and, when the access of the non-resident page is a read and the non-resident page should not cause a page fault, returning an indication that the memory read did not translate and returning the default value. Another example may discontinue a write when the access of the non-resident page is a write and the non-resident page should not cause a page fault.
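The sketch below models the read and write paths described above, assuming a per-page indicator that suppresses faults; PageEntry, read_page, and write_page are illustrative names only, not the disclosed mechanism.
    // Sketch: non-resident pages marked to suppress faults return a
    // "did not translate" indication with a default value on reads, and
    // writes to such pages are discontinued.
    #include <cstdint>
    #include <optional>

    struct PageEntry {
        bool resident = false;
        bool suppress_fault = false;   // indicator: this page should not page-fault
        uint32_t default_value = 0;    // backing page filled with default values
    };

    struct ReadResult {
        bool translated;               // false signals the read did not translate
        uint32_t value;
    };

    std::optional<ReadResult> read_page(const PageEntry& page, bool& fault) {
        fault = false;
        if (page.resident) {
            // A resident page would return its stored value; this sketch only
            // models the default-value backing page.
            return ReadResult{true, page.default_value};
        }
        if (page.suppress_fault) {
            return ReadResult{false, page.default_value};  // no fault, default value
        }
        fault = true;                  // ordinary non-resident access raises a page fault
        return std::nullopt;
    }

    bool write_page(const PageEntry& page, bool& fault) {
        fault = false;
        if (page.resident) return true;
        if (page.suppress_fault) return false;   // discontinue the write, no fault
        fault = true;
        return false;
    }

    int main() {
        PageEntry page;                          // non-resident page
        page.suppress_fault = true;              // marked to suppress faults
        bool fault = false;
        auto result = read_page(page, fault);    // default value, did not translate
        bool wrote = write_page(page, fault);    // write is discontinued
        return (result && !result->translated && !wrote && !fault) ? 0 : 1;
    }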
Abstract:
The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.
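A rough sketch of the preamble/main split follows, assuming the preamble resolves bindless constant loads before the main instruction section and the constants are distributed to the streaming processor; all structures and functions below are illustrative assumptions rather than the disclosed design.
    // Sketch: a processor dedicated to preamble sections loads constant data,
    // then the main section and constants go to a streaming processor.
    #include <cstdint>
    #include <vector>

    struct ShaderProgram {
        std::vector<uint32_t> preamble;        // bindless constant-load instructions
        std::vector<uint32_t> main_section;    // main instruction section
    };

    struct ConstantBank { std::vector<uint32_t> data; };

    // Scalar processor dedicated to preamble sections: resolve bindless handles
    // into loaded constant data (modeled here as copying the handle values).
    ConstantBank run_preamble(const std::vector<uint32_t>& preamble) {
        return ConstantBank{preamble};
    }

    // Streaming processor: receives the main section plus the resolved constants.
    void run_main_section(const std::vector<uint32_t>& instructions,
                          const ConstantBank& constants) {
        (void)instructions;
        (void)constants;   // real shader execution is omitted in this sketch
    }

    void execute(const ShaderProgram& program) {
        ConstantBank constants = run_preamble(program.preamble);
        run_main_section(program.main_section, constants);
    }

    int main() {
        ShaderProgram program{{0x10u, 0x11u}, {0x20u, 0x21u, 0x22u}};
        execute(program);
        return 0;
    }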