Abstract:
In general, techniques are described for visibility-based state updates in graphics processing units (GPUs). A device that renders image data, comprising a memory configured to store state data and a GPU, may implement the techniques. The GPU may be configured to perform a multi-pass rendering process to render an image from the image data. During a first pass of the multi-pass rendering process, the GPU determines visibility information for a plurality of objects defined by the image data. The visibility information indicates whether each of the plurality of objects will be visible in the image rendered during a second pass of the multi-pass rendering process. Based on the visibility information, the GPU then retrieves the state data from the memory for use by the second pass in rendering the plurality of objects of the image data.
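A minimal CPU-side sketch of the two-pass idea follows: a first pass records per-object visibility, and the second pass fetches state data from memory only for objects marked visible. All types and functions here (Object, StateData, visibilityPass, renderPass) are hypothetical stand-ins, not the disclosure's actual interfaces.

```cpp
#include <cstdio>
#include <vector>

struct Object    { float depth; bool onScreen; };
struct StateData { int blendMode; int shaderId; };

// First pass: decide, per object, whether it will contribute to the image.
static std::vector<bool> visibilityPass(const std::vector<Object>& objs) {
    std::vector<bool> visible(objs.size());
    for (size_t i = 0; i < objs.size(); ++i)
        visible[i] = objs[i].onScreen && objs[i].depth < 1.0f; // trivial test
    return visible;
}

// Second pass: retrieve state from memory only for visible objects,
// then "render" them; invisible objects trigger no state fetch at all.
static void renderPass(const std::vector<Object>& objs,
                       const std::vector<bool>& visible,
                       const std::vector<StateData>& stateMemory) {
    for (size_t i = 0; i < objs.size(); ++i) {
        if (!visible[i]) continue;            // skip state fetch entirely
        const StateData& s = stateMemory[i];  // state data read from memory
        std::printf("draw object %zu with shader %d\n", i, s.shaderId);
    }
}

int main() {
    std::vector<Object> objs = {{0.2f, true}, {0.5f, false}, {2.0f, true}};
    std::vector<StateData> state = {{0, 7}, {1, 8}, {0, 9}};
    renderPass(objs, visibilityPass(objs), state);
}
```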
Abstract:
This disclosure is directed to deferred preemption techniques for scheduling graphics processing unit (GPU) command streams for execution on a GPU. A host CPU is described that is configured to control a GPU to perform deferred-preemption scheduling. For example, a host CPU may select one or more locations in a GPU command stream at which preemption is allowed to occur in response to receiving a preemption notification, and may place one or more tokens in the GPU command stream based on the selected one or more locations. The tokens may indicate to the GPU that preemption is allowed to occur at the selected locations. This disclosure further describes a GPU configured to preempt execution of a GPU command stream based on one or more tokens placed in the command stream.
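The following is a toy sketch of deferred preemption: the host inserts tokens at chosen points in a command stream, and the (simulated) GPU checks a pending-preemption flag only when it reaches a token. The Cmd enum, token placement policy, and executor loop are illustrative assumptions, not a real driver or hardware API.

```cpp
#include <atomic>
#include <cstdio>
#include <vector>

enum class Cmd { DRAW, STATE_UPDATE, PREEMPT_OK };

static std::atomic<bool> preemptRequested{false};

// Host side: place a token after every draw, i.e. preemption is allowed
// only on draw boundaries (one possible selection policy).
static std::vector<Cmd> buildStream() {
    std::vector<Cmd> s;
    for (int i = 0; i < 3; ++i) {
        s.push_back(Cmd::STATE_UPDATE);
        s.push_back(Cmd::DRAW);
        s.push_back(Cmd::PREEMPT_OK); // host-selected preemption point
    }
    return s;
}

// "GPU" side: execute until a token is reached while preemption is
// pending; return the resume index so the stream can continue later.
static size_t execute(const std::vector<Cmd>& s, size_t start) {
    for (size_t pc = start; pc < s.size(); ++pc) {
        if (s[pc] == Cmd::PREEMPT_OK) {
            if (preemptRequested.exchange(false)) {
                std::printf("preempted at command %zu\n", pc);
                return pc + 1;   // resume after the token
            }
            continue;
        }
        std::printf("exec %s\n", s[pc] == Cmd::DRAW ? "DRAW" : "STATE");
    }
    return s.size();
}

int main() {
    auto stream = buildStream();
    preemptRequested = true;        // e.g. a higher-priority stream arrived
    size_t resume = execute(stream, 0);
    execute(stream, resume);        // finish the preempted stream later
}
```

Checking the flag only at tokens is what makes the preemption "deferred": the stream is never interrupted mid-draw, only at the boundaries the host selected.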
Abstract:
This disclosure describes techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values. The techniques may include retrieving a destination alpha value from a bin buffer, the destination alpha value being generated in response to processing a first pixel associated with a first primitive. The techniques may further include determining, based on the destination alpha value, whether to perform an action that causes one or more texture values for a second pixel not to be retrieved from a texture buffer. In some examples, the action may include discarding the second pixel from a pixel processing pipeline prior to the second pixel arriving at a texture mapping stage of the pixel processing pipeline. The second pixel may be associated with a second primitive different from the first primitive.
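An illustrative sketch of the destination-alpha early-out: before fetching texture values for a pixel of a later primitive, the pipeline reads the destination alpha left in the bin buffer by an earlier primitive; if the pixel is already fully opaque (assuming front-to-back draw order), the pixel is discarded before the texture mapping stage. Names such as binBufferAlpha and fetchTexel are hypothetical.

```cpp
#include <array>
#include <cstdio>

constexpr int W = 4, H = 4;
static std::array<float, W * H> binBufferAlpha{}; // destination alpha per pixel

static float fetchTexel(int x, int y) {              // stand-in for a costly
    std::printf("texture fetch at (%d,%d)\n", x, y); // texture buffer access
    return 1.0f;
}

// Process one pixel of a primitive, primitives drawn front-to-back.
static void shadePixel(int x, int y, float srcAlpha) {
    float dstAlpha = binBufferAlpha[y * W + x];       // read the bin buffer
    if (dstAlpha >= 1.0f) {
        // Pixel already fully covered: discard before the texture mapping
        // stage, so no texture values are read from the texture buffer.
        return;
    }
    fetchTexel(x, y);                                 // only reached if needed
    binBufferAlpha[y * W + x] = dstAlpha + srcAlpha * (1.0f - dstAlpha);
}

int main() {
    shadePixel(1, 1, 1.0f); // first primitive: fetches, leaves alpha at 1.0
    shadePixel(1, 1, 0.5f); // second primitive: discarded, no texture fetch
}
```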
Abstract:
The example techniques described in this disclosure may be directed to synchronization between producer shaders and consumer shaders. For example, a graphics processing unit (GPU) may execute a producer shader to produce graphics data. After production of the graphics data completes, the producer shader may store a value indicative of the amount of graphics data produced. After that value is stored, the GPU may execute one or more consumer shaders to consume the produced graphics data.
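A host-side analogue of the handoff follows: the producer writes its output, then publishes a count of items produced; the consumer starts only after the count is visible and reads exactly that many items. The std::atomic release/acquire pair stands in for GPU memory ordering, and the shader names are hypothetical.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

static std::vector<int> buffer(64);       // shared output buffer
static std::atomic<int> producedCount{0}; // 0 means "not yet published"

static void producerShader() {
    int n = 5;
    for (int i = 0; i < n; ++i) buffer[i] = i * i;     // produce graphics data
    producedCount.store(n, std::memory_order_release); // publish the amount
}

static void consumerShader() {
    int n;
    while ((n = producedCount.load(std::memory_order_acquire)) == 0)
        std::this_thread::yield();        // wait until the count is stored
    for (int i = 0; i < n; ++i)           // consume exactly n produced items
        std::printf("consume %d\n", buffer[i]);
}

int main() {
    std::thread consumer(consumerShader);
    std::thread producer(producerShader);
    producer.join();
    consumer.join();
}
```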
Abstract:
This disclosure describes techniques for extending the architecture of a general-purpose graphics processing unit (GPGPU) with parallel processing units to allow efficient processing of pipeline-based applications. The techniques include configuring local memory buffers connected to parallel processing units operating as stages of a processing pipeline to hold data for transfer between the parallel processing units. The local memory buffers allow on-chip, low-power, direct data transfer between the parallel processing units. The local memory buffers may include hardware-based data flow control mechanisms to enable transfer of data between the parallel processing units. In this way, data may be passed directly from one parallel processing unit to the next in the processing pipeline via the local memory buffers, in effect transforming the parallel processing units into a series of pipeline stages.
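A software sketch of the pipeline idea: two "processing units" (threads) linked by a small local buffer with blocking flow control, so data moves stage to stage without a round trip through global memory. The bounded queue models the hardware data-flow control; the capacity and stage workloads are illustrative assumptions.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

class LocalBuffer {                 // bounded FIFO between two pipeline stages
    std::queue<int> q_;
    std::mutex m_;
    std::condition_variable cv_;
    const size_t cap_ = 4;          // small "on-chip" capacity
public:
    void push(int v) {              // stalls the producer stage when full
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return q_.size() < cap_; });
        q_.push(v);
        cv_.notify_all();
    }
    int pop() {                     // stalls the consumer stage when empty
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty(); });
        int v = q_.front(); q_.pop();
        cv_.notify_all();
        return v;
    }
};

int main() {
    LocalBuffer buf;
    std::thread stage1([&] {        // first stage produces into the buffer
        for (int i = 0; i < 8; ++i) buf.push(i * 2);
    });
    std::thread stage2([&] {        // next stage consumes directly from it
        for (int i = 0; i < 8; ++i) std::printf("stage2 got %d\n", buf.pop());
    });
    stage1.join();
    stage2.join();
}
```

The blocking push/pop pair is the flow-control point: a full buffer stalls the upstream stage and an empty buffer stalls the downstream stage, which is the behavior the hardware mechanisms are described as providing.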