Abstract:
A graphics processing unit 2 includes a texture pipeline 6 which performs filter operations upon texture values. If the texture values are integer texture values, then they may be processed by the texture pipeline in a variable order corresponding to the order in which they are retrieved from a memory 4. If the texture values are floating point texture values, then they are processed in a fixed order to ensure result invariance, as the filter operation is non-associative for floating point values. The filter operation is not commenced until all of the floating point texture values have been retrieved from the memory 4 and are available for processing.
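The ordering rule can be illustrated with a minimal Python sketch. The function names, the `(index, value)` arrival format and the weighted-sum filter are assumptions made for illustration; the abstract does not specify the filter or the hardware interface.

```python
def filter_integer(arrivals, weights):
    """Accumulate texels in arrival order; safe for integers, whose addition is associative."""
    total = 0
    for index, value in arrivals:          # (texel index, value) in retrieval order
        total += weights[index] * value
    return total


def filter_float(arrivals, weights):
    """Wait for every texel, then accumulate in a fixed index order."""
    texels = dict(arrivals)                # buffer until all values have arrived
    if len(texels) != len(weights):
        raise ValueError("not all texels have been retrieved yet")
    total = 0.0
    for index in range(len(weights)):      # fixed order, independent of arrival order
        total += weights[index] * texels[index]
    return total
```

Feeding floating point values through the arrival-order routine can give different sums for different retrieval orders (for example 1e16, 1.0 and -1e16), which is the variation the fixed-order path avoids.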
Abstract:
When encoding a texture map 1 for use in graphics processing, the texture map is divided into a plurality of equal-sized blocks 2 of texture data elements. Each block 2 of texture data elements is then encoded as a block of texture data 5 that includes a set of integer values to be used to generate a set of base data values for the block, and a set of index values indicating how to use the base data values to generate data values for the texture data elements that the block represents. The integer values and the index values are both encoded in an encoded texture data block using a combination of base-n values, where n is greater than two, and base-2 values. Predefined bit representations are used to represent plural base-n values (n>2) collectively, and the bits of the bit representations representing the base-n values (n>2) are interleaved with bits representing the base-2 values in the encoded texture data block.
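As a rough illustration of mixing base-n and base-2 values, the Python sketch below splits each value into one base-3 digit (a trit) plus some low binary bits, and packs five trits into a single number below 256 (since 3^5 = 243). The helper names are hypothetical, the plain base-3 packing stands in for the predefined bit representations, and the bit interleaving is not reproduced because the abstract does not give the layout.

```python
def encode_block(values, low_bits):
    """Split five values into one packed trit field and five low-bit fields."""
    if len(values) != 5:
        raise ValueError("this sketch packs exactly five values")
    mask = (1 << low_bits) - 1
    packed = 0
    lows = []
    for value in reversed(values):         # first value ends up as the lowest trit
        trit = value >> low_bits
        if not 0 <= trit <= 2:
            raise ValueError("value out of range for one trit plus low bits")
        packed = packed * 3 + trit
        lows.append(value & mask)
    lows.reverse()
    return packed, lows


def decode_block(packed, lows, low_bits):
    """Inverse of encode_block."""
    values = []
    for low in lows:
        packed, trit = divmod(packed, 3)   # peel off the lowest trit
        values.append((trit << low_bits) | low)
    return values
```

With `low_bits=2` each value spans 0 to 11, and five such values occupy 8 + 5×2 = 18 bits in this sketch, against 20 bits when each is stored as a plain 4-bit number.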
Abstract:
When processing a set of tiles to generate an output in a tile based graphics processing pipeline, the pipeline, for one or more tiles of the set of tiles, renders one or more render targets containing data to be used in a processing operation (602), and stores the render targets in the tile buffer (604). It also stores some but not all of the sampling position values for a render target or targets for use when processing an adjacent tile of the set of tiles (606). It then performs a processing operation for the tile using the stored render target or targets (608) and one or more stored sampling position values from another, adjacent tile of the set of tiles (610), to generate an output for the tile (612).
Abstract:
A data processing system includes one or more processors 4, 5, 6, 7 operable to initiate atomic memory requests for execution threads and plural data caches 8, 9, 10, 11 that are used to store data needed to perform an atomic memory operation when an atomic memory operation is to be performed. When atomic operations are to be performed against a data cache, the results of atomic operations that are to access the same memory location are accumulated in a temporary cache line in the data cache pending the arrival in the cache of the “true” cache line from memory. The accumulated results of the atomic operations stored in the temporary cache line are then combined with the cache line from memory when the cache line arrives in the cache. Individual atomic values can also be reconstructed once the cache line arrives at the cache.
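A minimal Python sketch of the accumulate-then-combine idea, for atomic addition only. The class and method names are hypothetical; starting the temporary line at the identity value 0 and treating the "individual atomic values" as the value each thread would have observed before its own add are both assumptions, not details taken from the abstract.

```python
class TemporaryLine:
    """Accumulates atomic adds to one location while the real cache line is in flight."""

    def __init__(self):
        self.accumulated = 0    # assumed start value: the identity for addition
        self.pending = []       # (thread id, partial sum before that thread's add)

    def atomic_add(self, thread_id, operand):
        """Record the request and fold its operand into the temporary line."""
        self.pending.append((thread_id, self.accumulated))
        self.accumulated += operand

    def merge(self, memory_value):
        """Combine with the line from memory and rebuild per-thread values."""
        returns = {tid: memory_value + partial for tid, partial in self.pending}
        return memory_value + self.accumulated, returns
```

For example, adds of 5, 7 and 1 against a location whose true value turns out to be 100 merge to 113, with the three threads seeing 100, 105 and 112 respectively.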
Abstract:
When an atomic operation is to be executed for a thread group by an execution stage of a data processing system, it is determined whether there is a set of threads for which the atomic operation for the threads accesses the same memory location. If so, the arithmetic operation for the atomic operation is performed for the first thread in the set of threads using an identity value for the arithmetic operation for the atomic operation and the first thread's register value for the atomic operation, and is performed for each other thread in the set of threads using the thread's register value for the atomic operation and the result of the arithmetic operation for the preceding thread in the set of threads, to thereby generate for the final thread in the identified set of threads a combined result of the arithmetic operation for the set of threads.
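The chained evaluation described above can be sketched in Python as follows. The function name and the list-of-register-values input are assumptions for illustration; the threads are taken to have already been identified as accessing the same memory location.

```python
import operator


def combine_atomics(register_values, op=operator.add, identity=0):
    """Chain the arithmetic operation through threads that hit the same address.

    The first thread combines the identity value with its register value; each
    later thread combines its register value with the preceding thread's result.
    """
    results = []
    previous = identity
    for value in register_values:
        previous = op(previous, value)
        results.append(previous)
    combined = results[-1] if results else identity
    return results, combined
```

The same routine covers other operations by swapping the operator and its identity, for instance `max` with negative infinity.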
Abstract:
A method of operating a graphics processor that is configured to execute a graphics processing pipeline is provided. The method comprises the graphics processor reading, from an index buffer in external memory, a block of data comprising plural sets of indices, each set of indices comprising a sequence of indices indexing a set of vertices that defines a primitive of a plurality of primitives to be processed by the graphics processing pipeline. The graphics processor compresses the block of data to form a compressed version of the block of data, and stores the compressed version of the block of data in an internal memory of the graphics processor.
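The abstract does not say which compression scheme is used. Purely as an illustration, the Python sketch below assumes delta coding followed by a zigzag variable-length byte encoding, which tends to suit index buffers because consecutive indices are often close together. Both function names are hypothetical.

```python
def compress_indices(indices):
    """Delta-encode the indices, then store each delta as a zigzag varint."""
    out = bytearray()
    previous = 0
    for index in indices:
        delta = index - previous
        previous = index
        zigzag = 2 * delta if delta >= 0 else -2 * delta - 1   # map to non-negative
        while zigzag >= 0x80:
            out.append((zigzag & 0x7F) | 0x80)                 # continuation bit set
            zigzag >>= 7
        out.append(zigzag)
    return bytes(out)


def decompress_indices(data):
    """Inverse of compress_indices."""
    indices = []
    previous = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:                                        # more bytes follow
            shift += 7
            continue
        delta = value >> 1 if value % 2 == 0 else -((value + 1) >> 1)
        previous += delta
        indices.append(previous)
        value = 0
        shift = 0
    return indices
```

A block of triangle indices such as `[0, 1, 2, 2, 1, 3, 4, 5, 6]` round-trips through these two functions and occupies one byte per index instead of the four bytes a 32-bit index would need.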
Abstract:
A graphics processing pipeline comprises vertex shading circuitry that performs a first vertex shading operation to vertex shade position attributes of vertices of a set of vertices to be processed by the graphics processing pipeline, to generate, inter alia, a separate vertex shaded position attribute value for each view of a set of plural different views. Tiling circuitry then determines, for the vertices that have been subjected to the first vertex shading operation, whether the vertices should be processed further. Vertex shading circuitry then performs a second vertex shading operation on the vertices that it has been determined should be processed further, to vertex shade the remaining vertex attributes for each such vertex, to generate, inter alia, a single vertex shaded attribute value for the set of plural views.
Abstract:
A programmable execution unit (42) of a graphics processor includes a functional unit (50) that is operable to execute instructions (51). The output of the functional unit (50) can both be written to a register file (46) and fed back directly as an input to the functional unit by means of a feedback circuit (52). Correspondingly, an instruction that is to be executed by the functional unit (50) can select as its inputs either the fed-back output (52) from the execution of the previous instruction, or inputs from the registers (46). A register access descriptor (54) between each instruction in a group of instructions (53) specifies the registers whose values will be available on the register ports that the functional unit will read when executing the instruction, and the register address where the result of the execution of the instruction will be written to. The programmable execution unit (42) executes groups of instructions (53) that are to be executed atomically.
Abstract:
A data processing system determines, for a stream of instructions to be executed, whether there are any instructions that can be re-ordered in the instruction stream 41, assigns each such instruction to an instruction completion tracker, and includes in the encoding for the instruction an indication of the instruction completion tracker it has been assigned to 42. For each instruction in the instruction stream, an indication of which instruction completion trackers, if any, the instruction depends on is also provided 43, 44. Then, when an instruction that is indicated as being dependent on an instruction completion tracker is to be executed, the status of the relevant instruction completion tracker is checked before executing the instruction.
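The tracker check can be sketched in Python as below. The `Trackers` class, the dictionary fields `tracker` and `waits_on`, and the counter-per-tracker representation are all assumptions made for illustration; the abstract does not describe how tracker status is stored.

```python
class Trackers:
    """A small set of instruction completion trackers, each an outstanding count."""

    def __init__(self, count):
        self.outstanding = [0] * count

    def issue(self, slot):
        self.outstanding[slot] += 1

    def complete(self, slot):
        self.outstanding[slot] -= 1

    def ready(self, slots):
        """True when every listed tracker has no outstanding instructions."""
        return all(self.outstanding[slot] == 0 for slot in slots)


def try_issue(instruction, trackers):
    """Issue the instruction only if every tracker it depends on is idle."""
    if not trackers.ready(instruction["waits_on"]):
        return False
    if instruction["tracker"] is not None:   # re-orderable: register with its tracker
        trackers.issue(instruction["tracker"])
    return True
```

A load assigned to tracker 0 issues immediately, while a dependent add that lists tracker 0 in `waits_on` is held back until `complete(0)` has been called.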
Abstract:
A method of operating a data processing system comprises maintaining a record of a set of processing passes to be performed by processing pass circuitry of the data processing system. The method comprises performing cycles of operation in which it is considered whether or not the data required for a subset of processing passes is stored in a local cache. The subset of processing passes that is considered in a subsequent scan of the record comprises at least one processing pass that was not considered in the previous scan of the record, regardless of whether or not the data considered in the previous scan is determined as being stored in the cache. The method provides an efficient way to identify processing passes that are ready to be performed.
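One way to picture the scanning scheme is a window that always moves on through the record, whatever the previous scan found. The Python sketch below assumes a list of passes with a hypothetical `data` key, a set standing in for the local cache, and a fixed window width; none of these details come from the abstract, and removal of completed passes from the record is left out.

```python
def scan(record, cache, start, width):
    """Check `width` passes starting at `start`; return (ready passes, next start).

    The window always advances, so the next scan covers passes that this one
    did not, whether or not this scan found their data in the cache.
    """
    if not record:
        return [], 0
    count = min(width, len(record))
    window = [record[(start + i) % len(record)] for i in range(count)]
    ready = [p for p in window if p["data"] in cache]
    return ready, (start + count) % len(record)
```

Calling `scan` repeatedly and feeding the returned start position back in walks the whole record in a round-robin fashion.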