摘要:
Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.
摘要:
An arithmetic logic stage in a graphics processor unit pipeline includes a number of arithmetic logic units (ALUs) and at least one buffer that stores pixel data for a group of pixels. Each clock cycle, the buffer stores one row of a series of rows of pixel data. A deserializer deserializes the rows of pixel data before the pixel data is placed in the buffer. After the buffer accumulates all rows of pixel data for a pixel, then the pixel data for the pixel can be operated on by the ALUs.
摘要:
An arithmetic logic stage in a graphics processor unit includes arithmetic logic units (ALUs) and global registers. The registers contain global values for a group of pixels. Global values may be read from any of the registers, regardless of which of the pixels is being operated on by the ALUs. However, when writing results of the ALU operations, only some of the global registers are candidates to be written to, depending on the pixel number. Accordingly, overwriting of data is prevented.
摘要:
Sourcing immediate values from a very long instruction word includes determining if a VLIW sub-instruction expansion condition exists. If the sub-instruction expansion condition exists, operation of a portion of a first arithmetic logic unit component is minimized. In addition, a part of a second arithmetic logic unit component is expanded by utilizing a block of a very long instruction word, which is normally utilized by the first arithmetic logic unit component, for the second arithmetic logic unit component if the sub-instruction expansion condition exists.
摘要:
An arithmetic logic stage in a graphics processor unit pipeline includes a number of arithmetic logic units (ALUs) and at least one buffer that stores pixel data for a group of pixels. Each clock cycle, the buffer stores one row of a series of rows of pixel data. A deserializer deserializes the rows of pixel data before the pixel data is placed in the buffer. After the buffer accumulates all rows of pixel data for a pixel, then the pixel data for the pixel can be operated on by the ALUs.
摘要:
An arithmetic logic stage in a graphics processor unit includes arithmetic logic units (ALUs) and global registers. The registers contain global values for a group of pixels. Global values may be read from any of the registers, regardless of which of the pixels is being operated on by the ALUs. However, when writing results of the ALU operations, only some of the global registers are candidates to be written to, depending on the pixel number. Accordingly, overwriting of data is prevented.
摘要:
An apparatus is disclosed. The apparatus comprises an instruction mapping table, which includes a plurality of instruction counts and a plurality of instruction pointers each corresponding with one of the instruction counts. Each instruction pointer identifies a next instruction for execution. Further, each instruction count specifies a number of instructions to execute beginning with the next instruction. The apparatus also has a data operation unit adapted to receive a data group and adapted to execute on the received data group the number of instructions specified by a current instruction count of the instruction mapping table beginning with the next instruction identified by a current instruction pointer of the instruction mapping table before proceeding with another data group.
摘要:
A method in system for latency buffered scoreboarding in a graphics pipeline of a graphics processor. The method includes receiving a graphics primitive for rasterization in a raster stage of a graphics processor and rasterizing the graphics primitive to generate a plurality pixels related to the graphics primitive. An ID stored to account for an initiation of parameter evaluation for each of the plurality of pixels as the pixels are transmitted to a subsequent stage of the graphics processor. A buffer is used to store the fragment data resulting from the parameter evaluation for each of the plurality of pixels by the subsequent stage. The ID and the fragment data from the buffering are compared to determine whether they correspond to one another. The completion of parameter evaluation for each of the plurality of pixels is accounted for when the ID and the fragment data match and as the fragment data is written to a memory.
摘要:
Vertex data can be accessed for a graphics primitive. The vertex data includes homogeneous coordinates for each vertex of the primitive. The homogeneous coordinates can be used to determine perspective-correct barycentric coordinates that are normalized by the area of the primitive. The normalized perspective-correct barycentric coordinates can be used to determine an interpolated value of an attribute for the pixel. These operations can be performed using adders and multipliers implemented in hardware.
摘要:
Image-based data, such as a block of texel data, is accessed. The data includes sets of color component values. A luminance value is computed for each set of color components values, generating a range of luminance values. A first set and a second set of color component values that correspond to the minimum and maximum luminance values are selected from the sets of color component values. A third set of color component values can be mapped to an index that identifies how the color component values of the third set can be decoded using the color component values of the first and second sets. The index value is selected by determining where the luminance value for the third set lies in the range of luminance values.