Abstract:
A system, method and computer program product are provided for performing depth testing and blending operations in a first mode and a second mode. In the first mode, a circuit processes a first number (m) of first pixels per clock cycle, each of the first pixels including both color values and depth values. In the second mode, the circuit processes a second number (n) of second pixels per clock cycle. Each of the second pixels includes the depth values and not the color values. Further, the second number (n) is greater than the first number (m).
Abstract:
A scalable shader architecture is disclosed. In accordance with that architecture, a shader includes multiple shader pipelines, each of which can perform processing operations on rasterized pixel data. Shader pipelines can be functionally removed as required, thus preventing a defective shader pipeline from causing a chip rejection. The shader includes a shader distributor that processes rasterized pixel data and then selectively distributes the processed rasterized pixel data to the various shader pipelines, beneficially in a manner that balances workloads. A shader collector formats the outputs of the various shader pipelines into the proper order to form shaded pixel data. A shader instruction processor (scheduler) programs the individual shader pipelines to perform their intended tasks. Each shader pipeline has a shader gatekeeper that interacts with the shader distributor and with the shader instruction processor such that pixel data that passes through the shader pipelines is controlled and processed as required.
Abstract:
A system and method for compressing and decompressing a texture image that: (1) compresses each texel to 8 bits, and when decompressed, each texel is of a quality comparable to a 256 color palettized image; (2) increases the efficiency of the decompression system and method by eliminating complex operations, e.g., multiplication; and (3) increases the efficiency of the system and method when switching between textures that use different palettes, when compared to conventional systems and methods. The invention compresses a texture image, stores the compressed texture image, and quickly and efficiently decompresses the texture image when determining a value of a pixel. The texture image compression technique utilizes a palettized color space that more closely matches the colors in the texture image while allocating an unequal number of bits to the color channels. Each texel in the texture image is converted to an 8-bit value in the selected color space, and a decompression table is generated that represents the RGB values for each texel stored in the selected color space. In order to map the texture image to the object, one or more texels that are associated with each pixel are decompressed. The present invention quickly and efficiently decompresses each texel using a hardware decompression unit. The decompression unit does not perform any multiplication operations.
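The table-driven decompression described above can be illustrated with a small sketch. It is not the patented method: the 3/3/2 bit split, the use of plain RGB as the "selected color space", and the function names `build_table`, `compress_texel`, and `decompress_texel` are all assumptions made for illustration. The point it shows is that once a 256-entry table exists, decompressing a texel is a single lookup with no multiplication.

```python
# Assumed channel layout (not from the abstract): 3 bits, 3 bits, 2 bits.

def build_table():
    """Precompute the 256-entry decompression table: 8-bit code -> (R, G, B)."""
    table = []
    for code in range(256):
        r3 = code >> 5            # top 3 bits
        g3 = (code >> 2) & 0b111  # middle 3 bits
        b2 = code & 0b11          # bottom 2 bits
        # Expand to 8 bits by replicating the bit pattern (shifts and ORs only).
        r = (r3 << 5) | (r3 << 2) | (r3 >> 1)
        g = (g3 << 5) | (g3 << 2) | (g3 >> 1)
        b = (b2 << 6) | (b2 << 4) | (b2 << 2) | b2
        table.append((r, g, b))
    return table

def compress_texel(rgb):
    """Quantize an 8-bit-per-channel texel to one 8-bit code."""
    r, g, b = rgb
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)

def decompress_texel(table, code):
    """Decompression is a table lookup; no multiplication is involved."""
    return table[code]

table = build_table()
code = compress_texel((255, 0, 0))
print(code, decompress_texel(table, code))
```

Switching to a texture with a different color space would, under these assumptions, only require loading a different 256-entry table.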
Abstract:
A system and method are provided for a dedicated hardware-implemented viewport operation in a graphics pipeline. Included is a transform/lighting module for transforming and lighting vertex data. Also provided is viewport hardware coupled to the transform/lighting module for performing a viewport operation on the vertex data. A rasterizer is coupled to the viewport hardware for rendering the vertex data.
Abstract:
Circuits, methods, and apparatus that provide the die area and power savings of a single-ported memory with the performance advantages of a multiported memory. One example provides register allocation methods for storing data in a multiple-bank register file. In a thin register allocation method, data for a process is stored in a single bank. In this way, different processes use different banks to avoid conflicts. In a fat register allocation method, processes store data in each bank. In this way, if one process uses a large number of registers, those registers are spread among the banks, avoiding a situation where one bank is filled and other processes are forced to share a reduced number of banks. In a hybrid register allocation method, processes store data in more than one bank, but fewer than all the banks. Each of these methods may be combined in varying ways.
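The three allocation methods can be compared with a short sketch. The mapping below is a hypothetical one chosen for illustration (the abstract does not specify how banks are selected); `banks_for_process` and its parameters are assumed names. It returns, for each register a process uses, the bank that register would land in.

```python
def banks_for_process(process_id, num_regs, num_banks, banks_per_process):
    """Return the bank index for each of a process's registers.

    banks_per_process == 1          -> thin allocation (one bank per process)
    banks_per_process == num_banks  -> fat allocation (striped over all banks)
    anything in between             -> hybrid allocation
    """
    # Assumed policy: each process starts at a different group of banks.
    start = (process_id * banks_per_process) % num_banks
    return [(start + i % banks_per_process) % num_banks for i in range(num_regs)]

NUM_BANKS = 4
print("thin  :", banks_for_process(1, 4, NUM_BANKS, 1))
print("fat   :", banks_for_process(1, 4, NUM_BANKS, NUM_BANKS))
print("hybrid:", banks_for_process(1, 4, NUM_BANKS, 2))
```

Under this assumed mapping, thin and fat allocation are the two extremes of one parameter, which is one way to read the statement that the methods may be combined.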
Abstract:
A texture compositing apparatus and method for combining multiple independent texture colors in a variety of ways in a single execution pass using a single texture compositing unit (TCU) per texture. The TCU receives a control signal, a blend factor, a local data signal (C_local / A_local), and an output data signal (C_in / A_in) generated by another TCU; the local data signal and the output data signal each represent a texture color in RGBA format. Based upon the control signal, the TCU can generate an output signal based on a variety of functions. The outputs that can be generated include but are not limited to: (1) zero; (2) one; (3) C_in; (4) C_local; (5) C_in + C_local; (6) C_in - C_local; (7) C_in * C_local; (8) C_in * C_local + A_local; (9) C_in * A_local + C_local; (10) (C_in - C_local) * F_blend + C_local; and (11) (C_in - C_local) * (1 - F_blend) + C_local. Another feature of the invention is that multiple TCUs can be serially coupled to enable additional texture colors to be combined in a single execution path.
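The eleven listed output functions can be written out as a small software model. This is only a sketch: it works on a single color channel normalized to floating point, it does no clamping (real hardware presumably would), and the mode names and the `tcu` and `chain` helpers are hypothetical labels standing in for the control signal.

```python
def tcu(mode, c_in, c_local, a_local, f_blend):
    """Model one TCU for a single color channel; `mode` plays the control signal."""
    functions = {
        "zero":        lambda: 0.0,
        "one":         lambda: 1.0,
        "in":          lambda: c_in,
        "local":       lambda: c_local,
        "add":         lambda: c_in + c_local,
        "sub":         lambda: c_in - c_local,
        "mul":         lambda: c_in * c_local,
        "mul_add_a":   lambda: c_in * c_local + a_local,
        "mul_a_add_c": lambda: c_in * a_local + c_local,
        "lerp":        lambda: (c_in - c_local) * f_blend + c_local,
        "lerp_inv":    lambda: (c_in - c_local) * (1.0 - f_blend) + c_local,
    }
    return functions[mode]()

def chain(stages, c_in=0.0):
    """Serially couple TCUs: each stage's output feeds the next stage's c_in."""
    for mode, c_local, a_local, f_blend in stages:
        c_in = tcu(mode, c_in, c_local, a_local, f_blend)
    return c_in

# Two textures: take the first as-is, then modulate it by the second.
print(chain([("local", 0.5, 1.0, 0.0), ("mul", 0.5, 1.0, 0.0)]))
```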
Abstract:
A texture compositing apparatus and method for combining multiple independent texture colors in a variety of ways in a single execution pass using a single texture compositing unit (TCU) per texture. The TCU receives a control signal, a blend factor, a local data signal (C_local / A_local), and an output data signal (C_in / A_in) generated by another TCU; the local data signal and the output data signal each represent a texture color in RGBA format. Based upon the control signal, the TCU can generate an output signal based on a variety of functions. The outputs that can be generated include but are not limited to: (1) zero; (2) one; (3) C_in; (4) C_local; (5) C_in + C_local; (6) C_in - C_local; (7) C_in * C_local; (8) C_in * C_local + A_local; (9) C_in * A_local + C_local; (10) (C_in - C_local) * F_blend + C_local; and (11) (C_in - C_local) * (1 - F_blend) + C_local. Another feature of the invention is that multiple TCUs can be serially coupled to enable additional texture colors to be combined in a single execution path.
Abstract:
A system for generating blend values for three-dimensional graphic rendering includes a first register, a second register, a third register, an index creation unit, a blend value generation unit and a blending unit. The first register receives and stores color pixel data, the second register receives and stores a depth perspective component, and the third register receives and stores fog color data. The output of the second register is coupled to the index creation unit, which uses the received depth perspective component to generate a two-part index. The two-part index is output by the index creation unit to produce a blend value. The first portion of the index is used to address a table in the blend value generation unit, and the second portion of the index is used to produce an increment value that is added to the output of the table, resulting in the creation of a blend value. The blend value, the color pixel data and the fog color data are then blended by the blending unit and output by the system. The invention also includes a method for generating a blend value and producing a blended color output. The method includes the steps of: producing an index having a first portion and a second portion from a distance value; determining a base value using the first portion of the index; determining a delta between the base value and the next entry in the table using the first portion of the index; determining a blend increment using the delta and the second portion of the index; producing a blend value by adding the base value to the blend increment; and blending the blend value with input pixel data.
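The method steps can be sketched as a table lookup with linear interpolation. The bit widths (`TABLE_BITS`, `FRAC_BITS`), the 0..255 blend range, the extra table entry that keeps "the next entry" in range, and the convention that the blend value weights the fog color are all assumptions for illustration; the abstract does not fix any of them.

```python
TABLE_BITS = 6   # assumed: first portion of the index selects one of 64 entries
FRAC_BITS = 8    # assumed: second portion is an 8-bit fraction between entries

def blend_value(table, index):
    """Base value from the table plus an interpolated increment."""
    first = index >> FRAC_BITS                 # first portion: table address
    second = index & ((1 << FRAC_BITS) - 1)    # second portion: fraction
    base = table[first]
    delta = table[first + 1] - base            # distance to the next entry
    increment = (delta * second) >> FRAC_BITS  # scale delta by the fraction
    return base + increment

def blend(color, fog, f):
    """Mix pixel color with fog color; f = 0 keeps the pixel, f = 255 is all fog."""
    return tuple((c * (255 - f) + g * f) // 255 for c, g in zip(color, fog))

# A linear example table with one extra entry so table[first + 1] always exists.
table = [min(255, i * 4) for i in range((1 << TABLE_BITS) + 1)]
f = blend_value(table, (2 << FRAC_BITS) | 128)   # halfway between entries 2 and 3
print(f, blend((200, 100, 50), (128, 128, 128), f))
```

Storing only the table and deriving the delta from adjacent entries keeps the table small while still giving a smooth blend value between entries.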
Abstract:
A system and method for enabling a graphics processor to operate with a CPU that reorders write instructions, without requiring expensive hardware and without significantly reducing the performance of the driver operating on the CPU. The invention allows the graphics processor to evaluate the data sent to it by software running on the CPU in its intended and proper order, even if the CPU transmits the data to the graphics processor in an order different from that generated by the software. The invention works regardless of the particular write reordering technique used by the CPU, and is a very low-cost addition to the graphics processor, requiring only a few registers and a small state machine. The invention identifies the number of "holes" in the reordered write instructions; when the number of holes becomes zero, a set of received data is made available for execution by the graphics processor.
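One way to picture the hole-counting idea is the sketch below. It is an illustration under stated assumptions, not the patented circuit: it assumes the software writes to consecutive addresses, that each address is written exactly once, and that a "hole" is an address below the highest one seen so far that has not yet arrived. The class name `HoleTracker` and its fields are hypothetical.

```python
class HoleTracker:
    """Track out-of-order writes with two registers and a hole counter."""

    def __init__(self):
        self.next_expected = 0  # one past the highest address seen so far
        self.holes = 0          # addresses below next_expected not yet written
        self.released = 0       # addresses below this may be executed

    def write(self, addr):
        if addr >= self.next_expected:
            # Skipping ahead leaves every address in between as a hole.
            self.holes += addr - self.next_expected
            self.next_expected = addr + 1
        else:
            # Assumed: a write below the high-water mark fills exactly one hole.
            self.holes -= 1
        if self.holes == 0:
            # No gaps remain, so everything received so far is in order.
            self.released = self.next_expected
        return self.released

tracker = HoleTracker()
for addr in [0, 2, 1, 3]:
    print(addr, tracker.write(addr))
```

The sketch keeps no per-address bitmap, only counters, which is consistent with the abstract's claim that a few registers and a small state machine suffice.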