Abstract:
A graphics processor and method for executing a graphics program as a plurality of threads, where each sample to be processed by the program is assigned to a thread. Although threads share processing resources within the programmable graphics processor, the execution of each thread can proceed independently of any other thread. For example, instructions in a second thread are scheduled for execution while execution of instructions in a first thread is stalled waiting for source data. Consequently, a first received sample (assigned to the first thread) may be processed after a second received sample (assigned to the second thread). A benefit of independently executing each thread is improved performance because a stalled thread does not prevent the execution of other threads.
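The out-of-order behavior described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the single issue slot per cycle, the one-instruction threads, and the names `schedule` and `ready_cycle` are hypothetical and are not taken from the patent.

```python
def schedule(threads):
    """Return sample ids in the order their threads execute.

    threads: list of (sample_id, ready_cycle) in arrival order, where
    ready_cycle is the cycle at which the thread's source data arrives
    (hypothetical model: one instruction per thread, one issue per cycle).
    """
    pending = list(threads)
    order = []
    cycle = 0
    while pending:
        # Threads whose source data has arrived are eligible to issue.
        ready = [t for t in pending if t[1] <= cycle]
        if ready:
            chosen = ready[0]          # oldest ready thread wins the slot
            pending.remove(chosen)
            order.append(chosen[0])
        # A stalled thread simply waits; it never blocks a ready one.
        cycle += 1
    return order


# The first sample stalls until cycle 3, so the second sample runs first.
completion = schedule([("sample0", 3), ("sample1", 0)])
```

In this model the first received sample finishes after the second, which is the reordering the abstract describes.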
Abstract:
A graphics pipeline system and associated method are provided with an integrated clipping operation. First included is a transform module positioned on a single semiconductor platform for transforming graphics data from a first space to a second space. Also provided is a lighting module positioned on the same single semiconductor platform as the transform module. The lighting module is adapted for performing lighting operations on the graphics data. A clipping operation is also performed utilizing the single semiconductor platform.
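The abstract does not say how the clipping operation is carried out. As a hedged illustration, the sketch below classifies a line segment with outcodes, a common first step of clipping. It assumes vertices already in homogeneous clip space and an OpenGL-style clip volume (-w <= x, y, z <= w); the function names are hypothetical.

```python
def outcode(v):
    """Bitmask of the clip planes that vertex v = (x, y, z, w) lies outside.

    Assumes the clip volume -w <= x, y, z <= w (an assumption, not stated
    in the abstract).
    """
    x, y, z, w = v
    code = 0
    if x < -w: code |= 1
    if x > w:  code |= 2
    if y < -w: code |= 4
    if y > w:  code |= 8
    if z < -w: code |= 16
    if z > w:  code |= 32
    return code


def classify_segment(a, b):
    """Decide whether a segment needs clipping at all."""
    code_a, code_b = outcode(a), outcode(b)
    if code_a == 0 and code_b == 0:
        return "accept"   # both endpoints inside the volume
    if code_a & code_b:
        return "reject"   # both endpoints outside the same plane
    return "clip"         # segment may cross a plane; must be clipped
```

Only segments classified as `"clip"` would need the more expensive intersection step.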
Abstract:
A graphics pipeline system is provided with an integrated scissor operation. First provided is a transform module adapted for being coupled to a buffer to receive graphics data therefrom. Such transform module is positioned on a single semiconductor platform for transforming the graphics data from a first space to a second space. Associated therewith is a lighting module coupled to the transform module and positioned on the same single semiconductor platform as the transform module for performing lighting operations on the graphics data received from the transform module. A scissor operation is performed on the same single semiconductor platform as the transform module and the lighting module.
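A scissor operation, in the usual sense, discards everything outside a screen-space rectangle. The following is a minimal sketch of that idea, assuming integer pixel coordinates and a rectangle given as `(x0, y0, width, height)`; these conventions and names are illustrative and do not come from the patent.

```python
def scissor_test(x, y, rect):
    """True if pixel (x, y) lies inside the scissor rectangle.

    rect = (x0, y0, width, height); the right and bottom edges are
    exclusive (an assumed convention).
    """
    x0, y0, width, height = rect
    return x0 <= x < x0 + width and y0 <= y < y0 + height


def apply_scissor(pixels, rect):
    """Keep only the pixels that pass the scissor test."""
    return [(x, y) for (x, y) in pixels if scissor_test(x, y, rect)]
```

For example, with `rect = (10, 10, 4, 4)` the pixel `(13, 13)` is kept and `(14, 10)` is discarded.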
Abstract:
A method, apparatus and article of manufacture are provided for sequencing graphics processing in a transform or lighting operation. A plurality of mode bits are first received which are indicative of the status of a plurality of modes of process operations. A plurality of addresses are then identified in memory based on the mode bits. Such addresses are then accessed in the memory for retrieving code segments which each are adapted to carry out the process operations in accordance with the status of the modes. The code segments are subsequently executed within a transform or lighting module for processing vertex data.
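The sequencing steps above (mode bits, then addresses, then code segments) can be sketched as a table-driven dispatcher. Everything concrete here is an assumption made for illustration: the two mode flags, the addresses, and the segment functions are hypothetical, and the "memory" is modeled as a Python dictionary.

```python
# Hypothetical mode bits; the abstract does not name specific modes.
MODE_FOG = 0b01
MODE_LIGHT = 0b10

# Hypothetical code segments, each handling one process operation.
def seg_base(vertex):
    return dict(vertex, transformed=True)

def seg_fog(vertex):
    return dict(vertex, fog=True)

def seg_light(vertex):
    return dict(vertex, lit=True)

# "Memory": address -> code segment stored at that address.
MEMORY = {0x00: seg_base, 0x10: seg_fog, 0x20: seg_light}

# Mode bit -> address of the segment that implements that mode.
ADDRESS_TABLE = {MODE_FOG: 0x10, MODE_LIGHT: 0x20}


def addresses_for(mode_bits):
    """Identify the addresses to fetch, based on which mode bits are set."""
    addresses = [0x00]  # base segment always runs (an assumption)
    for bit, address in ADDRESS_TABLE.items():
        if mode_bits & bit:
            addresses.append(address)
    return addresses


def process_vertex(vertex, mode_bits):
    """Fetch each selected code segment and execute it on the vertex."""
    for address in addresses_for(mode_bits):
        vertex = MEMORY[address](vertex)
    return vertex
```

The point of the design is that disabled modes cost nothing at run time, since their segments are never fetched.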
Abstract:
A method, apparatus and article of manufacture are provided for managing vertex data in a vertex buffer. First, vertex data is received and stored in the vertex buffer. Thereafter, the vertex data is outputted from the vertex buffer to a processing module. During operation, a plurality of command bits is passed from the vertex buffer for determining a manner in which the vertex data is inputted and processed in the input buffer of the processing module. Such command bits are received from a command bit source. Further, a plurality of mode bits indicative of a status of a plurality of modes of process operations is passed. Such mode bits are received from a mode bit source. The mode bits are adapted for determining a manner in which the vertex data is processed in the processing module.
Abstract:
A graphics pipeline system is provided for graphics processing. Such system includes a transform module adapted for being coupled to a vertex attribute buffer for receiving vertex data. The transform module serves to transform the vertex data from object space to screen space. Coupled to the transform module is a lighting module which is positioned on a single semiconductor platform for performing lighting operations on the vertex data received from the transform module. Also included is a rasterizer coupled to the lighting module and positioned on the single semiconductor platform for rendering the vertex data received from the lighting module.
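The object-space to screen-space transform mentioned above is commonly done with a 4x4 matrix multiply, a perspective divide, and a viewport mapping. The sketch below follows that conventional recipe; the combined matrix `mvp`, the y-down screen convention, and the function name are assumptions rather than details from the patent.

```python
def to_screen(vertex, mvp, width, height):
    """Map a homogeneous object-space vertex to pixel coordinates.

    vertex: (x, y, z, w); mvp: 4x4 row-major matrix (assumed to combine
    model, view, and projection transforms).
    """
    # Matrix multiply: object space -> clip space.
    x, y, z, w = (sum(mvp[r][c] * vertex[c] for c in range(4)) for r in range(4))
    # Perspective divide: clip space -> normalized device coordinates.
    ndc_x, ndc_y = x / w, y / w
    # Viewport mapping: [-1, 1] -> pixels, with y growing downward (assumed).
    screen_x = (ndc_x + 1) * 0.5 * width
    screen_y = (1 - ndc_y) * 0.5 * height
    return screen_x, screen_y


IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
center = to_screen((0, 0, 0, 1), IDENTITY, 640, 480)
```

With the identity matrix, the object-space origin lands at the center of a 640 by 480 viewport.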
Abstract:
An embodiment of a computing system is configured to process data using a multithreaded SIMD architecture that includes heterogeneous processing engines to execute a program. The program is constructed of various program instructions. A first type of the program instructions can only be executed by a first type of processing engine and a second type of program instructions can only be executed by a second type of processing engine. A third type of program instructions can be executed by the first and the second type of processing engines. An instruction dispatcher is configured to identify and remove program instruction execution conflicts for the heterogeneous processing engines to improve instruction execution throughput.
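The abstract does not specify how the dispatcher removes conflicts. One plausible policy, shown here purely as an assumption, is to place engine-restricted instructions first and then fill any free engine with instructions that can run on either type. The two-engine setup and all names are hypothetical.

```python
def dispatch(instructions):
    """Assign instructions to two engines for one issue cycle.

    instructions: list of (name, kind), kind in {"first", "second", "either"}.
    Returns {"first": name or None, "second": name or None}.
    Assumed policy: restricted instructions claim their engine first, so a
    flexible instruction never takes the only engine another one needs.
    """
    slots = {"first": None, "second": None}
    # Pass 1: instructions that can run on exactly one engine type.
    for name, kind in instructions:
        if kind in slots and slots[kind] is None:
            slots[kind] = name
    # Pass 2: flexible instructions fill whichever engine is still free.
    for name, kind in instructions:
        if kind == "either":
            for engine in slots:
                if slots[engine] is None:
                    slots[engine] = name
                    break
    return slots
```

For instance, `dispatch([("x", "either"), ("y", "first")])` places `y` on the first engine and moves `x` to the second, whereas a naive in-order assignment would give `x` the first engine and leave `y` waiting.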
Abstract:
One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction-by-instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.
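The abstract says only that a per-warp credit contributes to a weight used for selection. The sketch below shows one simple way such a scheme could keep a group of warps advancing uniformly. The specific rules (weight equal to remaining credit, one credit spent per issue, refill when all credits reach zero) are assumptions for illustration, not the patented method.

```python
def issue_order(warp_ids, credits_per_round, n_issues):
    """Return the sequence of warps selected over n_issues issue slots.

    Assumed rules: each warp starts with credits_per_round credits, the
    warp with the highest weight issues next, issuing costs one credit,
    and credits are refilled once every warp has run out.
    """
    credits = {w: credits_per_round for w in warp_ids}
    order = []
    for _ in range(n_issues):
        if all(c == 0 for c in credits.values()):
            credits = {w: credits_per_round for w in warp_ids}
        # Weight is just the remaining credit here (assumed); ties go to
        # the warp listed first in warp_ids.
        chosen = max(warp_ids, key=lambda w: credits[w])
        credits[chosen] -= 1
        order.append(chosen)
    return order
```

Because a warp that has just issued has fewer credits than its peers, no warp can run far ahead of the rest of the group.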
Abstract:
A system, method and computer program product are provided for bump mapping in a hardware graphics processor. Initially, a first set of texture coordinates is received. The texture coordinates are then multiplied by a matrix to generate results. A second set of texture coordinates is then offset utilizing the results. The offset second set of texture coordinates is then mapped to color.
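The four steps in this abstract can be sketched directly: multiply the first coordinate set by a matrix, add the result to the second set, and use the offset coordinates to look up a color. The 2x2 matrix shape, nearest-texel sampling with clamping, and the function name are assumptions made to keep the example self-contained.

```python
def bump_lookup(first_st, second_st, matrix, texture):
    """Offset second_st by (matrix * first_st) and fetch a color.

    first_st, second_st: (s, t) pairs; matrix: 2x2 nested list (assumed
    shape); texture: 2D list of colors indexed [row][col], addressed with
    coordinates in [0, 1).
    """
    s, t = first_st
    # Step 2: multiply the first coordinate set by the matrix.
    ds = matrix[0][0] * s + matrix[0][1] * t
    dt = matrix[1][0] * s + matrix[1][1] * t
    # Step 3: offset the second coordinate set with the results.
    u = second_st[0] + ds
    v = second_st[1] + dt
    # Step 4: map to color via nearest texel, clamped to the edges (assumed).
    height, width = len(texture), len(texture[0])
    col = min(max(int(u * width), 0), width - 1)
    row = min(max(int(v * height), 0), height - 1)
    return texture[row][col]


TEXTURE = [["red", "green"], ["blue", "white"]]
```

With a zero matrix the lookup is unperturbed; a nonzero matrix shifts which texel is fetched, which is what produces the bump effect.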
Abstract:
A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.
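The data flow described here (rasterizer to pixel distribution logic to a cluster, then through a crossbar to a frame buffer partition) can be modeled in a few lines. The routing rules below are assumptions: the abstract does not say how a cluster or partition is chosen, so this sketch uses a tile-based cluster choice and an address-interleaved partition choice, and every name is hypothetical.

```python
def cluster_for(x, y, n_clusters, tile=16):
    """Pixel distribution: pick a cluster from the pixel's tile (assumed rule)."""
    return ((x // tile) + (y // tile)) % n_clusters


def partition_for(x, y, width, n_partitions):
    """Crossbar routing: interleave linear pixel addresses (assumed rule)."""
    return (y * width + x) % n_partitions


def render(covered_pixels, width, n_clusters, n_partitions, shade):
    """Route covered pixels to clusters, shade them, store via the crossbar.

    shade(cluster, x, y) stands in for the pixel shader program running on
    the chosen cluster. Returns one {(x, y): color} dict per partition.
    """
    partitions = [dict() for _ in range(n_partitions)]
    for x, y in covered_pixels:
        cluster = cluster_for(x, y, n_clusters)
        color = shade(cluster, x, y)
        partitions[partition_for(x, y, width, n_partitions)][(x, y)] = color
    return partitions
```

The two mappings are independent, so any cluster can write to any partition, which is the role the crossbar plays in the described architecture.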