Abstract:
A scalable shader architecture is disclosed. In accord with that architecture, a shader includes multiple shader pipelines, each of which can perform processing operations on rasterized pixel data. Shader pipelines can be functionally removed as required, thus preventing a defective shader pipeline from causing a chip rejection. The shader includes a shader distributor that processes rasterized pixel data and then selectively distributes the processed rasterized pixel data to the various shader pipelines, beneficially in a manner that balances workloads. A shader collector formats the outputs of the various shader pipelines into proper order to form shaded pixel data. A shader instruction processor (scheduler) programs the individual shader pipelines to perform their intended tasks. Each shader pipeline has a shader gatekeeper that interacts with the shader distributor and with the shader instruction processor such that pixel data that passes through the shader pipelines is controlled and processed as required.
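The distribute-and-collect flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Pipeline` class, the `enabled` flag, the least-loaded selection rule, and the per-segment sequence numbers are hypothetical names and choices, not details taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Hypothetical stand-in for one shader pipeline."""
    enabled: bool = True                       # False models a functionally removed pipeline
    queued: list = field(default_factory=list)  # (sequence_number, data) pairs


def distribute(segments, pipelines):
    """Send each segment to the least-loaded enabled pipeline.

    Least-loaded selection is an assumed balancing policy; the abstract only
    says workloads are balanced.
    """
    active = [p for p in pipelines if p.enabled]
    if not active:
        raise ValueError("no enabled shader pipelines")
    for seq, data in enumerate(segments):
        target = min(active, key=lambda p: len(p.queued))
        target.queued.append((seq, data))


def collect(pipelines):
    """Gather all pipeline outputs and restore the original segment order."""
    results = [item for p in pipelines for item in p.queued]
    results.sort(key=lambda item: item[0])
    return [data for _, data in results]
```

A disabled pipeline simply never receives work, so the remaining pipelines absorb its share and the collector still emits segments in their original order.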
Abstract:
A new, useful, and non-obvious shader processor architecture has a shader register file that acts both as an internal storage register file for temporarily storing data within the shader processor and as a First-In First-Out (FIFO) buffer for a subsequent module. Some embodiments include automatic, programmable hardware conversion between numeric formats, for example, between floating point data and fixed point data.
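As a rough software analogy for the floating point to fixed point conversion mentioned above, the sketch below converts to and from a signed fixed point value. The 16-bit width, the 8 fractional bits, and the saturating behavior are assumptions chosen for illustration; the abstract does not specify a format.

```python
def float_to_fixed(value, frac_bits=8, total_bits=16):
    """Convert a float to a signed fixed point integer, saturating on overflow.

    frac_bits and total_bits are assumed parameters, not values from the source.
    Python's round() rounds halves to the nearest even integer.
    """
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def fixed_to_float(raw, frac_bits=8):
    """Convert a signed fixed point integer back to a float."""
    return raw / (1 << frac_bits)
```

With 8 fractional bits, `float_to_fixed(1.5)` gives 384, and values outside the representable range clamp to the nearest limit rather than wrapping.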
Abstract:
A fragment processor includes a fragment shader distributor, a fragment shader collector, and a plurality of fragment shader pipelines. Each fragment shader pipeline executes a fragment shader program on a segment of fragments. The plurality of fragment shader pipelines operate in parallel, executing the same or different fragment shader programs. The fragment shader distributor receives a stream of fragments from a rasterization unit and dispatches a portion of the stream of fragments to a selected fragment shader pipeline until the capacity of the selected fragment shader pipeline is reached. The fragment shader distributor then selects another fragment shader pipeline. The capacity of each of the fragment shader pipelines is limited by several different resources. As the fragment shader distributor dispatches fragments, it tracks the remaining available resources of the selected fragment shader pipeline. A fragment shader collector retrieves processed fragments from the plurality of fragment shader pipelines.
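The dispatch policy described above (fill the selected pipeline until one of its resources runs out, then move on) can be sketched as follows. The two tracked resources, a fragment count and a register budget, and all class and function names are hypothetical; the abstract says only that capacity is limited by several different resources.

```python
from dataclasses import dataclass, field


@dataclass
class FragmentPipeline:
    """Hypothetical pipeline model with two assumed resource limits."""
    max_fragments: int
    max_registers: int
    fragments: list = field(default_factory=list)
    registers_used: int = 0

    def can_accept(self, regs_needed):
        # Capacity is reached as soon as any one resource would be exceeded.
        return (len(self.fragments) < self.max_fragments
                and self.registers_used + regs_needed <= self.max_registers)

    def accept(self, fragment, regs_needed):
        self.fragments.append(fragment)
        self.registers_used += regs_needed


def dispatch(stream, pipelines, regs_per_fragment):
    """Feed the selected pipeline until it is full, then select the next one."""
    selected = 0
    for fragment in stream:
        tried = 0
        while not pipelines[selected].can_accept(regs_per_fragment):
            selected = (selected + 1) % len(pipelines)
            tried += 1
            if tried == len(pipelines):
                # Real hardware would presumably stall; raising keeps the sketch simple.
                raise RuntimeError("all fragment shader pipelines are full")
        pipelines[selected].accept(fragment, regs_per_fragment)
```

In this model the first pipeline can stop accepting work because its register budget is exhausted even though it still has fragment slots free, which is the reason the distributor tracks each resource separately.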
Abstract:
A pixel center position that is not covered by a primitive covering a portion of the pixel is displaced to lie within a fragment formed by the intersection of the primitive and the pixel. X,y coordinates of a pixel center are adjusted to displace the pixel center position to lie within the fragment, affecting actual texture map coordinates or barycentric weights. Alternatively, a centroid sub-pixel sample position is determined based on coverage data for the pixel and a multisample mode. The centroid sub-pixel sample position is used to compute pixel or sub-pixel parameters for the fragment.
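One way to picture the centroid alternative is to average the positions of the covered sub-pixel samples. The sketch below assumes a hypothetical 4x sample pattern and a plain arithmetic mean; the abstract states only that the position is determined from the coverage data and the multisample mode, so hardware may well use a lookup table instead.

```python
# Hypothetical 4x multisample pattern: (x, y) offsets inside a unit pixel.
SAMPLE_POSITIONS_4X = [
    (0.375, 0.125),
    (0.875, 0.375),
    (0.125, 0.625),
    (0.625, 0.875),
]


def centroid_position(coverage_mask, positions=SAMPLE_POSITIONS_4X):
    """Return a sample position lying within the covered part of the pixel.

    Bit i of coverage_mask is set when sample i is covered by the primitive.
    Returns None when no sample is covered.
    """
    covered = [pos for i, pos in enumerate(positions) if coverage_mask & (1 << i)]
    if not covered:
        return None
    if len(covered) == len(positions):
        return (0.5, 0.5)  # fully covered: the pixel center needs no displacement
    n = len(covered)
    return (sum(x for x, _ in covered) / n, sum(y for _, y in covered) / n)
```

Because the intersection of a triangle and a pixel is convex, the mean of covered sample positions stays inside the fragment, so parameters evaluated there are not extrapolated outside the primitive.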
Abstract:
A system and method control the scheduling of program instructions included in a shader program for execution by a processing pipeline. One or more fence instructions may be inserted into the shader program. Each fence instruction specifies a constraint that is applied to control the scheduling of another program instruction in the shader program. Controlling the scheduling of program instructions for execution by the processing pipeline may result in a more efficient use of computing resources and improved performance.
Abstract:
Methods and systems for texture mapping in a computer-implemented graphics pipeline are described. A sample group is identified as including a divergent pixel. A determination is made whether an operand of an instruction executing on the divergent pixel satisfies a condition. A scheme for determining a level of detail for the texture mapping is selected depending on whether or not the condition is satisfied.
Abstract:
A method and apparatus for operating a shader having multiple texture or shader processing stations are described. The method includes feeding the output of a texture or shader processing station directly into the input of another texture or shader processing station. Further, only a subset of the processing stations has access to a shader register file.
Abstract:
A method of optimizing perspective correction computations to be executed in a programmable fragment shader includes: identifying a sequence of program instructions; determining whether the sequence of program instructions can be optimized based on the status of the bit; sourcing one or more interpolated texture map coordinates to thereby disable the perspective correction computation comprising division by (1/w); and enabling the optimized execution of one of a plurality of perspective computation functions by a sought operation in a shader unit without division of the interpolated texture map coordinates by (1/w). The optimized function includes cube mapping, projective mapping, normalization, or scaling-invariant operations.
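The reason the division by (1/w) can be skipped for scaling-invariant operations is that multiplying a coordinate vector by a positive constant changes neither its direction nor its dominant axis. The sketch below illustrates this with two hypothetical helpers, `normalize` and `dominant_axis`; it assumes (1/w) is positive and is not taken from the disclosure.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dominant_axis(v):
    """Return the signed axis with the largest magnitude, e.g. '-y'.

    Selecting a face by the largest-magnitude component is the usual way a
    direction vector picks one of six axis-aligned faces.
    """
    axis = max(range(3), key=lambda i: abs(v[i]))
    sign = "+" if v[axis] >= 0 else "-"
    return sign + "xyz"[axis]


def perspective_correct(interpolated, one_over_w):
    """Full perspective correction: divide each interpolated value by (1/w)."""
    return tuple(c / one_over_w for c in interpolated)
```

Applying `normalize` or `dominant_axis` to the raw interpolated values gives the same result as applying them after `perspective_correct`, so the division is redundant for these operations.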