摘要:
One embodiment of the present invention sets forth a technique for using a multi-bank register file that reduces the size of or eliminates a switch and/or staging registers that are used to gather input operands for instructions. Each function unit input may be directly connected to one bank of the multi-bank register file with neither a switch nor a staging register. A compiler or register allocation unit ensures that the register file accesses for each instruction are conflict-free (no instruction can access the same bank more than once in the same cycle). The compiler or register allocation unit may also ensure that the register file accesses for each instruction are also aligned (each input of a function unit can only come from the bank connected to that input).
摘要:
One embodiment of the present invention sets forth a technique for using a multi-bank register file that reduces the size of or eliminates a switch and/or staging registers that are used to gather input operands for instructions. Each function unit input may be directly connected to one bank of the multi-bank register file with neither a switch nor a staging register. A compiler or register allocation unit ensures that the register file accesses for each instruction are conflict-free (no instruction can access the same bank more than once in the same cycle). The compiler or register allocation unit may also ensure that the register file accesses for each instruction are also aligned (each input of a function unit can only come from the bank connected to that input).
摘要:
The grid walk sampling technique is an efficient sampling algorithm aimed at optimizing the cost of triangle rasterization for modern graphics workloads. Grid walk sampling is an iterative rasterization algorithm that intelligently tests the intersection of triangle edges with multi-cell grids, determining coverage for a grid cell while identifying other cells in the grid that are either fully covered or fully uncovered by the triangle. Grid walk sampling rasterizes triangles using fewer computations and simpler computations compared with conventional highly parallel rasterizers. Therefore, a rasterizer employing grid walk sampling may compute sample coverage of triangles more efficiently in terms of power and circuitry die area compared with conventional highly parallel rasterizers.
摘要:
A technique for caching coverage information for edges that are shared between adjacent graphics primitives may reduce the number of times a shared edge is rasterized. Consequently, power consumed during rasterization may be reduced. During rasterization of a first graphics primitive coverage information is generated that (1) indicates cells within a sampling grid that are entirely outside an edge of the first graphics primitive and (2) indicates cells within the sampling grid that are intersected by the edge and are only partially covered by the first graphics primitive. The coverage information for the edge is stored in a cache. When a second graphics primitive is rasterized that shares the edge with the first graphics primitive, the coverage information is read from the cache instead of being recomputed.
摘要:
A technique for caching coverage information for edges that are shared between adjacent graphics primitives may reduce the number of times a shared edge is rasterized. Consequently, power consumed during rasterization may be reduced. During rasterization of a first graphics primitive coverage information is generated that (1) indicates cells within a sampling grid that are entirely outside an edge of the first graphics primitive and (2) indicates cells within the sampling grid that are intersected by the edge and are only partially covered by the first graphics primitive. The coverage information for the edge is stored in a cache. When a second graphics primitive is rasterized that shares the edge with the first graphics primitive, the coverage information is read from the cache instead of being recomputed.