Abstract:
In position-only shading, two geometry pipes exist, a trimmed down version called the Cull Pipe and a full version called the Replay Pipe. Thus, the Cull Pipe executes the position shaders in parallel with the main application, but typically generates the critical results much faster as it fetches and shades only the position attribute of the vertices and avoids the rasterization as well as the rendering of pixels for the frame buffer. Furthermore, the Cull Pipe uses these critical results to compute visibility information for all the triangles whether they are culled or not. On the other hand, the Replay Pipe consumes the visibility information to skip the culled triangles and shades only the visible triangles that are finally passed to the rasterization phase. Together the two pipes can hide the long cull runs of discarded triangles and can complete the work faster in some embodiments.
Abstract:
By packing the depth data in a way that is independent of the number of samples, so that memory bandwidth is the same regardless of the number of samples, higher numbers of samples per pixel may be used without adversely affecting buffer cost. In some embodiments, the number of pixels per clock in a first level depth test may be increased by operating in the pixel domain, whereas previous solutions operated at the sample level.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.
Abstract:
An apparatus and method are described for performing efficient depth test operations. For example, an apparatus in accordance with one embodiment comprises: a depth cache to store a plurality of cache lines containing depth data to be used for graphics processing operations; depth test logic to determine a current depth test function associated with a read operation and to read a cache line from a depth cache while there are still outstanding writes to the cache line if the read operation and write operation are associated with the same depth test function, the depth test logic to perform a first depth test using the data read from the cache line, the first depth test to fail or pass pixels based on a predicted range of depth values.
Abstract:
One or more apparatus and method for multi-pixel/sample level depth testing in a graphics processor is described. In embodiments, a bounding-box of variable size over which a depth test is to be performed is determined based on the pattern of lit pixels or samples within rasterizer tile. A multi-corner depth test may be performed between a source depth data plane and a destination depth plane within a source depth data bound where destination depth data is continuous within the source data bound. A range-based depth test may be performed in response to the destination data being discontinuous. Source depth data prevailing in the depth test may be stored in a compressed plane equation format in response to the source data being continuous within the source data bound, and may be stored as min/max depth data if discontinuous.