摘要:
A system and method for performing zero-bandwidth-clears reduces external memory accesses by a graphics processor when performing clears and subsequent read operations. A set of clear values is stored in the graphics processor. Each portion of a color or z buffer may be configured using a zero-bandwidth-clear command to reference a clear value without writing the external memory. The clear value is provided to a requestor without accessing the external memory when a read access is performed.
摘要:
A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of receiving a common input stream, tracking a periodic event associated with the common input stream, generating a plurality of fragment streams from the common input stream, inserting a marker based on an occurrence of the periodic event in a first fragment stream in the multiple fragment streams, and utilizing the marker to influence the processing of the first fragment stream so that a plurality of raster operation (ROP) request streams maintains substantially the same coherence as the common input stream. Each fragment stream is independently processed and corresponds to one of the ROP request streams.
摘要:
Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint on a display screen to a group of contiguous physical memory locations in a memory system, determining an anchor physical memory address from a first transaction associated with the footprint, wherein the anchor physical memory address corresponds to an anchor in the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits (LSBs) associated with the second transaction, and combining the anchor physical memory address with the set of LSBs associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
摘要:
Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint in screen space to a group of contiguous physical memory locations in a memory system, determining a first physical memory address for a first transaction associated with the footprint, wherein the first physical memory address is within the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits associated with the second transaction, and combining a portion of the first physical memory address with the set of least significant bits associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
摘要:
A system and method for dynamically adjusting the pixel sampling rate during primitive shading can improve image quality or increase shading performance. Hybrid antialiasing is performed by selecting a number of shaded samples per pixel fragment. A combination of supersample and multisample antialiasing is used where a cluster of sub-pixel samples (multisamples) is processed for each pass through a fragment shader pipeline. The number of shader passes and multisamples in each cluster can be determined dynamically for each primitive based on rendering state.
摘要:
A method and system are disclosed for antialiased rendering a plurality of pixels in a computer system. The method and system comprise providing a fixed storage area and providing a plurality of sequential format levels for the plurality of pixels within the fixed storage area. The plurality of format levels represent pixels with varying degrees of complexity in subpixel geometry visible within the pixel. A system and method in accordance with the present invention provides at least the following format levels: one-fragment format, used when one surface fully covers a pixel; two-fragment format, used when two surfaces together cover a pixel; and multisample format, used when three or more surfaces cover a pixel. The method and system further comprise storing the plurality of pixels at a lowest appropriate format level within the fixed storage area, so that a minimum amount of data is transferred to and from the fixed storage area. The method and system further comprise procedures for converting pixels from one format level to take into account newly rendered pixel fragments. All formats represent depth values in a consistent manner so that fragments rendered during later rendering passes match depth values resulting from rendering the same primitive in earlier passes. Thus, the invention enables high-quality antialiasing with minimal data transferred to and from the fixed storage area, while supporting multi-pass rendering.
摘要:
A method and system for selective enablement of tile compression. The method includes receiving a graphics primitive for processing in a set-up unit of a graphics processor and determining a primitive characteristic that indicates a probability of whether a final compression of a tile related to the primitive will be retained. Compression for the tile related to the primitive is allowed when the characteristic indicates the final compression will be retained. Compression for the tile related to the primitive is disallowed in the characteristic indicates the final compression will not be retained.
摘要:
Write operations to a unit of compressible memory, known as a compression tile, are examined to see if data blocks to be written completely cover a single compression tile. If the data blocks completely cover a single compression tile, the write operations are coalesced into a single write operation and the single compression tile is overwritten with the data blocks. Coalescing multiple write operations into a single write operation improves performance, because it avoids the read-modify-write operations that would otherwise be needed.
摘要:
A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. A crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions.
摘要:
One embodiment of the present invention sets forth a technique for rendering graphics primitives in parallel while maintaining the API primitive ordering. Multiple, independent geometry units perform geometry processing concurrently on different graphics primitives. A primitive distribution scheme delivers primitives concurrently to multiple rasterizers at rates of multiple primitives per clock while maintaining the primitive ordering for each pixel. The multiple, independent rasterizer units perform rasterization concurrently on one or more graphics primitives, enabling the rendering of multiple primitives per system clock.