摘要:
Prescient cache management methods and systems are disclosed. In one embodiment, a local cache that operates within a raster engine operations stage of a graphics rendering pipeline is managed by following a number of caching decisions related to a number of cached tiles. Each of these cached tiles has a certain priority to remain in the local cache, with the priority corresponding to a conflict type received from a buffer operating within a pre-raster engine operations stage of the graphics rendering pipeline.
摘要:
Prescient cache management methods and systems are disclosed. In one embodiment, within a pre-raster engine operations stage in a graphics rendering pipeline, tile entries are stored in a buffer. Each of these tile entries is related a transaction request that enters the pre-raster engine operations stage and has a screen coordinates field and a conflict field. If this buffer includes a first tile entry, which is related to a first transaction request associated with a first tile, and a second tile entry, which is related to a second transaction request that enters the pre-raster engine operations stage after the first transaction request and is also associated with the first tile, the conflict field of the first tile entry is updated with a conflict type that reflects a number of tile entries between the first tile entry and the second tile entry.
摘要:
A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of receiving a common input stream, tracking a periodic event associated with the common input stream, generating a plurality of fragment streams from the common input stream, inserting a marker based on an occurrence of the periodic event in a first fragment stream in the multiple fragment streams, and utilizing the marker to influence the processing of the first fragment stream so that a plurality of raster operation (ROP) request streams maintains substantially the same coherence as the common input stream. Each fragment stream is independently processed and corresponds to one of the ROP request streams.
摘要:
Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint on a display screen to a group of contiguous physical memory locations in a memory system, determining an anchor physical memory address from a first transaction associated with the footprint, wherein the anchor physical memory address corresponds to an anchor in the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits (LSBs) associated with the second transaction, and combining the anchor physical memory address with the set of LSBs associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
摘要:
An event occurring in a graphics pipeline is detected and counted at the location of its occurrence using an event detector and a local counter. The event count maintained by the local counter is reported asynchronously to a global counter. The global counter is configured to be of higher precision than the local counter and is positioned at a place that is convenient for reporting the events, e.g., at the end of the graphics pipeline.
摘要:
Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint in screen space to a group of contiguous physical memory locations in a memory system, determining a first physical memory address for a first transaction associated with the footprint, wherein the first physical memory address is within the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits associated with the second transaction, and combining a portion of the first physical memory address with the set of least significant bits associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
摘要:
A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
摘要:
A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
摘要:
A system and method for dynamically adjusting the pixel sampling rate during primitive shading can improve image quality or increase shading performance. Hybrid antialiasing is performed by selecting a number of shaded samples per pixel fragment. A combination of supersample and multisample antialiasing is used where a cluster of sub-pixel samples (multisamples) is processed for each pass through a fragment shader pipeline. The number of shader passes and multisamples in each cluster can be determined dynamically for each primitive based on rendering state.
摘要:
A system and method for dynamically adjusting the pixel sampling rate during primitive shading can improve image quality or increase shading performance. Hybrid antialiasing is performed by selecting a number of shaded samples per pixel fragment. A combination of supersample and multisample antialiasing is used where a cluster of sub-pixel samples (multisamples) is processed for each pass through a fragment shader pipeline. The number of shader passes and multisamples in each cluster can be determined dynamically for each primitive based on rendering state.