Abstract:
In accordance with some embodiments, domain shader and/or tessellator operations can be eliminated when they are redundant. By using a corner cache, a check can determine whether a given corner, be it a vertex or a quadrilateral corner, has already been evaluated in the domain shader and/or tessellator, and if so, the result of the previous operation can be reused instead of performing unnecessary invocations that may increase power consumption or reduce speed.
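The corner-cache check described above can be illustrated with a minimal sketch. Everything named here (CornerCache, example_shader, the integer corner keys) is hypothetical and chosen for illustration; the abstract does not specify how corners are keyed or how the cache is organized.

```python
from typing import Callable, Dict, List, Tuple

Corner = Tuple[int, int]  # hypothetical integer key identifying a corner
Position = Tuple[float, float, float]


class CornerCache:
    """Memoizes per-corner shader results (illustrative assumption only)."""

    def __init__(self, domain_shader: Callable[[Corner], Position]) -> None:
        self._shader = domain_shader
        self._results: Dict[Corner, Position] = {}
        self.invocations = 0  # counts real shader evaluations

    def evaluate(self, corner: Corner) -> Position:
        cached = self._results.get(corner)
        if cached is not None:
            # Corner was evaluated before: reuse the earlier result.
            return cached
        self.invocations += 1
        result = self._shader(corner)
        self._results[corner] = result
        return result


def example_shader(corner: Corner) -> Position:
    """Stand-in for a domain shader; the formula is arbitrary."""
    u, v = corner
    return (float(u), float(v), float(u * v))


# Two quads sharing an edge: corners (1, 0) and (1, 1) appear in both.
quads: List[List[Corner]] = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(1, 0), (2, 0), (2, 1), (1, 1)],
]
cache = CornerCache(example_shader)
shaded = [[cache.evaluate(c) for c in quad] for quad in quads]
```

In this sketch the eight corner lookups trigger only six shader evaluations, because the two shared corners are served from the cache.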
Abstract:
One embodiment provides a graphics processor comprising a set of processing resources configured to perform a supersampling operation via a mixed precision convolutional neural network, the set of processing resources including circuitry configured to receive, at an input block of a neural network model, history data, velocity data, and current frame data, pre-process the history data, velocity data, and current frame data to generate pre-processed data, provide the pre-processed data to a feature extraction network of the neural network model, process the pre-processed data at the feature extraction network via one or more encoder stages and one or more decoder stages, and generate an output image via an output block of the neural network model via direct reconstruction or kernel prediction.
Abstract:
A method, one or more non-transitory computer readable media, and an apparatus for implementing a reduced precision bounding volume hierarchy ray traversal for graphics processing are disclosed. The method includes the step of reusing, in a child node, a computation for a parent node in a reduced precision bounding volume hierarchy ray traversal for graphics processing. The computational cost of the reduced precision bounding volume hierarchy ray traversal can be reduced by reusing, in the child node, the computation for the parent node.
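One way such reuse might look is sketched below. This is an assumption, not the claimed method: it supposes that child boxes are stored as small integer offsets from the parent's minimum corner, so the slab-test term computed once for the parent can be reused by every child. The names parent_terms and child_hit, the quantization layout, and the omission of conservative rounding are all illustrative choices.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Quant3 = Tuple[int, int, int]  # hypothetical low-precision integer offsets


def parent_terms(parent_min: Vec3, scale: Vec3, origin: Vec3,
                 inv_dir: Vec3) -> Tuple[Vec3, Vec3]:
    """Per-parent work: computed once, then shared by all children.

    A child plane sits at parent_min + q * scale, so its ray parameter is
    (parent_min - origin) * inv_dir + q * (scale * inv_dir).
    """
    base = tuple((parent_min[i] - origin[i]) * inv_dir[i] for i in range(3))
    step = tuple(scale[i] * inv_dir[i] for i in range(3))
    return base, step


def child_hit(base: Vec3, step: Vec3, qmin: Quant3, qmax: Quant3,
              t_max: float = float("inf")) -> bool:
    """Slab test for one child using only the reused parent terms."""
    t_near, t_far = 0.0, t_max
    for i in range(3):
        t0 = base[i] + qmin[i] * step[i]
        t1 = base[i] + qmax[i] * step[i]
        if t0 > t1:  # ray travels in the negative direction on this axis
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
    return t_near <= t_far


# Ray from (-1, 0.5, 0.5) along (1, 0.125, 0.125); inverse direction given
# directly and assumed finite (no zero direction components).
base, step = parent_terms(
    parent_min=(0.0, 0.0, 0.0),
    scale=(0.125, 0.125, 0.125),
    origin=(-1.0, 0.5, 0.5),
    inv_dir=(1.0, 8.0, 8.0),
)
hit_a = child_hit(base, step, qmin=(0, 0, 0), qmax=(8, 8, 8))    # box [0,1]^3
hit_b = child_hit(base, step, qmin=(0, 16, 0), qmax=(8, 24, 8))  # y in [2,3]
```

Here the ray enters the first child box and misses the second. Each child test needs only a multiply-add per plane on small integers, which is where the saving would come from under these assumptions.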
Abstract:
Joint denoising and supersampling of graphics data is described. An example of a graphics processor includes multiple processing resources, including at least a first processing resource including a pipeline to perform a supersampling operation; and the pipeline including circuitry to jointly perform denoising and supersampling of received ray tracing input data, the circuitry including first circuitry to receive input data associated with an input block for a neural network, second circuitry to perform operations associated with a feature extraction and kernel prediction network of the neural network, and third circuitry to perform operations associated with a filtering block of the neural network.
Abstract:
Graphics processors of the present design provide hierarchical open sectors and variable cache sizes for cache operations. In one embodiment, a graphics processor comprises a cache memory having a hierarchical open sector design including a first hierarchy of upper and lower regions with each region including a second hierarchy of sectors. A cache controller is configured to initially open a first sector of the lower region, to receive a memory request that does not match an address in the first sector, and to open a second sector of the lower region.
Abstract:
A graphics pipeline combines the benefits of decoupled sampling with deferred shading. In the rasterization phase, a shading point is computed for each sample. After rasterization is finished, the shading points are sorted to extract coherence, and groups of shading points are shaded. This enables high sampling rates with efficient reuse of shading, in addition to other unique benefits.
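The sort-then-shade flow above can be sketched as follows. The shading key (a hypothetical pair of material id and texel cell), the function name shade_deferred, and the toy shading function are assumptions made for illustration; the abstract does not say how shading points are keyed or grouped.

```python
from itertools import groupby
from typing import Callable, Dict, Hashable, List, Tuple

Sample = Tuple[int, Hashable]  # (sample id, shading-point key)


def shade_deferred(samples: List[Sample],
                   shade: Callable[[Hashable], int]) -> Tuple[Dict[int, int], int]:
    """Sort recorded shading points, then shade each group once."""
    # Sorting places samples that map to the same shading point next to
    # each other, which is what exposes the coherence.
    ordered = sorted(samples, key=lambda s: s[1])
    colors: Dict[int, int] = {}
    shade_calls = 0
    for key, group in groupby(ordered, key=lambda s: s[1]):
        color = shade(key)  # one shading evaluation per group
        shade_calls += 1
        for sample_id, _ in group:
            colors[sample_id] = color  # reuse the result for every sample
    return colors, shade_calls


# Rasterization output: five samples that map to three shading points.
samples = [(0, (1, 5)), (1, (0, 2)), (2, (1, 5)), (3, (0, 2)), (4, (0, 3))]
colors, shade_calls = shade_deferred(samples, lambda key: key[0] * 10 + key[1])
```

In this toy run five samples are covered by three shading evaluations, which is the kind of reuse the abstract refers to.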
Abstract:
Apparatus and method for efficient graphics processing including ray tracing. For example, one embodiment of a graphics processor comprises: execution hardware logic to execute graphics commands and render images; an interface to couple functional units of the execution hardware logic to a tiled resource; and a tiled resource manager to manage access by the functional units to the tiled resource, a functional unit of the execution hardware logic to generate a request with a hash identifier (ID) to request access to a portion of the tiled resource, wherein the tiled resource manager is to determine whether a portion of the tiled resource identified by the hash ID exists, and if not, to allocate a new portion of the tiled resource and associate the new portion with the hash ID.
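The lookup-or-allocate behavior of the tiled resource manager can be sketched in a few lines. The class name, the free-list policy, and the error raised when no tiles remain are assumptions for illustration; the abstract only states that a missing portion is allocated and associated with the hash ID.

```python
from typing import Dict, List


class TiledResourceManager:
    """Maps hash IDs to tile indices, allocating on first use (sketch)."""

    def __init__(self, tile_count: int) -> None:
        self._free: List[int] = list(range(tile_count))  # unallocated tiles
        self._by_hash: Dict[int, int] = {}               # hash ID -> tile

    def request(self, hash_id: int) -> int:
        tile = self._by_hash.get(hash_id)
        if tile is None:
            # No portion exists for this hash ID yet: allocate a new one
            # and associate it with the hash ID.
            if not self._free:
                raise MemoryError("no free tiles left")  # policy is assumed
            tile = self._free.pop(0)
            self._by_hash[hash_id] = tile
        return tile


manager = TiledResourceManager(tile_count=4)
first = manager.request(0xBEEF)   # allocates a tile
second = manager.request(0xCAFE)  # allocates a different tile
again = manager.request(0xBEEF)   # finds the existing tile
```

A repeated request with the same hash ID returns the tile allocated earlier, so functional units that agree on the ID share one portion of the resource.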
Abstract:
An apparatus and method for performing coarse pixel shading (CPS). For example, one embodiment of a method for CPS comprises: pre-processing a graphics mesh by creating a tangent-plane parameterization of desired vertex attributes for each vertex of the mesh; and performing rasterization of the mesh in a rasterization stage of a graphics pipeline using the tangent-plane parameterization.
Abstract:
The problem of generating high quality images with a rendering pipeline based on decoupled sampling may be addressed by generating non-extrapolated shading locations and by determining improved texture filtering footprints. This may be accomplished by performing shading at the center of a bounding box that bounds mapped shading samples.
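A minimal sketch of picking the shading location from the bounding box follows. The function name and the 2D sample coordinates are hypothetical, and using the box extent as the filtering footprint is an assumption added for illustration; the abstract only states that shading is performed at the center of the bounding box.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def shading_location(mapped_samples: List[Point]) -> Tuple[Point, Point]:
    """Return (center, extent) of the box bounding the mapped samples."""
    xs = [p[0] for p in mapped_samples]
    ys = [p[1] for p in mapped_samples]
    lo = (min(xs), min(ys))
    hi = (max(xs), max(ys))
    # The center lies between the extreme samples on each axis, so the
    # shading location is never extrapolated outside their range.
    center = ((lo[0] + hi[0]) / 2.0, (lo[1] + hi[1]) / 2.0)
    # Assumption: the box extent serves as a texture filtering footprint.
    extent = (hi[0] - lo[0], hi[1] - lo[1])
    return center, extent


center, extent = shading_location([(2.0, 1.0), (4.0, 1.0), (3.0, 5.0)])
```

For the three samples shown, the box spans x from 2 to 4 and y from 1 to 5, giving a center of (3.0, 3.0) and an extent of (2.0, 4.0).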