Abstract:
A graphics processing subsystem and method for computing a 3D clipmap. One embodiment of the subsystem includes: (1) a renderer operable to render a primitive surface representable by a 3D clipmap, (2) a geometry shader (GS) configured to select respective major-plane viewports for a plurality of clipmap levels, the major-plane viewports being sized to represent full spatial extents of the 3D clipmap relative to a render target (RT) for the plurality of clipmap levels, (3) a rasterizer configured to employ the respective major-plane viewports and the RT to rasterize a projection of the primitive surface onto a major plane corresponding to the respective major-plane viewports into pixels representing fragments of the primitive surface for each of the plurality of clipmap levels, and (4) a plurality of pixel shader (PS) instances configured to transform the fragments into respective voxels in the plurality of clipmap levels, thereby voxelizing the primitive surface.
Abstract:
In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images. In operation, the parallel processor causes execution threads to execute a task shading program on an input mesh to generate a task shader output specifying a mesh shader count. The parallel processor then generates mesh shader identifiers, where the total number of the mesh shader identifiers equals the mesh shader count. For each mesh shader identifier, the parallel processor invokes a mesh shader based on the mesh shader identifier and the task shader output to generate geometry associated with the mesh shader identifier. Subsequently, the parallel processor performs operations on the geometries associated with the mesh shader identifiers to generate a rendered image. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
Abstract:
A computing system and method for representing volumetric data for a scene. One embodiment of the computing system includes: (1) a memory configured to store a three-dimensional (3D) clipmap data structure having at least one clip level and at least one mip level, and (2) a processor configured to generate voxelized data for a scene and cause the voxelized data to be stored in the 3D clipmap data structure.
Abstract:
A graphics processing subsystem and method for computing a 3D clipmap. One embodiment of the subsystem includes: (1) a renderer operable to render a primitive surface representable by a 3D clipmap, (2) a geometry shader (GS) configured to select respective major-plane viewports for a plurality of clipmap levels, the major-plane viewports being sized to represent full spatial extents of the 3D clipmap relative to a render target (RT) for the plurality of clipmap levels, (3) a rasterizer configured to employ the respective major-plane viewports and the RT to rasterize a projection of the primitive surface onto a major plane corresponding to the respective major-plane viewports into pixels representing fragments of the primitive surface for each of the plurality of clipmap levels, and (4) a plurality of pixel shader (PS) instances configured to transform the fragments into respective voxels in the plurality of clipmap levels, thereby voxelizing the primitive surface.
Abstract:
A computing system and method for representing volumetric data for a scene. One embodiment of the computing system includes: (1) a memory configured to store a three-dimensional (3D) clipmap data structure having at least one clip level and at least one mip level, and (2) a processor configured to generate voxelized data for a scene and cause the voxelized data to be stored in the 3D clipmap data structure.
Abstract:
A Displaced Micro-mesh (DMM) primitive enables high complexity geometry for ray and path tracing while minimizing the associated builder costs and preserving high efficiency. A structured, hierarchical representation implicitly encodes vertex positions of a triangle micro-mesh based on a barycentric grid, and enables microvertex displacements to be encoded efficiently (e.g., as scalars linearly interpolated between minimum and maximum triangle surfaces). The resulting displaced micro-mesh primitive provides a highly compressed representation of a potentially vast number of displaced microtriangles that can be stored in a small amount of space. Improvements in ray tracing hardware permit automatic processing of such primitive for ray-geometry intersection testing by ray tracing circuits without requiring intermediate reporting to a shader.
Abstract:
A Displaced Micro-mesh (DMM) primitive enables high complexity geometry for ray and path tracing while minimizing the associated builder costs and preserving high efficiency. A structured, hierarchical representation implicitly encodes vertex positions of a triangle micro-mesh based on a barycentric grid, and enables microvertex displacements to be encoded efficiently (e.g., as scalars linearly interpolated between minimum and maximum triangle surfaces). The resulting displaced micro-mesh primitive provides a highly compressed representation of a potentially vast number of displaced microtriangles that can be stored in a small amount of space. Improvements in ray tracing hardware permit automatic processing of such primitive for ray-geometry intersection testing by ray tracing circuits without requiring intermediate reporting to a shader.
Abstract:
The present technology augments the GPU compute model to provide system-provided data marshalling characteristics of graphics pipelining to increase efficiency and reduce overhead. A simple scheduling model based on scalar counters (e.g., semaphores) abstract the availability of hardware resources. Resource releases can be done programmatically, and a system scheduler only needs to track the states of such counters/semaphores to make work launch decisions. Semantics of the counters/semaphores are defined by an application, which can use the counters/semaphores to represent the availability of free space in a memory buffer, the amount of cache pressure induced by the data flow in the network, or the presence of work items to be processed.
Abstract:
In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images via a shading program. In operation, the parallel processor causes a first set of execution threads to execute the shading program on a first portion of the input mesh to generate first geometry stored in an on-chip memory. The parallel processor also causes a second set of execution threads to execute the mesh shading program on a second portion of the input mesh to generate second geometry stored in the on-chip memory. Subsequently, the parallel processor reads the first geometry and the second geometry from the on-chip memory, and performs operations on the first geometry and the second geometry to generate a rendered image derived from the input mesh. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.