摘要:
An API is provided to feed multiple data objects, wherever originated or located at the time of operation, to a 3D graphics chip simultaneously or in parallel. Multiple containers may be fed to a 3D graphics chip memory at the same time. In the case where data is being transmitted to a graphics chip memory, wherein the data includes the same spatial position of pixel(s), but only the orientation or color is changing, the data may be loaded into two separate containers, with a header description understood by the graphics chip and implemented by the graphics API, whereby a single copy of the position data can be loaded into one container, and the changing color or orientation data may be loaded into a second container. Thus, when received by the graphics chip, the data is loaded correctly into register space and processed according to the header description.
摘要:
Systems and methods for downloading algorithmic elements to a coprocessor and corresponding processing and communication techniques are provided. For an improved graphics pipeline, the invention provides a class of co-processing device, such as a graphics processor unit (GPU), providing improved capabilities for an abstract or virtual machine for performing graphics calculations and rendering. The invention allows for runtime-predicated flow control of programs downloaded to coprocessors, enables coprocessors to include indexable arrays of on-chip storage elements that are readable and writable during execution of programs, provides native support for textures and texture maps and corresponding operations in a vertex shader, provides frequency division of vertex streams input to a vertex shader with optional support for a stream modulo value, provides a register storage element on a pixel shader and associated interfaces for storage associated with representing the “face” of a pixel, provides vertex shaders and pixel shaders with more on-chip register storage and the ability to receive larger programs than any existing vertex or pixel shaders and provides 32 bit float number support in both vertex and pixel shaders.
摘要:
A three-dimensional API for communicating with hardware implementations of vertex shaders and pixel shaders having local registers. With respect to vertex shaders, API communications are provided that may make use of an on-chip register index and API communications are also provided for a specialized function, implemented on-chip at a register level, that outputs the fractional portion(s) of input(s). With respect to pixel shaders, API communications are provided for a specialized function, implemented on-chip at a register level, that performs a linear interpolation function and API communications are provided for specialized modifiers, also implemented on-chip at a register level, that perform modification functions including negating, complementing, remapping, stick biasing, scaling and saturating. Advantageously, these API communications expose these very useful on-chip graphical algorithmic elements to a developer while hiding the details of the operation of the vertex shader and pixel shader chips from the developer.
摘要:
An API is provided to feed multiple data objects, wherever originated or located at the time of operation, to a 3D graphics chip simultaneously or in parallel. Multiple containers may be fed to a 3D graphics chip memory at the same time. In the case where data is being transmitted to a graphics chip memory, wherein the data includes the same spatial position of pixel(s), but only the orientation or color is changing, the data may be loaded into two separate containers, with a header description understood by the graphics chip and implemented by the graphics API, whereby a single copy of the position data can be loaded into one container, and the changing color or orientation data may be loaded into a second container. Thus, when received by the graphics chip, the data is loaded correctly into register space and processed according to the header description.
摘要:
A system for improved shadowing of images using a multiple pass, depth buffer approach includes rendering a scene from the perspective of a light source to construct a shadow depth map in a rasterization buffer. The system computes depth values for the two nearest geometric primitives to the light source for pixels, and stores these depth values in the rasterization buffer. Once the shadow map is constructed, it is stored in shared memory, where it can be retrieved for subsequent rendering passes. The two depth values for each element in the shadow map can be used in combination with a global bias to eliminate self-shadowing artifacts and avoid artifacts in the terminator region. The system supports linear or higher order filtering of data from the shadow depth map to produce smoother transitions from shadowed and un-shadowed portions of an image. In addition, the system supports the re-use of the shadow map and shadowed images for more than one frame.
摘要:
A gsprite engine circuit reads a display list identifying gsprite image layers to be composited for display, retrieves gsprite image data from an external memory, and transforms the gsprite data to display device coordinates. The gsprite image layers represent independently rendered graphical objects in a graphics scene. The gsprite engine can simulate the motion of the graphical objects in a sequence of display images by performing affine transformations on the gsprite image layers. The interface to the gsprite engine circuit includes the display list and gsprite header blocks. The display list enumerates the gsprites to be composited as a display image. The header blocks describe a gsprite transform, which can be an affine transform, used to transform gsprites to display device coordinates. The header blocks also provide an array of references to image blocks or "chunks" comprising the gsprite.
摘要:
In a graphics rendering system, an apparatus for resolving depth sorted lists of pixel fragments includes color and alpha accumulators for computing color and alpha values from the pixel fragments in a fragment list. Pixel fragments include color, alpha, coverage data. The coverage data describes how a geometric primitive covers sub-pixel regions of a pixel using a coverage mask. Pixel circuitry according to a clock-optimized approach includes separate color and alpha accumulators for computing color and alpha values for sub-pixel regions of a pixel. The accumulated color values are then summed and scaled to compute final color values for a pixel. To reduce hardware requirements, pixel circuitry in a hardware-optimized approach recognizes that some pixel regions have common accumulated alpha values as each fragment layer is processed. As such, color contributions for fragment layers can be computed using a single color accumulation operation for a pixel region having common alpha values.
摘要:
An enhanced graphics pipeline is provided that enables common core hardware to perform as different components of the graphics pipeline, programmability of primitives including lines and triangles by a component in the pipeline, and a stream output before or simultaneously with the rendering a graphical display with the data in the pipeline. The programmer does not have to optimize the code, as the common core will balance the load of functions necessary and dynamically allocate those instructions on the common core hardware. The programmer may program primitives using algorithms to simplify all vertex calculations by substituting with topology made with lines and triangles. The programmer takes the calculated output data and can read it before or while it is being rendered. Thus, a programmer has greater flexibility in programming. By using the enhanced graphics pipeline, the programmer can optimize the usage of the hardware in the pipeline, program vertex, line or triangle topologies altogether rather than each vertex alone, and read any calculated data from memory where the pipeline can output the calculated information.
摘要:
Systems and methods for downloading algorithmic elements to a coprocessor and corresponding processing and communication techniques are provided. For an improved graphics pipeline, the invention provides a class of co-processing device, such as a graphics processor unit (GPU), providing improved capabilities for an abstract or virtual machine for performing graphics calculations and rendering. The invention allows for runtime-predicated flow control of programs downloaded to coprocessors, enables coprocessors to include indexable arrays of on-chip storage elements that are readable and writable during execution of programs, provides native support for textures and texture maps and corresponding operations in a vertex shader, provides frequency division of vertex streams input to a vertex shader with optional support for a stream modulo value, provides a register storage element on a pixel shader and associated interfaces for storage associated with representing the “face” of a pixel, provides vertex shaders and pixel shaders with more on-chip register storage and the ability to receive larger programs than any existing vertex or pixel shaders and provides 32 bit float number support in both vertex and pixel shaders.
摘要:
An enhanced graphics pipeline is provided that enables common core hardware to perform as different components of the graphics pipeline, programmability of primitives including lines and triangles by a component in the pipeline, and a stream output before or simultaneously with the rendering a graphical display with the data in the pipeline. The programmer does not have to optimize the code, as the common core will balance the load of functions necessary and dynamically allocate those instructions on the common core hardware. The programmer may program primitives using algorithms to simplify all vertex calculations by substituting with topology made with lines and triangles. The programmer takes the calculated output data and can read it before or while it is being rendered. Thus, a programmer has greater flexibility in programming. By using the enhanced graphics pipeline, the programmer can optimize the usage of the hardware in the pipeline, program vertex, line or triangle topologies altogether rather than each vertex alone, and read any calculated data from memory where the pipeline can output the calculated information.