摘要:
A 3D rendering texture caching scheme that minimizes external bandwidth requirements for texture and increases the rate at which textured pixels are available. The texture caching scheme efficiently pre-fetches data at the main memory access granularity and stores it in cache memory. The data in the main memory and texture cache memory is organized in a manner to achieve large reuse of texels with a minimum of cache memory to minimize cache misses. The texture main memory stores a two dimensional array of texels, each texel having an address and one of N identifiers. The texture cache memory has addresses partitioned into N banks, each bank containing texels transferred from the main memory that have the corresponding identifier. A cache controller determines which texels need to be transferred from the texture main memory to the texture cache memory and which texels are currently in the cache using a least most recently used algorithm. By labeling the texture map blocks (double quad words), a partitioning scheme is developed which allow the cache controller structure to be very modular and easily realized. The texture cache arbiter is used for scheduling and controlling the actual transfer of texels from the texture main memory into the texture cache memory and controlling the outputting of texels for each pixel to an interpolating filter from the cache memory.
摘要:
A rasterizer comprised of a bounding box calculator, a plane converter, a windower, and incrementers. For each polygon to be processed, a bounding box calculation is performed which determines the display screen area, in spans, that totally encloses the polygon and passes the data to the plane converter. The plane converter also receives as input attribute values for each vertex of the polygon. The plane converter computes planar coefficients for each attribute of the polygon, for each of the edges of the polygon. The plane converter unit computes the start pixel center location at a start span and a starting coefficient value at that pixel center. The computed coefficients also include the rate of change or gradient, for each polygon attribute in the x and y directions, respectively. The plane converter also computes line coefficients for each of the edges of the polygon. Line equation values are passed through to the windower where further calculations allow the windower to determine which spans are either covered or intersected by the polygon. The incrementers receive the span coverage data from the windower in addition to receiving planar coefficient values from the plane converter. The incrementers utilize the data from both the windower and plane converter to walk or traverse the polygon in those intersected spans, pixel by pixel. As the incrementer visits each pixel, vertex attribute values are interpolated to each pixel.
摘要:
A method and system for rendering a feature, such as a line, for display on an array of pixels. With this method, the line is identified on the pixel array, the line is expanded into a polygon, and color values are determined for the pixels within the polygon. Also, an antialiasing region is identified in the polygon, and blend values are computed for the pixels in this antialiasing region. Then, the color values determined for the pixels in the antialiasing region are modified as a function of these computed blend values. The pixels in the antialiasing region may then be shown at their modified color values, while the pixels that are in the polygon but not in the antialising region may be shown at their original determined color value. Preferably, the blend values for the pixels in the antialiasing region are calculated as a function of the locations of the pixels in that region. For example, the blend value for each of these pixels may be calculated as a function of four values, each one representing the Manhattan distance from the pixel to a respective one of the edges of the polygon. Also, preferably the antialiasing region has a uniform width, and this region extends inward from side edges and outward from end edges of the formed polygon.
摘要:
Method and apparatus for rendering texture to an object to be displayed on a pixel screen display. This technique makes use of linear interpolation between perspectively correct texture address to calculate rates of change of individual texture addresses components to determine a selection of the correct LOD map to use and intermediate texture addresses for pixels of the object between the perspectively correct addresses. The method first determines perspectively correct texture address values associated with four corners of a predefined span or grid of pixels. Then, a linear interpolation technique is implemented to calculate a rate of change of texture address components in the screen x and y directions for pixels between the perspectively bound span corners. This linear interpolation technique is performed in both screen directions to thereby create a potentially unique level of detail value for each pixel, which is then used as an index to select the correct pre-filtered LOD texture map. When mapping an individually determined LOD value per pixel, the effect of producing undesirable artifacts that may appear if a single LOD for an entire span or polygon is used, is obviated.
摘要:
Method and apparatus for rendering texture to an object to be displayed on a pixel screen display. This technique makes use of linear interpolation between perspectively correct texture address to calculate rates of change of individual texture addresses components to determine a selection of the correct LOD map to use and intermediate texture addresses for pixels of the object between the perspectively correct addresses. The method first determines perspectively correct texture address values associated with four corners of a predefined span or grid of pixels. Then, a linear interpolation technique is implemented to calculate a rate of change of texture address components in the screen x and y directions for pixels between the perspectively bound span corners. This linear interpolation technique is performed in both screen directions to thereby create a potentially unique level of detail value for each pixel, which is then used as an index to select the correct pre-filtered LOD texture map. When mapping an individually determined LOD value per pixel, the effect of producing undesirable artifacts that may appear if a single LOD for an entire span or polygon is used, is obviated.
摘要:
A computationally efficient method for minimizing the visible effects of texture LOD transitions across a polygon. The minimization is accomplished by adding a dithering offset value to the LOD value computed for each pixel covered by a graphics primitive to produce a dithered pixel LOD value. The dithering offsets mat be generated from a table look-up based on the location of the pixel within a span of pixels. The dithered pixel LOD value is used to as an index in the selection of a single LOD texture map from which a textured pixel value is retrieved. The range of dithering offset values can be adjusted by modulating the values in the table look-up.
摘要:
In accordance with some embodiments, a scatter/gather memory approach may be enabled that is exposed or backed by system memory and uses conventional tags and addresses. Thus, such a technique may be more amenable to conventional software developers and their conventional techniques.
摘要:
In some embodiments, a method includes receiving a request to generate a thread and supplying a request to a queue in response at least to the received request. The method may further include fetching a plurality of instructions in response at least in part to the request supplied to the queue and executing at least one of the plurality of instructions. In some embodiments, an apparatus includes a storage medium having stored therein instructions that when executed by a machine result in the method. In some embodiments, an apparatus includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request. In some embodiments, a system includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request, and a memory unit to store at least one instruction for the thread.
摘要:
A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
摘要:
A technique to enable information sharing among agents within different cache coherency domains. In one embodiment, a graphics device may use one or more caches used by one or more processing cores to store or read information, which may be accessed by one or more processing cores in a manner that does not affect programming and coherency rules pertaining to the graphics device.