Abstract:
A circuit and method for iterative generation of the variables used in vector generation and linear interpolation. Most significant bits are added in a last pipeline stage. Less significant bits are added in earlier pipeline stages. Breaking addition into multiple parts with each part having fewer bits to add enables a faster iterative cycle rate compared to a single long adder. Part of the vector generation algorithm requires a decision step based on the sign of the complete addition. Since this sign is generated in the last stage of the pipeline, it is not available at the time needed by earlier stages of the pipeline. Therefore, all possible combinations of outcomes for earlier pipeline stages are simultaneously speculatively computed for use by following pipeline stages.
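The split addition with speculation can be illustrated with a small functional model. This is a minimal sketch, not the patented circuit: the 16-bit width, the 8-bit split point, the two-way sign decision, and all function names are assumptions made for illustration, and the model is not cycle-accurate. The early stage adds the low bits for both possible outcomes of the sign decision; the last stage, which owns the sign bit, selects one speculative result and completes the high-order addition.

```python
WIDTH = 16                      # assumed total width of the error term
LOW_BITS = 8                    # assumed split between early and last stage
HIGH_BITS = WIDTH - LOW_BITS
MASK = (1 << WIDTH) - 1
LOW_MASK = (1 << LOW_BITS) - 1
HIGH_MASK = (1 << HIGH_BITS) - 1


def to_signed(value):
    """Interpret a WIDTH-bit value as two's complement."""
    value &= MASK
    return value - (1 << WIDTH) if value >> (WIDTH - 1) else value


def reference_step(err, inc_if_neg, inc_if_pos):
    """Single long adder: choose the increment by sign, then add."""
    inc = inc_if_neg if to_signed(err) < 0 else inc_if_pos
    return (err + inc) & MASK


def speculative_step(err, inc_if_neg, inc_if_pos):
    """Two short adders; the low sums are computed for both outcomes."""
    err &= MASK
    lo, hi = err & LOW_MASK, err >> LOW_BITS

    # Early stage: the sign is not known here, so compute both low sums.
    # Index 0 holds the non-negative case, index 1 the negative case.
    speculative = []
    for inc in (inc_if_pos, inc_if_neg):
        total = lo + (inc & LOW_MASK)
        speculative.append((total & LOW_MASK, total >> LOW_BITS))

    # Last stage: the sign bit lives in the high part, so decide here.
    negative = hi >> (HIGH_BITS - 1)
    inc = inc_if_neg if negative else inc_if_pos
    lo_sum, carry = speculative[negative]
    hi_sum = (hi + ((inc & MASK) >> LOW_BITS) + carry) & HIGH_MASK
    return (hi_sum << LOW_BITS) | lo_sum
```

Iterating both functions with Bresenham-style increments (for example `2*dy` and `2*(dy - dx)`) should produce identical error terms at every step, which is the property the shorter adders must preserve.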
Abstract:
Methods for texture mapping graphics primitives in a graphics pipeline architecture system. The methods utilize rectangular box filters to down-sample original texture maps, thereby optimizing the trade-off between aliasing and blurring when graphics primitives have a two-dimensional texture mapped to a three-dimensional object. The methods of texture mapping graphics primitives in a frame buffer graphics system comprise the steps of determining an original texture map of two dimensions for a surface, storing the original texture map in the frame buffer, sampling the original texture map independently using an asymmetrical filter to construct multiple versions of a texture and to address textured pixels on a display in the frame buffer graphics system, mapping the textured pixels to areas on the frame buffer, and displaying the textured graphics primitives on the display. A technique for addressing textured pixels stored in a rectangular texture (RIP) map is also described.
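The idea of down-sampling each axis independently with a rectangular box filter can be sketched as follows. This is an illustrative sketch only: it assumes a single-channel texture stored as a list of rows with power-of-two dimensions, and the function names and the `(level_x, level_y)` keying are hypothetical, not taken from the abstract.

```python
def box_downsample(texture, step_x, step_y):
    """Average each step_x-by-step_y block of texels (rectangular box filter)."""
    height, width = len(texture), len(texture[0])
    result = []
    for y in range(0, height, step_y):
        row = []
        for x in range(0, width, step_x):
            block = [texture[y + j][x + i]
                     for j in range(step_y) for i in range(step_x)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def build_rip_map(texture):
    """Build every combination of horizontal and vertical reduction.

    Returns a dict keyed by (level_x, level_y); level (0, 0) is the
    original texture, and each level halves one axis independently.
    Assumes power-of-two width and height.
    """
    height, width = len(texture), len(texture[0])
    levels = {}
    step_y, level_y = 1, 0
    while step_y <= height:
        step_x, level_x = 1, 0
        while step_x <= width:
            levels[(level_x, level_y)] = box_downsample(texture, step_x, step_y)
            step_x *= 2
            level_x += 1
        step_y *= 2
        level_y += 1
    return levels
```

Because the two axes are filtered independently, a surface that is compressed strongly in one screen direction but not the other can use a version that is reduced along only that axis, rather than a uniformly reduced one.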
Abstract:
In a method and apparatus that ensures data consistency between an I/O channel and a processor, system software issues an instruction which causes the issuance of a transaction when notification of a DMA completion is received. The transaction instructs the I/O channel to enforce coherency and then responds back only after coherency has been ensured. Specifically, a DMA_SYNC transaction is broadcast to all I/O channels in the system. Responsive thereto, each I/O channel writes back to memory any modified lines in its cache that might contain DMA data for a DMA sequence that was reported by the system as completed. The I/O channels have a reporting means to indicate when this transaction is completed, so that the DMA_SYNC transaction does not have to complete in pipeline order. Thus, the I/O channel can issue new transactions before responding to the DMA_SYNC transaction.
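The write-back behavior triggered by the broadcast can be modeled in a few lines. This is a simplified sketch under stated assumptions: the `IOChannel` class, its dictionary-based cache, and `broadcast_dma_sync` are hypothetical names, and the model is sequential, so it does not show the out-of-order completion reporting described above.

```python
class IOChannel:
    """Toy I/O channel with a write-back cache (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # address -> (data, modified flag)

    def dma_write(self, address, data):
        # DMA data lands in the channel cache first and is marked modified.
        self.cache[address] = (data, True)

    def dma_sync(self, memory):
        # Write every modified line back to memory, then report completion.
        for address, (data, modified) in list(self.cache.items()):
            if modified:
                memory[address] = data
                self.cache[address] = (data, False)
        return True


def broadcast_dma_sync(channels, memory):
    """Send the sync to every channel and wait for all completion reports."""
    reports = [channel.dma_sync(memory) for channel in channels]
    return all(reports)
```

Until `broadcast_dma_sync` returns, a processor reading `memory` could still observe stale data, which is why the response is withheld until coherency has been ensured.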
Abstract:
The present invention is generally directed to a device including an asynchronous input/output (I/O) data cache. The device includes a single data storage area that is disposed in communication with both a system data bus and an I/O data bus. Similarly, the device includes an address storage area that is configured to store system addresses corresponding to data contemporaneously stored in the data storage area. The device further includes a first circuit configured to indicate validity status of data within the data storage area for immediate access from the I/O data bus. A similar, second circuit is also included and configured to indicate validity status of data within the data storage area for immediate access from the system data bus. In accordance with another aspect of the present invention, a method is provided for buffering or caching data in a shared relationship between a system data bus and an input/output (I/O) data bus, which includes the steps of providing a single data storage area in communication with both a system data bus and an I/O data bus, and providing a single address storage area configured to store system memory addresses corresponding to data contemporaneously stored in the data storage area. In accordance with the broad aspect of the invention, the method further replicates a portion of validation circuitry in both a system frequency domain and an I/O frequency domain. In this way, latency delays encountered when crossing a frequency domain boundary are encountered at times outside a critical path.
Abstract:
A computer graphics system includes a plurality of geometry accelerators for processing vertex data representative of graphics primitives and providing rendering data. The system includes a distributor responsive to a stream of vertex data for distributing to the geometry accelerators chunks of the vertex data for processing by the geometry accelerators to provide chunks of rendering data. The distributor generates an end of chunk bit indicative of the end of each of the chunks of vertex data. The system further includes a concentrator for receiving the chunks of rendering data from each of the geometry accelerators and for combining the chunks of rendering data into a stream of rendering data in response to end of chunk bits. The stream of rendering data and the stream of vertex data represent sequences of graphics primitives having the same order. A rasterizer generates pixel data representative of a graphics display in response to the stream of rendering data.
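The role of the end of chunk bit in preserving primitive order can be sketched in software. This is a minimal sketch with several assumptions: the abstract does not state the distribution policy, so round-robin assignment is assumed here, and the function names, the queue-based model, and the `transform` callback are hypothetical.

```python
from collections import deque


def distribute(vertices, num_accelerators, chunk_size):
    """Split the vertex stream into chunks, assumed round-robin.

    Each queued item is (vertex, end_of_chunk); the flag is set only on
    the last vertex of a chunk.
    """
    queues = [deque() for _ in range(num_accelerators)]
    target = 0
    for start in range(0, len(vertices), chunk_size):
        chunk = vertices[start:start + chunk_size]
        for index, vertex in enumerate(chunk):
            queues[target].append((vertex, index == len(chunk) - 1))
        target = (target + 1) % num_accelerators
    return queues


def accelerate(queue, transform):
    """Stand-in for a geometry accelerator; the flag passes through."""
    return deque((transform(vertex), flag) for vertex, flag in queue)


def concentrate(queues):
    """Merge accelerator outputs, switching source on each end of chunk."""
    stream, source = [], 0
    while queues[source]:
        data, end_of_chunk = queues[source].popleft()
        stream.append(data)
        if end_of_chunk:
            source = (source + 1) % len(queues)
    return stream
```

Since the concentrator visits the accelerators in the same order the distributor used and advances only on an end of chunk bit, the merged rendering stream keeps the order of the original vertex stream even though the chunks were processed in parallel.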
Abstract:
A graphics system uses a programmable tile size and shape supported by a frame buffer memory organization wherein (X, Y) pixel addresses map into regularly offset permutations on groups of RAM address and data line assignments. Changing the mapping of (X, Y) pixel addresses to RAM addresses for the groups changes the size and shape of the tiles. A pixel data/partial address multiplexing method based on programmable tile size reduces the number of interconnections between a pixel interpolator and the frame buffer. A programmable pipelined shifter allows the dynamic alteration of the mapping between bits of the RGB intensity values and the planes of the frame buffer into which those bits are stored, as well as allowing those values to be truncated to specified lengths. Tiles are cached. Tiles for RGB pixel values are cached in an RGB cache, while Z values are cached in a separate cache. The Z buffer for hidden surface removal need not be a full-size frame buffer, since a smaller portion of the frame buffer can, if need be, be used repeatedly. Updates to the color map are performed from a separate shadow RAM during vertical retrace. The shadow RAM is large enough to accommodate two copies of the color map, and can load them in automatic alternation, producing a blinking effect without the use of an additional plane of frame buffer memory.
Abstract:
The present invention is generally directed to a system and method for fetching data from a system memory to an ATM card. The method includes the steps of receiving a request (via a PCI bus) to fetch data from memory, and identifying the request as an ATM request. The method then determines, based on the start address, the number of cache lines that will be implicated by the fetch. Then, the method automatically fetches the appropriate number of cache lines into the cache, and then passes the data to the ATM card, via the PCI bus. In accordance with another aspect of the present invention, a system is provided for fetching data from memory for an ATM card. Broadly, the system includes a system memory for data storage and a cache memory for providing high-speed (retrieval) temporary storage of data, the cache memory being disposed in communication with the system memory via a high-speed system bus. The system further includes a PCI bus in communication with the cache memory via an input/output (I/O) bus. A first mechanism is configured to identify a fetch for data from memory to the PCI bus by an ATM card. A second mechanism is configured to determine the number of lines of the cache memory that will be implicated by the identified fetch. Finally, a third mechanism is configured to automatically fetch the appropriate number of lines from the cache memory and to pass the data to the PCI bus.
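The step that derives the number of cache lines from the start address reduces to simple address arithmetic. This is a sketch under assumptions: the abstract gives neither the fetch length nor the cache line size, so the 48-byte default (the size of an ATM cell payload) and the 32-byte line size are illustrative values only, and the function name is hypothetical.

```python
def cache_lines_for_fetch(start_address, length=48, line_size=32):
    """Count the cache lines touched by a fetch of `length` bytes.

    Defaults are assumptions for illustration: 48 bytes matches an ATM
    cell payload, and 32 bytes is an arbitrary cache line size.
    """
    if length <= 0:
        return 0
    first_line = start_address // line_size
    last_line = (start_address + length - 1) // line_size
    return last_line - first_line + 1
```

With these assumed sizes, a fetch starting on a line boundary touches two lines, while one starting 24 bytes into a line spills into a third, which is why the count depends on the start address and not only on the length.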
Abstract:
A graphics system uses a programmable tile size and shape supported by a frame buffer memory organization wherein (X, Y) pixel addresses map into regularly offset permutations on groups of RAM address and data line assignments. This allows one RAM in each group to be accessed with a memory cycle in unison with one RAM in each other group, up to the number of groups. During such a memory cycle each RAM can receive a different address. A tile is the collection of pixel locations associated with a collection of addresses sent to the RAMs. Because of the regular nature of the permutations these locations may be regions bounded by a single boundary that may be rectangular and of varying size and shape. Changing the mapping of (X, Y) pixel addresses to RAM addresses for the groups changes the size and shape of the tiles. Tiles are cached. Tiles for RGB pixel values are cached in an RGB cache, while Z values are cached in a separate cache. Caching allows the principle of locality to substitute shorter bit-cycles to the cache for memory cycles to the frame buffer, resulting in improved memory throughput. A group rotator and associated group-sized shift register per bit-plane cooperate during refresh to reorder and serialize the pixels of sixteen-by-one tiles.