Abstract:
A texture processing pipeline is configured to store decoded texture data within a cache unit in order to expedite the processing of texture requests. When a texture request is processed, the texture processing pipeline queries the cache unit to determine whether the requested data is resident in the cache. If the data is not resident in the cache unit, a cache miss occurs. The texture processing pipeline then reads encoded texture data from global memory, decodes that data, and writes different portions of the decoded memory into the cache unit at specific locations according to a caching map. If the data is, in fact, resident in the cache unit, a cache hit occurs, and the texture processing pipeline then reads decoded portions of the requested texture data from the cache unit and combines those portions according to the caching map.
Abstract:
A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
Abstract:
A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
Abstract:
Approaches are disclosed for performing memory access operations in a texture processing pipeline having a first portion configured to process texture memory access operations and a second portion configured to process non-texture memory access operations. A texture unit receives a memory access request. The texture unit determines whether the memory access request includes a texture memory access operation. If the memory access request includes a texture memory access operation, then the texture unit processes the memory access request via at least the first portion of the texture processing pipeline, otherwise, the texture unit processes the memory access request via at least the second portion of the texture processing pipeline. One advantage of the disclosed approach is that the same processing and cache memory may be used for both texture operations and load/store operations to various other address spaces, leading to reduced surface area and power consumption.
Abstract:
A system, method, and computer program product for generating flow-control signals for a processing pipeline is disclosed. The method includes the steps of generating, by a first pipeline stage, a delayed ready signal based on a downstream ready signal received from a second pipeline stage and a throttle disable signal. A downstream valid signal is generated by the first pipeline stage based on an upstream valid signal and the delayed ready signal. An upstream ready signal is generated by the first pipeline stage based on the delayed ready signal and the downstream valid signal.
Abstract:
A texture processing pipeline can be configured to service memory access requests that represent texture data access operations or generic data access operations. When the texture processing pipeline receives a memory access request that represents a texture data access operation, the texture processing pipeline may retrieve texture data based on texture coordinates. When the memory access request represents a generic data access operation, the texture pipeline extracts a virtual address from the memory access request and then retrieves data based on the virtual address. The texture processing pipeline is also configured to cache generic data retrieved on behalf of a group of threads and to then invalidate that generic data when the group of threads exits.
Abstract:
Computer and graphics processing elements, connected generally in series, form a pipeline. Circuit elements known as di/dt throttles are inserted within the pipeline at strategic locations where the potential exists for data flow to transition from an idle state to a maximum data processing rate. The di/dt throttles gently ramp the rate of data flow from idle to a typical level. Disproportionate current draw and the consequent voltage droop are thus avoided, allowing an increased frequency of operation to be realized.
Abstract:
A tag unit configured to manage a cache unit includes a coalescer that implements a set hashing function. The set hashing function maps a virtual address to a particular content-addressable memory unit (CAM). The coalescer implements the set hashing function by splitting the virtual address into upper, middle, and lower portions. The upper portion is further divided into even-indexed bits and odd-indexed bits. The even-indexed bits are reduced to a single bit using a XOR tree, and the odd-indexed are reduced in like fashion. Those single bits are combined with the middle portion of the virtual address to provide a CAM number that identifies a particular CAM. The identified CAM is queried to determine the presence of a tag portion of the virtual address, indicating a cache hit or cache miss.
Abstract:
A system, method, and computer program product for generating flow-control signals for a processing pipeline is disclosed. The method includes the steps of generating, by a first pipeline stage, a delayed ready signal based on a downstream ready signal received from a second pipeline stage and a throttle disable signal. A downstream valid signal is generated by the first pipeline stage based on an upstream valid signal and the delayed ready signal. An upstream ready signal is generated by the first pipeline stage based on the delayed ready signal and the downstream valid signal.