摘要:
An apparatus and method for enabling a cache controller and address and data buses of a microprocessor with an on-board cache to provide a SRAM test mode for testing the on-board cache. Upon assertion of a SRAM test signal to a SRAM test pin on the microprocessor chip, the cache and bus controllers cease normal functionality and permit data to be written to, and read from, individual addresses within the on-board cache as though the on-board cache is simple SRAM. After the chip is reset, standard SRAM tests can then be implemented by reading and writing data to selected cache memory addresses as though the cache memory were SRAM. Upon completion of the tests, the SRAM test signal is deasserted and the cache and bus controllers resume normal operating functionality. A reset signal is then applied to the microprocessor to reinitialize control logic employed within the microprocessor. In this way, cache memory on-board a microprocessor can be tested using standard SRAM testing algorithms and equipment thereby eliminating a need for specialized test equipment to test cache memory contained on a microprocessor chip.
摘要:
In a memory system having a main memory and a faster cache memory, a cache memory replacement scheme with a locking feature is provided. Locking bits associated with each line in the cache are supplied in the tag table. These locking bits are preferably set and reset by the application program/process executing and are utilized in conjunction with cache replacement bits by the cache controller to determine the lines in the cache to replace. The lock bits and replacement bits for a cache line are "ORed" to create a composite bit for the cache line. If the composite bit is set the cache line is not removed from the cache. When deadlock due to all composite bits being set will result, all replacement bits are cleared. One cache line is always maintained as non-lockable. The locking bits "lock" the line of data in the cache until such time when the process resets the lock bit. By providing that the process controls the state of the lock bits, the intelligence and knowledge the process contains regarding the frequency of use of certain memory locations can be utilized to provide a more efficient cache.
摘要:
A caching arrangement which can work efficiently in a superscaler and multiprocessing environment includes separate caches for instructions and data and a single translation lookaside buffer (TLB) shared by them. During each clock cycle, retrievals from both the instruction cache and data cache may be performed, one on the rising edge of the clock cycle and one on the falling edge. The TLB is capable of translating two addresses per clock cycle. Because the TLB is faster than accessing the tag arrays which in turn are faster than addressing the cache data arrays, virtual addresses may be concurrently supplied to all three components and the retrieval made in one phase of a clock cycle. When an instruction retrieval is being performed, snooping for snoop broadcasts may be performed for the data cache and vice versa. Thus, for every clock cycle, an instruction and data cache retrieval may be performed as well as snooping.
摘要:
A cache array, a cache tag and comparator unit and a cache multiplexor are provided to a cache memory. Each cache operation performed against the cache array, read or write, takes only half a clock cycle. The cache tag and comparator unit comprises a cache tag array, a cache miss buffer and control logic. Each cache operation performed against the cache tag array, read or write, also takes only half a clock cycle. The cache miss buffer comprises cache miss descriptive information identifying the current state of a cache fill in progress. The control logic comprises a plurality of combinatorial logics for performing tag match operations. In addition to standard tag match operations, the control logic also conditionally tag matches an accessing address against an address tag stored in the cache miss buffer. Depending on the results of the tag match operations, and further depending on the state of the current cache fill if the accessing address is part of the memory block frame of the current cache fill, the control logic provides appropriate signals to the cache array, the cache multiplexor, the main memory and the instruction/data destination. As a result, subsequent instruction/data requests that are part of a current cache fill in progress can be satisfied without having to wait for the completion of the current cache fill, thereby further reducing cache miss penalties and function unit idle time.
摘要:
Two independently accessible subdivided cache tag arrays and a cache control logic is provided to a set associative cache system. Each tag entry is stored in two subdivided cache tag arrays, a physical and a set tag array such that each physical tag array entry has a corresponding set tag array entry. Each physical tag array entry stores the tag addresses and control bits for a set of cache lines. The control bits comprise at least one validity bit indicating whether the data stored in the corresponding cache line is valid. Each set tag array entry stores the descriptive bits for a set of cache lines which consists of the most recently used (MRU) field identifying the most recently used cache lines of the cache set. Each subdivided tag array is provided with its own interface to enable each array to be accessed concurrently but independently by the cache control logic which performs read and write operations against the cache. The cache control logic makes concurrent and independent accesses to the separate tag arrays to read and write the control and descriptive information in the tag entries. The accesses are grouped by type of operation to be performed and each type of accesses is made during predesignated time slots in an optimized manner to enable the cache control logic to perform certain selected read/write accesses to the physical tag array while performing other selected independent read/write accesses to the set tag array concurrently.
摘要:
An apparatus and method for controlling the initialization of shifting circuitry which provides column redundancy for multiple banks of cache memory on-board a microprocessor. Upon sensing deassertion of a reset signal, a master controller supplies non-overlapping two phase clock signals to one bank controller for each bank of the cache memory. Each bank has a set of fuses which supply a bank shift location to the bank controller indicating the location of a bad column in the bank. The master controller also activates a pre-loadable counter which provides each bank controller with a signal which counts down to zero from half the maximum number of columns in a bank. Each bank controller then provides the shifting signals necessary to initialize the shifting circuitry for its bank. In this way, defective columns located in different positions in each bank can be replaced by redundant paths, thereby repairing the cache and increasing the manufacturing yield of microprocessors with an on-board cache memory.
摘要:
A method and apparatus for maintaining cache coherency in a multiprocessor system having a plurality of processors and a shared main memory. Each of the plurality of processors is coupled to at least one cache unit and a store buffer. The method comprises the steps of writing by a first cache unit to its first store buffer a dirty line when the first cache unit experiences a cache miss; gaining control of the bus by the first cache unit; reading a new line from the share main memory by the first cache unit through the bus; writing the dirty line to the shared main memory if the bus is available to the first cache unit and if not available, the first cache unit checking snooping by a second cache unit from a second processor; comparing an address from the second cache unit with the tag of the dirty line, wherein the tag is stored in content-addressable memory coupled to the store buffer and if there is a hit, then supplying the dirty line to the second cache unit for updating.
摘要:
A system and method of selecting a level of detail in a texture-mapping system. Pixels are processed in a zig-zag traversal pattern to allow determination of vertical and horizontal change values in texture map coordinates. In this manner, accurate level of detail selection is achieved without unduly reducing efficiency or throughput of the graphics system.
摘要:
A system and method of rendering polygons in graphics system using incremental iterative addition in place of complex division operations. A setup engine provides relevant values to edge and span walk modules for rapid processing and rendering of polygon characteristics including material values. Characteristic functions are iterated with respect to polygon area and along individual spans to derive values for each pixel therein.
摘要:
Performing graphics rendering without the computational expense of hither plane clipping and with only a minimum of display image clipping. Where a three dimensional polygon crosses to both sides of a hither plane, any vertices on the back side of the hither plane are translated to the hither plane, producing polygons which occupy only the area in front of the hither plane. A display image memory, from which display images are generated, is located within a larger guard memory such that many images which would need to be clipped to fit in the display image memory may be written to the guard memory without clipping.