摘要:
A data caching system and method includes a data store for caching data from a main memory, a primary tag array for holding tags associated with data cached in the data store, and a duplicate tag array which holds copies of the tags held in the primary tag array. The duplicate tag array is accessible by functions, such as external memory cache probes, such that the primary tag remains available to the processor core. An address translator maps virtual page addresses to physical page address. In order to allow a data caching system which is larger than a page size, a portion of the virtual page address is used to index the tag arrays and data store. However, because of the virtual to physical mapping, the data may reside in any of a number of physical locations. During an internally-generated memory access, the virtual address is used to look up the cache. If there is a miss, other combinations of values are substituted for the virtual bits of the tag array index. For external probes which provide physical addresses to the duplicate tag array, combinations of values are appended to the index portion of the physical address. Tag array lookups can be performed either sequentially, or in parallel.
摘要:
A technique for implementing load-locked and store-conditional instruction primitives by using a local cache for information about exclusive ownership. The valid bit in particular provides information to properly execute load-locked and store-conditional instructions without the need for lock flag or local lock address registers for each individual locked address. Integrity of locked data is accomplished by insuring that load-locked and store-conditional instructions are processed in order, that no internal agents can evict blocks from a local cache as a side effect as their processing, that external agents update the context of cache memories first using invalidating probe commands, and that only non-speculative instructions are permitted to generate external commands.
摘要:
Use of an internal processor data bus is maximized in a system where external transactions may occur at a rate which is fractionally slower than the rate of the internal transactions. The technique inserts a selectable delay element in the signal path during an external operation such as a cache fill operation. The one cycle delay provides a time slot in which an internal operation, such as a load from an internal cache, may be performed. This technique therefore permits full use of the time slots on the internal data bus. It can, for, example, allow load operations to begin at a much earlier time than would otherwise be possible in architectures where fill operations can consume multiple bus time slots.
摘要:
A data caching system comprises a hashing function, a data store, a tag array, a page translator, a comparator and a duplicate tag array. The hashing function combines an index portion of a virtual address with a virtual page portion of the virtual address to form a cache index. The data store comprises a plurality of data blocks for holding data. The tag array comprises a plurality of tag entries corresponding to the data blocks, and both the data store and tag array are addressed with the cache index. The tag array provides a plurality of physical address tags corresponding to physical addresses of data resident within corresponding data blocks in the data store addressed by the cache index. The page translator translates a tag portion of the virtual address to a corresponding physical address tag. The comparator verifies a match between the physical address tag from the page translator and the plurality of physical address tags from the tag array, a match indicating that data addressed by the virtual address is resident within the data store. Finally, the duplicate tag array resolves synonym issues caused by hashing. The hashing function is such that addresses which are equivalent mod 213 are pseudo-randomly displaced within the cache. The preferred hashing function maps VA to bits of the cache index.
摘要:
A computing apparatus connectable to a cache and a memory, includes a system port configured to receive an atomic probe command or a system data control response command having an address part identifying data stored in the cache which is associated with data stored in the memory and a next coherence state part indicating a next state of the data in the cache. The computing apparatus further includes an execution unit configured to execute the command to change the state of the data stored in the cache according to the next coherence state part of the command.
摘要:
A computer system includes an external unit governing a cache which generates a set-dirty request as a function of a coherence state of a block in the cache to be modified. The external unit modifies the block of the cache only if an acknowledgment granting permission is received from a memory management system responsive to the set-dirty request. The memory management system receives the set-dirty request, determines the acknowledgment based on contents of the plurality of caches and the main memory according to a cache protocol and sends the acknowledgment to the external unit in response to the set-dirty request. The acknowledgment will either grant permission or deny permission to set the block to the dirty state.
摘要:
A multiprocessor system includes a plurality of processors, each processor having one or more caches local to the processor, and a memory controller connectable to the plurality of processors and a main memory. The memory controller manages the caches and the main memory of the multiprocessor system. A processor of the multiprocessor system is configurable to evict from its cache a block of data. The selected block may have a clean coherence state or a dirty coherence state. The processor communicates a notify signal indicating eviction of the selected block to the memory controller. In addition to sending a write victim notify signal if the selected block has a dirty coherence state, the processor sends a clean victim notify signal if the selected block has a clean coherence state.
摘要:
A processor of a multiprocessor system is configured to transmit a full probe to a cache associated with the processor to transfer data from the stored data of the cache. The data corresponding to the full probe is transferred during a time period. A first tag-only probe is also transmitted to the cache during the same time period to determine if the data corresponding to the tag-only probe is part of the stored data stored in the cache. A stream of probes accesses the cache in two stages. The cache is composed of a tag structure and a data structure. In the first stage, a probe is designated a tag-only probe and accesses the tag structure, but not the data structure, to determine tag information indicating a hit or a miss. In the second stage, if the probe returns tag information indicating a cache hit the probe is designated to be a full probe and accesses the data structure of the cache. If the probe returns tag information indicating a cache miss the probe does not proceed to the second stage.
摘要:
A computing apparatus has a mode selector configured to select one of a long-bus mode corresponding to a first memory size and a short-bus mode corresponding to a second memory size which is less than the first memory size. An address bus of the computing apparatus is configured to transmit an address consisting of address bits defining the first memory size and a subset of the address bits defining the second memory size. The address bus has N communication lines each configured to transmit one of a first number of bits of the address bits defining the first memory size in the long-bus mode and M of the N communication lines each configured to transmit one of a second number of bits of the address bits defining the second memory size in the short-bus mode, where M is less than N.
摘要:
According to the present invention a cache within a multiprocessor system is speculatively filled. To speculatively fill a designated cache, the present invention first determines an address which identifies information located in a main memory. The address may also identify one or more other versions of the information located in one or more caches. The process of filling the designated cache with the information is started by locating the information in the main memory and locating other versions of the information identified by the address in the caches. The validity of the information located in the main memory is determined after locating the other versions of the information. The process of filling the designated cache with the information located in the main memory is initiated before determining the validity of the information located in main memory. Thus, the memory reference is speculative.