摘要:
A method for controlling access to at least one external cache memory in a processing system, the at least one external cache memory having a number of lines of data and a number of bytes per line of data, the method includes determining a smallest cache memory size for use in the at least one external cache memory, and configuring a tag array of the at least one external cache memory to support the smallest determined cache memory size. A system for controlling access to at least one external cache memory in a processing system, the at least one external cache memory having a number of lines of data and a number of bytes per line of data, includes a circuit for configuring each tag field of a plurality of tag fields in a tag array in the at least one external cache memory to have a number of bits sufficient to support a smallest determined cache memory, and utilizing each tag field to determine whether data being accessed resides in the at least one external cache memory.
摘要:
A multi-processor data processing system has a multi-level cache wherein each processor has a split high level (e.g., level one or L1) cache composed of a data cache (DCache) and an instruction cache (ICache). A shared lower level (e.g., level two or L2) cache includes a cache array which is a superset of the cache lines in all L1 caches. There is a directory of L2 cache lines such that each line has a set of inclusion bits indicating if the line is residing in any of the L1 caches. A directory management system requires only N+2 inclusion bits per L2 line, where N is the number of processors having L1 caches sharing the L2 cache.
摘要:
An access hazard detection technique in a pipelined cache controller sustains high throughput in a frequently accessed cache but without the cost normally associated with such access hazard detection. If a previous request (request in the pipeline stages other than the first stage) has already resulted in a cache hit, and it matches the new request in both the Congruence Class Index and the Set Index fields and if the new request is also a hit, the address collision logic will signal a positive detection. This scheme makes use of the fact that (1) the hit condition, (2) the identical Congruence Class Index, and (3) the Set Index of two requests are sufficient to determine that they are referencing the same cache content. Implementation of this scheme results in a significant hardware saving and a significant performance boost.
摘要:
The present invention provides balanced cache performance in a data processing system. The data processing system includes a first processor, a second processor, a first cache memory, a second memory and a control circuit. The first processor is connected to the first cache memory, which serves as a first level cache for the first processor. The second processor and the first cache memory are connected to the second cache memory, which serves as a second level cache for the first processor and as a first level cache for the second processor. Replacement of a set in the second cache memory results in the set being invalidated in the first cache memory. The control circuit is connected to the second level cache and prevents replacing from a second level cache congruence class all sets that are in the first cache.
摘要:
A high performance shared cache is provided to support multiprocessor systems and allow maximum parallelism in accessing the cache by the processors, servicing one processor request in each machine cycle, reducing system response time and increasing system throughput. The shared cache of the present invention uses the additional performance optimization techniques of pipelining cache operations (loads and stores) and burst-mode data accesses. By including built-in pipeline stages, the cache is enabled to service one request every machine cycle from any processing element. This contributes to reduction in the system response time as well as the throughput. With regard to the burst-mode data accesses, the widest possible data out of the cache can be stored to, and retrieved from, the cache by one cache access operation. One portion of the data is held in logic in the cache (on the chip), while another portion (corresponding to the system bus width) gets transferred to the requesting element (processor or memory) in one cycle. The held portion of the data can then be transferred in the following machine cycle.
摘要:
A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.
摘要:
A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.
摘要:
A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.
摘要:
Methods for executing load instructions are disclosed. In one method, a load instruction and corresponding thread information are received. Address information of the load instruction is used to generate an address of the needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load and/or store instruction specifying the same address. If such a previous load/store instruction is found, the thread information is used to determine if the previous load/store instruction is from the same thread. If the previous load/store instruction is from the same thread, the cache hit signal is ignored, and the load instruction is stored in the queue. A load/store unit is also described.