Abstract:
A system and method are disclosed which determine in parallel, for multiple levels of a multi-level cache, whether any one of such levels is capable of satisfying a memory access request. Tags for multiple levels of the multi-level cache are accessed in parallel to determine whether the address for a memory access request is contained within any of the multiple levels. For instance, in a preferred embodiment, the tags for the first level of cache and the tags for the second level of cache are accessed in parallel. Additional levels of cache tags, up to N levels, may also be accessed in parallel with the first-level cache tags. Thus, by the end of the access of the first-level cache tags it is known whether a memory access request can be satisfied by the first level, the second level, or any of the additional N levels of cache that are accessed in parallel. Additionally, in a preferred embodiment, the multi-level cache is arranged such that the data array of a level of cache is accessed only if it is determined that such level of cache is capable of satisfying a received memory access request. Further, in a preferred embodiment, the multi-level cache is partitioned into N ways of associativity, and only a single way of a data array is accessed to satisfy a memory access request, thereby leaving the remaining ways of the data array free to satisfy other instructions and saving power and resources.
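As a rough illustration only (not the patented circuit), the following C sketch models the parallel tag probe: both levels' tag arrays are checked before any data array is touched, and only the single hitting way of one data array is then read. All names and the set/tag split are invented for the example.

```c
#include <stdint.h>
#include <stdbool.h>

#define WAYS 4
#define SETS 64

typedef struct {
    uint32_t tag[SETS][WAYS];
    bool     valid[SETS][WAYS];
    uint32_t data[SETS][WAYS];   /* data array, read only on a hit */
} level_t;

/* Probe one level's tag array; returns the hitting way or -1. */
static int probe_tags(const level_t *lvl, uint32_t set, uint32_t tag)
{
    for (int w = 0; w < WAYS; w++)
        if (lvl->valid[set][w] && lvl->tag[set][w] == tag)
            return w;
    return -1;
}

/* By the end of the tag access it is known which level, if any, can
 * satisfy the request; only the single hitting way of that level's
 * data array is then accessed. */
bool lookup(level_t *l1, level_t *l2, uint32_t addr, uint32_t *out)
{
    uint32_t set = (addr >> 4) % SETS;   /* 16-byte lines, 64 sets */
    uint32_t tag = addr >> 10;

    int w1 = probe_tags(l1, set, tag);   /* in hardware these two   */
    int w2 = probe_tags(l2, set, tag);   /* probes occur in parallel */

    if (w1 >= 0) { *out = l1->data[set][w1]; return true; }
    if (w2 >= 0) { *out = l2->data[set][w2]; return true; }
    return false;                        /* miss in both levels */
}
```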
Abstract:
The inventive cache uses a queuing structure which provides out-of-order cache memory access support for multiple accesses, as well as support for managing bank conflicts and address conflicts. The inventive cache can support four data accesses that hit per clock, one access that misses the L1 cache per clock, and one instruction access per clock. The responses are interspersed in the pipeline so that conflicts in the queue are minimized. Non-conflicting accesses are not inhibited; conflicting accesses are held up until the conflict clears. The inventive cache provides out-of-order support after the retirement stage of a pipeline.
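A minimal C sketch of the queued-issue idea, with invented types and a made-up bank function: entries whose bank or line address conflicts with an older queued access are held, while non-conflicting entries are free to issue out of order.

```c
#include <stdint.h>
#include <stdbool.h>

#define QLEN 8

typedef struct {
    uint32_t addr;
    bool     valid;      /* cleared when the access completes */
    bool     inflight;
} qentry_t;

static uint32_t bank_of(uint32_t addr) { return (addr >> 3) & 0x3; }

/* Issue up to max_issue entries this clock.  An entry is held only while
 * it conflicts with an older valid entry (same bank or same line); a
 * non-conflicting younger entry may issue ahead of a held older one. */
int issue_one_clock(qentry_t q[QLEN], uint32_t issued[], int max_issue)
{
    int n = 0;
    for (int i = 0; i < QLEN && n < max_issue; i++) {
        if (!q[i].valid || q[i].inflight)
            continue;
        bool conflict = false;
        for (int j = 0; j < i; j++) {
            if (q[j].valid &&
                (bank_of(q[j].addr) == bank_of(q[i].addr) ||
                 (q[j].addr >> 6) == (q[i].addr >> 6)))
                conflict = true;     /* hold until the conflict clears */
        }
        if (!conflict) {
            q[i].inflight = true;
            issued[n++] = q[i].addr;
        }
    }
    return n;
}
```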
Abstract:
A method, and a corresponding apparatus, mask error detection and correction latency during multilevel cache transfers. The method includes the steps of transferring error protection encoded data lines from a first cache; checking the error protection encoded data lines for errors, wherein the checking is completed after the transferring begins; receiving the error protection encoded data lines in a second cache; and, upon detecting an error in a data line, preventing further transfer of the data line from the second cache.
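A minimal C sketch of the latency-masking idea, assuming a toy checksum in place of a real error protection code and invented field names: the line is forwarded while its check is still outstanding, and a late-detected error marks the received copy unusable so it cannot be transferred further.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t data;
    uint8_t  ecc;
    bool     usable;    /* cleared if the late check reports an error */
} line_t;

/* Toy byte-XOR checksum; a real design would use a SEC-DED code. */
static uint8_t compute_ecc(uint64_t d)
{
    uint8_t p = 0;
    while (d) { p ^= (uint8_t)(d & 0xff); d >>= 8; }
    return p;
}

void transfer_line(const line_t *src, line_t *dst)
{
    /* Step 1: the transfer begins before the check completes. */
    dst->data   = src->data;
    dst->ecc    = src->ecc;
    dst->usable = true;

    /* Step 2 (conceptually overlapped with the transfer): the check
     * finishes late; on error, further transfer from dst is prevented. */
    if (compute_ecc(src->data) != src->ecc)
        dst->usable = false;
}
```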
Abstract:
Methods and apparatus mask the latency of error detection and/or error correction applied to data transferred between a first memory and a second memory. The method comprises determining whether there is an error in a data unit in the first memory; transferring data based on the data unit from the first memory to a second memory, wherein the transferring step commences before completion of the determining step; and disabling at least part of the second memory if the determining step detects an error in the data unit. The disabling step may be accomplished, for example, by disabling the buffering of an address of the data unit or stalling the second memory.
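A sketch of the address-buffering variant mentioned above, under the assumption (invented here) that the second memory's fill entry becomes visible only once its address is buffered: data streams in speculatively, and on a failed check the address is never recorded, or the memory is stalled instead.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t data;
    uint32_t addr;
    bool     addr_valid;  /* entry is invisible until the address is buffered */
    bool     stalled;
} fill_buf_t;

/* The transfer commences immediately; the error check (supplied as a
 * callback here) completes afterward and gates the address buffering. */
bool fill_second_memory(fill_buf_t *fb, uint32_t addr, uint64_t data,
                        bool (*check_ok)(uint64_t))
{
    fb->data       = data;        /* data moves before the check is done */
    fb->addr       = addr;
    fb->addr_valid = false;

    if (check_ok(data)) {
        fb->addr_valid = true;    /* lookups may now match this entry */
        return true;
    }
    fb->stalled = true;           /* alternative: stall the second memory */
    return false;
}
```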
Abstract:
The inventive cache processes multiple access requests simultaneously by using separate queuing structures for data and instructions. The inventive cache uses ordering mechanisms that guarantee program order when there are address conflicts and architectural ordering requirements. The queuing structures are snoopable by other processors of a multiprocessor system. The inventive cache has a tag access bypass around the queuing structures, to allow for speculative checking by other levels of cache and for lower latency if the queues are empty. The inventive cache allows at least four accesses to be processed simultaneously, and the results of an access can be sent to multiple consumers. The multiported nature of the inventive cache provides very high bandwidth through the cache with low latency.
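A minimal C sketch of the bypass path, with invented names: data and instruction accesses use separate queues, and an access whose queue is empty goes straight to the tag array for lower latency. The stub tag_lookup stands in for the real tag probe.

```c
#include <stdint.h>
#include <stdbool.h>

#define QLEN 8

typedef struct { uint32_t addr[QLEN]; int head, count; } queue_t;

/* Stand-in for the real tag array probe. */
static bool tag_lookup(uint32_t addr) { (void)addr; return true; }

/* Route one access: bypass the queue entirely when the relevant queue is
 * empty (lowest latency); otherwise enqueue behind older accesses of the
 * same type so ordering is preserved. */
bool route_access(queue_t *data_q, queue_t *inst_q, uint32_t addr, bool is_inst)
{
    queue_t *q = is_inst ? inst_q : data_q;

    if (q->count == 0)
        return tag_lookup(addr);              /* bypass around the queue */

    q->addr[(q->head + q->count) % QLEN] = addr;
    q->count++;
    return false;                             /* completes when it issues */
}
```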
Abstract:
A method and apparatus consolidate ports on a unified cache. The apparatus uses a plurality of access connections with a single port of a memory. The apparatus comprises a multiplexor and a logic circuit. The multiplexor is connected to the plurality of access connections and has a control input and a memory connection. The logic circuit produces an output signal tied to the control input. In another form, the apparatus comprises means for selectively coupling a single one of the plurality of access connections to the memory, and means for controlling the means for coupling. Preferably, the plurality of access connections comprises a data connection and an instruction connection, and the memory is a cache memory. The method uses a single memory access connection for a plurality of access types. The method accepts one or more memory access requests on one or more respective ones of a plurality of connections. If memory access requests are simultaneously active on two or more of the plurality of connections, the method selects one of the simultaneously active connections and connects the selected connection to the single memory access connection.
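The multiplexor-plus-logic arrangement can be sketched as follows in C; the priority rule (data wins a simultaneous pair) is an illustrative assumption, not taken from the source.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool active; uint32_t addr; } conn_t;

/* The "logic circuit": derive the mux control input from the request
 * lines.  Here data is given priority over instructions (an assumption). */
static bool select_inst(const conn_t *data, const conn_t *inst)
{
    return inst->active && !data->active;
}

/* The "multiplexor": couple exactly one of the access connections to the
 * memory's single port each cycle.  Returns false if the port is idle. */
bool drive_port(const conn_t *data, const conn_t *inst, uint32_t *port_addr)
{
    if (!data->active && !inst->active)
        return false;
    *port_addr = select_inst(data, inst) ? inst->addr : data->addr;
    return true;
}
```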
Abstract:
A multi-level cache structure and associated method of operating the cache structure are disclosed. The cache structure uses a queue for holding address information for a plurality of memory access requests as a plurality of entries. The queue includes issuing logic for determining which entries should be issued. The issuing logic further comprises find first logic for determining which entries meet predetermined criteria and selecting a plurality of those entries as issuing entries. The issuing logic also comprises lost logic that delays the issuing of a selected entry for a predetermined time period based upon delay criteria. The delay criteria may, for example, comprise a conflict between issuing resources, such as ports. Thus, when the resource needed by an issuing entry is oversubscribed, the issuing of that entry may be delayed for a predetermined time period (e.g., one clock cycle) to allow the resource conflict to clear.
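A minimal C sketch of find-first issue with a one-clock delay on a port conflict; the fields and the single-cycle delay value are illustrative assumptions.

```c
#include <stdbool.h>

#define QLEN  8
#define PORTS 2

typedef struct {
    bool valid, ready;
    int  port;    /* issuing resource this entry needs */
    int  delay;   /* clocks to wait after a conflict   */
} entry_t;

/* One clock of issue: scan in find-first order; an entry whose port is
 * already claimed this clock is not dropped but delayed one clock so the
 * resource conflict can clear.  Returns the number of entries issued. */
int issue(entry_t q[QLEN])
{
    bool port_busy[PORTS] = { false };
    int issued = 0;

    for (int i = 0; i < QLEN; i++) {
        entry_t *e = &q[i];
        if (!e->valid || !e->ready)
            continue;
        if (e->delay > 0) { e->delay--; continue; }
        if (port_busy[e->port]) {
            e->delay = 1;              /* oversubscribed: retry next clock */
            continue;
        }
        port_busy[e->port] = true;
        e->valid = false;              /* entry issues */
        issued++;
    }
    return issued;
}
```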
Abstract:
A simplified semaphore method and apparatus for simultaneous execution of multiple semaphore instructions and for enforcement of necessary ordering. A central processing unit having an instruction pipeline is coupled with a data cache arrangement including a semaphore buffer, a data cache, and a semaphore execution unit. An initial semaphore instruction having one or more operands, along with a semaphore address, is transmitted from the instruction pipeline to the semaphore buffer, and in turn from the semaphore buffer to the semaphore execution unit. The semaphore address of the initial semaphore instruction is transmitted from the instruction pipeline to the data cache to retrieve initial semaphore data stored within the data cache, at a location in a data line of the data cache identified by the semaphore address. The semaphore instruction is executed within the semaphore execution unit by operating upon the initial semaphore data and the one or more semaphore operands so as to produce processed semaphore data, which is then stored within the data cache. Since the semaphore buffer provides entries for multiple semaphore instructions, the semaphore buffer initiates simultaneous execution of multiple semaphore instructions as needed.
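A minimal C sketch of the flow described above, with invented names and a fetch-and-add standing in for the semaphore operation: the buffer holds pending semaphore instructions, the cache supplies the initial semaphore data, and the execution unit produces and stores the processed data.

```c
#include <stdint.h>

#define BUF 4

typedef struct { uint32_t addr; uint32_t operand; } sem_op_t;
typedef struct { sem_op_t op[BUF]; int count; } sem_buffer_t;

/* Stand-ins for the data cache's read and write ports. */
static uint32_t cache_mem[1024];
static uint32_t cache_read(uint32_t addr)             { return cache_mem[addr % 1024]; }
static void cache_write(uint32_t addr, uint32_t v)    { cache_mem[addr % 1024] = v; }

/* Execute every buffered semaphore instruction: fetch the initial
 * semaphore data at the semaphore address, operate on it with the operand
 * (fetch-and-add here), and store the processed data back in the cache. */
void drain_semaphore_buffer(sem_buffer_t *b)
{
    for (int i = 0; i < b->count; i++) {
        uint32_t initial   = cache_read(b->op[i].addr);
        uint32_t processed = initial + b->op[i].operand;
        cache_write(b->op[i].addr, processed);
    }
    b->count = 0;
}
```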
Abstract:
A method of generating an issue pointer for issuing data structures from a queue, comprising generating a signal that indicates where, within the queue, the data structures that desire to issue are located; checking the signal at the queue location pointed to by the issue pointer; and then, if the issue pointer is pointing to the location that issued on the previous queue issue, incrementing the position of the issue pointer if a data structure has not shifted into that queue location since the previous issue, or holding the issue pointer position if a data structure has shifted into that location since the previous issue.
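A minimal C sketch of the pointer-update rule, with invented state: request[] is the signal marking locations that desire to issue, and after an issue the pointer advances only if no entry has shifted into the pointed-to location.

```c
#include <stdbool.h>

#define QLEN 8

typedef struct {
    bool request[QLEN];   /* signal: which locations desire to issue   */
    int  ptr;             /* issue pointer                             */
    int  last_issued;     /* location that issued on the previous issue */
} issue_state_t;

/* Issue from the pointed-to location if its request bit is set. */
bool try_issue(issue_state_t *s)
{
    if (!s->request[s->ptr])
        return false;
    s->request[s->ptr] = false;
    s->last_issued = s->ptr;
    return true;
}

/* Apply the increment-or-hold rule after a queue shift may have occurred. */
void update_pointer(issue_state_t *s, bool shifted_into_ptr)
{
    if (s->ptr != s->last_issued)
        return;                          /* rule applies only to that slot */
    if (!shifted_into_ptr)
        s->ptr = (s->ptr + 1) % QLEN;    /* advance past the issued slot */
    /* else hold: a newly shifted-in entry now occupies this location */
}
```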
Abstract:
The inventive cache manages address conflicts and maintains program order without using a store buffer. The cache utilizes an issue algorithm to ensure that accesses issued in the same clock are actually issued in an order that is consistent with program order. This is enabled by performing address comparisons prior to insertion of the accesses into the queue. Additionally, when accesses are separated by one or more clocks, address comparisons are performed, and accesses that would get data from the cache memory array before a prior update has actually updated the cache memory array are canceled. This guarantees that program order is maintained, as an access is not allowed to complete until it is assured that the most recent data will be received upon access of the array.
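A minimal C sketch of the pre-insertion address comparison, with invented types: a load that would read the array before an older store to the same line has updated it is flagged canceled so it retries later, preserving program order.

```c
#include <stdint.h>
#include <stdbool.h>

#define QLEN 8

typedef struct {
    uint32_t line;      /* line address */
    bool     is_store;
    bool     valid;
    bool     canceled;  /* retried after the older update completes */
} access_t;

/* Compare a new access against all older valid entries before inserting
 * it into the queue.  Returns false if the queue is full. */
bool insert(access_t q[QLEN], int n, access_t a)
{
    if (n >= QLEN)
        return false;
    for (int i = 0; i < n; i++) {
        /* A load that would read the array ahead of an older store to
         * the same line is canceled rather than given stale data. */
        if (q[i].valid && q[i].is_store && !a.is_store && q[i].line == a.line)
            a.canceled = true;
    }
    q[n] = a;
    return true;
}
```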