Abstract:
The inventive cache uses a queuing structure that provides out-of-order cache memory access support for multiple accesses, as well as support for managing bank conflicts and address conflicts. The inventive cache can support four data accesses that hit per clock, one access that misses the L1 cache every clock, and one instruction access every clock. The responses are interspersed in the pipeline so that conflicts in the queue are minimized. Non-conflicting accesses are not inhibited; conflicting accesses, however, are held up until the conflict clears. The inventive cache provides out-of-order support after the retirement stage of a pipeline.
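The bank-conflict behavior described above (non-conflicting accesses proceed, conflicting ones are held) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bank count, the address-to-bank mapping, and the names `bank_of` and `issue_one_clock` are hypothetical and do not come from the abstract, and address-conflict ordering is ignored here.

```python
NUM_BANKS = 16   # assumed bank count; the abstract does not give one
MAX_ISSUE = 4    # the abstract states up to four hitting data accesses per clock


def bank_of(address, word_bytes=8):
    """Hypothetical bank mapping: the address bits just above the word offset."""
    return (address // word_bytes) % NUM_BANKS


def issue_one_clock(queue):
    """Pick up to MAX_ISSUE queued accesses whose banks do not collide.

    Accesses that collide with an already selected bank stay in the queue
    (they are held up), while younger non-conflicting accesses behind them
    may still issue, which makes the issue order out of order.
    """
    busy_banks = set()
    issued, held = [], []
    for address in queue:
        bank = bank_of(address)
        if len(issued) < MAX_ISSUE and bank not in busy_banks:
            busy_banks.add(bank)
            issued.append(address)
        else:
            held.append(address)
    return issued, held
```

For example, with the queue `[0x00, 0x80, 0x08, 0x10, 0x18, 0x20]`, the access to `0x80` maps to the same assumed bank as `0x00` and is held, while the three accesses behind it issue in the same clock.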
Abstract:
The inventive cache manages address conflicts and maintains program order without using a store buffer. The cache utilizes an issue algorithm to ensure that accesses issued in the same clock are actually issued in an order that is consistent with program order. This is enabled by performing address comparisons prior to insertion of the accesses into the queue. Additionally, when accesses are separated by one or more clocks, address comparisons are performed, and accesses that would get data from the cache memory array before a prior update has actually updated the cache memory in the array are canceled. This provides a guarantee that program order is maintained, as an access is not allowed to complete until it is assured that the most recent data will be received upon access of the array.
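The cancellation rule for accesses separated by one or more clocks can be illustrated with a small Python sketch. It assumes a fixed store-to-array update latency (`STORE_UPDATE_LATENCY`) and a simple `Access` record; both are hypothetical and chosen only to make the address comparison concrete, not taken from the abstract.

```python
from dataclasses import dataclass

STORE_UPDATE_LATENCY = 3  # assumed clocks between a store issuing and the array update


@dataclass
class Access:
    kind: str         # "load" or "store"
    address: int
    issue_clock: int


def must_cancel(later, earlier_accesses):
    """Return True if `later` would read stale data from the array.

    `earlier_accesses` are assumed to precede `later` in program order.
    A load is canceled when an earlier store to the same address has not
    yet updated the array at the clock in which the load would read it.
    """
    if later.kind != "load":
        return False
    for earlier in earlier_accesses:
        if earlier.kind != "store" or earlier.address != later.address:
            continue
        update_clock = earlier.issue_clock + STORE_UPDATE_LATENCY
        if later.issue_clock < update_clock:
            return True
    return False
```

Under these assumptions, a load issued one clock after a store to the same address is canceled, while the same load issued three or more clocks later is allowed to complete.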
Abstract:
The inventive cache processes multiple access requests simultaneously by using separate queuing structures for data and instructions. The inventive cache uses ordering mechanisms that guarantee program order when there are address conflicts and architectural ordering requirements. The queuing structures are snoopable by other processors of a multiprocessor system. The inventive cache has a tag access bypass around the queuing structures, to allow for speculative checking by other levels of cache and for lower latency if the queues are empty. The inventive cache allows for at least four accesses to be processed simultaneously. The results of the access can be sent to multiple consumers. The multiported nature of the inventive cache allows for a very high bandwidth to be processed through this cache with a low latency.
Abstract:
A system and method are disclosed which provide a cache structure that allows early access to the cache structure's data. A cache design is disclosed that, in response to receiving a memory access request, begins an access to a cache level's data before a determination has been made as to whether a true hit has been achieved for such cache level. That is, a cache design is disclosed that enables cache data to be speculatively accessed before a determination is made as to whether a memory address required to satisfy a received memory access request is truly present in the cache. In a preferred embodiment, the cache is implemented to make a determination as to whether a memory address required to satisfy a received memory access request is truly present in the cache structure (i.e., whether a “true” cache hit is achieved). However, such a determination is not made before the cache data begins to be accessed. Rather, in a preferred embodiment, a determination of whether a true cache hit is achieved in the cache structure is performed in parallel with the access of the cache structure's data. Therefore, a preferred embodiment implements a parallel path by beginning the cache data access while a determination is being made as to whether a true cache hit has been achieved. Thus, the cache data is retrieved early from the cache structure and is available in a timely manner for use by a requesting execution unit.
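The idea of starting the data access without waiting for the hit determination can be sketched as two functions that each depend only on the address, so that neither needs the other's result. This is a software analogy, not hardware: Python runs the two paths one after the other, and the direct-mapped organization, `NUM_SETS`, and all function names are assumptions made for illustration.

```python
NUM_SETS = 4  # assumed size of a small direct-mapped cache


def read_data_array(data_array, address):
    """Data path: start the data read using only the index bits."""
    return data_array[address % NUM_SETS]


def check_tag(tag_array, address):
    """Tag path: decide whether the access is a true hit."""
    return tag_array[address % NUM_SETS] == address // NUM_SETS


def access(tag_array, data_array, address):
    """Run both paths independently, then qualify the data with the hit result."""
    speculative_data = read_data_array(data_array, address)  # does not wait for the tag
    true_hit = check_tag(tag_array, address)                 # independent of the data read
    return (speculative_data if true_hit else None), true_hit
```

Because `read_data_array` never consults the tag result, the data is already available when the hit determination completes; on a miss, the speculatively read data is simply discarded.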
Abstract:
A multi-level cache structure and associated method of operating the cache structure are disclosed. The cache structure uses a queue for holding address information for a plurality of memory access requests as a plurality of entries. The queue includes issuing logic for determining which entries should be issued. The issuing logic further comprises find first logic for determining which entries meet predetermined criteria and selecting a plurality of those entries as issuing entries. The issuing logic also comprises lost logic that delays the issuing of a selected entry for a predetermined time period based upon a delay criterion. The delay criterion may, for example, comprise a conflict between issuing resources, such as ports. Thus, in response to an issuing entry being oversubscribed, the issuing of such entry may be delayed for a predetermined time period (e.g., one clock cycle) to allow the resource conflict to clear.
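The split between find first logic and lost logic can be illustrated with the following Python sketch. The entry layout (a `ready` flag and a required `port`), the per-port capacities, and the function names are hypothetical; the abstract only states that selected entries may be delayed when an issuing resource such as a port is in conflict.

```python
def select_issuing(entries, max_issue=4):
    """Find first: indices of the oldest `max_issue` entries that are ready."""
    return [i for i, entry in enumerate(entries) if entry["ready"]][:max_issue]


def apply_lost_logic(entries, selected, ports_available):
    """Split the selected indices into (issued, delayed).

    An entry is delayed (it "loses") when every port of the kind it needs
    has already been claimed by an older selected entry in this clock.
    """
    remaining = dict(ports_available)
    issued, delayed = [], []
    for i in selected:
        port = entries[i]["port"]
        if remaining.get(port, 0) > 0:
            remaining[port] -= 1
            issued.append(i)
        else:
            delayed.append(i)
    return issued, delayed
```

With one store port and two selected stores, the older store issues and the younger one is delayed, for example by one clock, after which it can be selected again.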
Abstract:
The present invention provides a precharge circuit that has a first precharged node, a second precharged node, and a latch device. The first precharged node is charged to a high value during a precharge state. In response to a transition from the precharge state to an evaluate state, it either discharges to a low value or remains charged at its high value. The second precharged node has a value in the evaluate state that is based on the value of the first precharged node upon the circuit transitioning to the evaluate state. The latch device is connected to the second precharged node for latching this value in the evaluate state. With the latching device, this value is not affected by the first precharged node once the circuit has sufficiently transitioned to the evaluate state.
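A purely behavioral Python model can make the sequence of states concrete. It is a sketch under assumptions: the polarity of the second node (here it simply follows the first node's value at the transition), the class name `PrechargeCircuit`, and the `disturb_first_node` helper are all hypothetical, and no electrical timing is modeled.

```python
class PrechargeCircuit:
    """Behavioral model: two precharged nodes and a latch on the second node."""

    def __init__(self):
        self.precharge()

    def precharge(self):
        """Precharge state: both nodes are charged high (assumed for node two)."""
        self.state = "precharge"
        self.first_node = 1
        self.second_node = 1

    def evaluate(self, pull_down_active):
        """Transition to evaluate: the first node discharges or stays high.

        The second node takes its value from the first node at this
        transition, and the latch then holds it (polarity is an assumption).
        """
        self.state = "evaluate"
        self.first_node = 0 if pull_down_active else 1
        self.second_node = self.first_node

    def disturb_first_node(self, value):
        """Change the first node later in evaluate; the latched value is unaffected."""
        if self.state == "evaluate":
            self.first_node = value
```

Calling `evaluate(True)` and then `disturb_first_node(1)` leaves `second_node` at 0, which mirrors the statement that the latched value is not affected by the first node once the circuit has transitioned to the evaluate state.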
Abstract:
A system and method are provided which enable a data carrier, such as a BIT line, to be held to a desired value while performing a memory access (e.g., a read or write operation) of SRAM in an efficient manner. In a preferred embodiment, cross-coupled PFETs are implemented to hold the BIT line to a desired value during a memory access of SRAM. As a result, a preferred embodiment enables a BIT line to transition from a high voltage value to a low voltage value free from conflict. That is, in a preferred embodiment, a holder PFET is not attempting to hold the BIT line high while the SRAM or an outside source (e.g., a “writing source”) is attempting to drive the BIT line to a low voltage value. Also, in a preferred embodiment, the BIT and NBIT lines (i.e., a complementary data carrier) can be driven to “true” low and “true” high voltage values. Accordingly, in a preferred embodiment, complex circuitry, such as a sense amp, is not required to detect whether a value on the lines is a logic 0 or logic 1. Therefore, a preferred embodiment enables memory access requests (e.g., read and write operations) to be serviced in a more timely manner than is achieved utilizing prior art implementations. Furthermore, a preferred embodiment consumes less power than prior art implementations. Moreover, a preferred embodiment utilizes fewer components, and therefore consumes less surface area than prior art implementations.
Abstract:
The present invention integrates a WWTM circuit with the write driver circuitry, which is an inherent part of any conventional SRAM design. Thus, a circuit for writing data into and weak write testing a memory cell is provided. In one embodiment, the circuit comprises a write driver that has an output for applying a write or a weak write output signal at the memory cell. The write driver has first and second selectable operating modes. In the first mode, the write driver is set to apply a weak write output signal from the output for performing a weak write test on the cell. In the second mode, the write driver is set to apply a normal write output signal that is sufficiently strong for writing a data value into the cell when it is healthy.
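The two operating modes of the write driver can be illustrated with a toy behavioral model in Python. The numeric drive and hold strengths are invented for illustration, as are the names `apply_write` and `passes_weak_write_test`; the abstract gives no values and does not describe the test sequence in this detail.

```python
NORMAL_DRIVE = 1.0  # assumed relative strength of a normal write
WEAK_DRIVE = 0.3    # assumed relative strength of a weak write


def apply_write(cell_value, cell_hold_strength, data, weak_mode):
    """Return the cell's value after the driver applies `data`.

    The write only succeeds when the selected drive strength exceeds the
    strength with which the cell holds its current value.
    """
    drive = WEAK_DRIVE if weak_mode else NORMAL_DRIVE
    return data if drive > cell_hold_strength else cell_value


def passes_weak_write_test(cell_hold_strength):
    """A cell passes if a weak write of the opposite value never flips it."""
    for initial in (0, 1):
        after = apply_write(initial, cell_hold_strength, 1 - initial, weak_mode=True)
        if after != initial:
            return False
    return True
```

In this model a cell with an assumed hold strength of 0.6 survives the weak write and still accepts a normal write, while a cell with a hold strength of 0.1 is flipped by the weak write and is flagged as defective.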
Abstract:
One disclosed embodiment may comprise a design method for a dynamic circuit system. The method may include providing a design for a single stage network comprising a pull-down network that is configured to perform a desired logic function according to a plurality of inputs. The method may also include designing a multi-stage network that includes at least two stages, each of the at least two stages including a pull-down network that receives a respective portion of the plurality of inputs and each of the at least two stages cooperating to perform the desired logic function.
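As a hedged illustration of the design step, the sketch below takes a wide OR as the assumed logic function (the abstract does not name one), splits its inputs across stages, and checks exhaustively that the multi-stage version computes the same function as the single-stage version. The function names and the even split of inputs are assumptions.

```python
from itertools import product


def single_stage(inputs):
    """Single pull-down network: conducts if any input is high (assumed wide OR)."""
    return any(inputs)


def multi_stage(inputs, num_stages=2):
    """Each stage evaluates its own portion of the inputs; results are combined."""
    chunk = -(-len(inputs) // num_stages)  # ceiling division
    stage_outputs = [
        any(inputs[i:i + chunk]) for i in range(0, len(inputs), chunk)
    ]
    return any(stage_outputs)


def designs_match(num_inputs=8, num_stages=2):
    """Exhaustively compare both designs over every input combination."""
    return all(
        single_stage(bits) == multi_stage(bits, num_stages)
        for bits in product((0, 1), repeat=num_inputs)
    )
```

An exhaustive check such as `designs_match(8, 2)` is practical only for small input counts, but it shows the property the method relies on: the stages cooperate to perform the same logic function as the original single-stage network.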