Abstract:
A novel on-chip cache memory and method of operation are provided which increase microprocessor performance. The cache design allows two cache requests to be processed simultaneously (dual-ported) and concurrent cache requests to be in flight (pipelined). The design allocates a first clock cycle to cache tag and data access and a second clock cycle to data manipulation. The memory array circuit design is simplified because the circuits are synchronized to the main processor clock and do not need self-timed circuits. The overall control logic is simplified because distinct cycles are allocated to distinct cache functions.
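To make the cycle split concrete, the following is a minimal C model, not the patented circuit: a first phase that only reads the tag and data arrays (called once per port), and a second phase that performs the hit check and data manipulation. The direct-mapped geometry and the array sizes are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

#define SETS 64
#define LINE 32

struct cache {
    uint32_t tag[SETS];
    uint8_t  data[SETS][LINE];
};

/* State latched between the two clock cycles. */
struct stage1_out {
    uint32_t tag;
    uint8_t  line[LINE];
    uint32_t addr;
};

/* Cycle 1: array access only. Calling this once per port models the
 * dual-ported behaviour; back-to-back calls model pipelined requests. */
static void cycle1_access(const struct cache *c, uint32_t addr,
                          struct stage1_out *out)
{
    uint32_t set = (addr / LINE) % SETS;
    out->tag  = c->tag[set];
    memcpy(out->line, c->data[set], LINE);
    out->addr = addr;
}

/* Cycle 2: data manipulation -- hit check and word selection. */
static int cycle2_manipulate(const struct stage1_out *s, uint32_t *word)
{
    if (s->tag != s->addr / (LINE * SETS))
        return 0;                                    /* miss */
    uint32_t off = (s->addr % LINE) & ~(uint32_t)3;  /* word-align within line */
    memcpy(word, &s->line[off], sizeof *word);       /* select/align the data */
    return 1;                                        /* hit */
}
```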
Abstract:
A fully-associative translation lookaside buffer structure for a computer system includes a first-level TLB0 memory having a plurality of entries and a second-level TLB1 memory operatively coupled to the first-level TLB0 memory. The second-level TLB1 memory also has a plurality of entries. Entries are placed in the TLB0 and TLB1 structures as a result of software-controlled translation register operations and hardware-controlled translation cache operations. The logic controlling TLB0 treats both operations the same way and uses a hardware replacement algorithm to determine the entry index. The logic controlling TLB1 uses a hardware replacement algorithm to determine the entry index for translation cache entries, and uses an index provided within the insertion instruction to determine the entry index for translation register operations.
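A rough C sketch of the two insertion paths may help. The entry counts, the round-robin replacement stand-in, and the assumption that every insert is written into both levels are illustrative choices, not the patented implementation.

```c
#include <stdint.h>

#define TLB0_ENTRIES 16
#define TLB1_ENTRIES 128

struct tlb_entry { uint64_t vpn, ppn; int valid; };

static struct tlb_entry tlb0[TLB0_ENTRIES];
static struct tlb_entry tlb1[TLB1_ENTRIES];
static unsigned tlb0_victim, tlb1_victim;   /* stand-in replacement state */

/* "Hardware replacement algorithm": modelled here as round-robin. */
static unsigned replace_index(unsigned *victim, unsigned n)
{
    unsigned i = *victim;
    *victim = (*victim + 1) % n;
    return i;
}

void tlb_insert(struct tlb_entry e, int is_translation_register,
                unsigned tr_index /* index field of the insert instruction */)
{
    /* TLB0 treats TR and TC inserts identically: hardware picks the slot. */
    tlb0[replace_index(&tlb0_victim, TLB0_ENTRIES)] = e;

    /* TLB1 honours the instruction-supplied index only for TR inserts. */
    unsigned i1 = is_translation_register
                ? tr_index % TLB1_ENTRIES
                : replace_index(&tlb1_victim, TLB1_ENTRIES);
    tlb1[i1] = e;
}
```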
Abstract:
The inventive control logic provides the selection signals for a bi-endian rotator MUX. The logic determines the starting point for the data transfer by determining which input register byte is going to Byte 0 of the output register. The control logic passes the starting point to a single decoder. The decoded value is then sent to a plurality of MUXs, one for each of the output register bytes. Each of the MUXs is prewired to receive a portion of the bits of the decoded value, with the portion arranged in a particular order. The MUXs then send their respective outputs to the rotator MUX as selection control signals.
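The select-signal generation can be sketched in C as follows. An 8-byte register width is assumed, the "prewiring" is modelled as the one-hot decode rotated by a fixed per-MUX amount, and only the rotation itself is shown, not the endianness-dependent choice of the starting point.

```c
#include <stdint.h>

#define NBYTES 8

/* One-hot decode of the starting byte: bit s is set. */
static uint8_t decode_start(unsigned s) { return (uint8_t)(1u << s); }

/* Prewired rotation: output byte j receives the decode rotated left by j,
 * so exactly one select bit is active per output-byte MUX. */
static uint8_t mux_select(uint8_t decoded, unsigned j)
{
    return (uint8_t)(((decoded << j) | (decoded >> (NBYTES - j))) & 0xFF);
}

/* The rotator: output byte j takes the input byte whose select bit is
 * active in that MUX's control word, so the starting byte lands in Byte 0. */
static uint64_t rotate(uint64_t in, unsigned start)
{
    start %= NBYTES;
    uint8_t decoded = decode_start(start);
    uint64_t out = 0;
    for (unsigned j = 0; j < NBYTES; j++) {
        uint8_t sel = mux_select(decoded, j);
        for (unsigned k = 0; k < NBYTES; k++)
            if (sel & (1u << k))
                out |= ((in >> (8 * k)) & 0xFF) << (8 * j);
    }
    return out;
}
```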
Abstract:
The inventive adder can perform carry look-ahead calculations for a bi-endian adder in a cache memory system. The adder can add one of +/−1, 4, 8, or 16 to a value loaded from memory, and the operation can be a 4- or 8-byte add. The inventive adder comprises a plurality of byte adder cells and carry look-ahead (CLA) logic. The adder cells determine which of them is the least significant byte (LSB) adder cell. The LSB cell then adds one of the increment values to its loaded value. The other cells add 0x00 or 0xFF, depending upon the sign of the increment value, to their loaded values from memory. Each adder cell performs two adds, one for a carry-in of 0 and the other for a carry-in of 1. Both results are sent to a MUX. The CLA logic determines each of the carries and provides a selection control signal to the MUX of each of the cells.
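A behavioural C sketch of the scheme is shown below. It models the carry-select idea (two sums per byte cell, with the carry choosing between them), but for brevity it ripples the carries sequentially instead of computing them with parallel look-ahead logic, fixes byte 0 as the LSB cell, and ignores the bi-endian byte ordering.

```c
#include <stdint.h>

#define NBYTES 8

uint64_t cla_increment(uint64_t loaded, int increment /* +/-1, 4, 8, or 16 */)
{
    uint8_t sum0[NBYTES], sum1[NBYTES];           /* results for carry-in 0 / 1 */
    int     cout0[NBYTES], cout1[NBYTES];
    uint8_t fill = increment < 0 ? 0xFF : 0x00;   /* sign-extension byte */

    for (int i = 0; i < NBYTES; i++) {
        /* Byte 0 plays the LSB cell and adds the increment itself;
         * every other cell adds 0x00 or 0xFF. */
        uint8_t addend = (i == 0) ? (uint8_t)increment : fill;
        uint8_t b = (uint8_t)(loaded >> (8 * i));
        unsigned r0 = b + addend;                 /* add with carry-in = 0 */
        unsigned r1 = b + addend + 1;             /* add with carry-in = 1 */
        sum0[i] = (uint8_t)r0; cout0[i] = r0 > 0xFF;
        sum1[i] = (uint8_t)r1; cout1[i] = r1 > 0xFF;
    }

    /* Carry selection: each carry acts as the select of that cell's MUX.
     * (Done as a sequential ripple here; the real logic is look-ahead.) */
    uint64_t result = 0;
    int carry = 0;                                /* carry into the LSB cell */
    for (int i = 0; i < NBYTES; i++) {
        result |= (uint64_t)(carry ? sum1[i] : sum0[i]) << (8 * i);
        carry = carry ? cout1[i] : cout0[i];
    }
    return result;
}
```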
Abstract:
An uncorrectable error is detected in the data of a computer system. The erroneous data is allowed to be stored in first and second caches of the computer system while the system runs first and second processes, the first process being associated with the data. The first process is terminated when an attempt is made to load the data from the cache. Meanwhile, the second process continues to run.
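One way to picture the containment policy is the sketch below. It is an illustration of the idea rather than the patented mechanism: the erroneous data is kept but marked poisoned, and only the process that eventually loads it is terminated, while other processes continue.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct cache_line { unsigned data; bool poisoned; };

/* Store path: the data is allowed into the cache, but its error is remembered. */
void store_with_error(struct cache_line *l, unsigned data, bool uncorrectable)
{
    l->data = data;
    l->poisoned = uncorrectable;
}

/* Load path: only the process that consumes the poisoned data is terminated. */
unsigned load(const struct cache_line *l, const char *process_name)
{
    if (l->poisoned) {
        fprintf(stderr, "terminating %s: loaded poisoned data\n", process_name);
        exit(EXIT_FAILURE);     /* the second process keeps running elsewhere */
    }
    return l->data;
}
```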
Abstract:
A method and apparatus for storing and executing an instruction to load two independent registers with two values are disclosed. In one embodiment, a computer-readable medium is encoded with an instruction including an opcode field specifying that the instruction is an instruction to load two independent registers with a first value and a second value, a source field specifying the first value and the second value, a first target register field specifying a first target register to load with the first value, and a second target register field specifying a second target register to load with the second value. A system to execute the instruction is also disclosed.
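A hypothetical encoding and its decode and execute steps are sketched below. The field widths and bit positions are invented for illustration and are not taken from the disclosure.

```c
#include <stdint.h>

struct ld2_fields {
    unsigned opcode;    /* identifies the load-two-registers instruction */
    unsigned src;       /* source field for the first and second values */
    unsigned target1;   /* register loaded with the first value */
    unsigned target2;   /* register loaded with the second value */
};

/* Assumed layout: opcode[31:24] | src[23:14] | target1[13:7] | target2[6:0]. */
struct ld2_fields decode_ld2(uint32_t insn)
{
    struct ld2_fields f;
    f.opcode  = (insn >> 24) & 0xFF;
    f.src     = (insn >> 14) & 0x3FF;
    f.target1 = (insn >> 7)  & 0x7F;
    f.target2 =  insn        & 0x7F;
    return f;
}

/* Execution sketch: both independent target registers are written by the
 * one instruction. */
void execute_ld2(uint64_t regs[], struct ld2_fields f,
                 uint64_t first_value, uint64_t second_value)
{
    regs[f.target1] = first_value;
    regs[f.target2] = second_value;
}
```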
Abstract:
A novel on-chip cache memory and method of operation are provided which increase microprocessor performance. The on-chip cache memory has two levels. The first level is optimized for low latency and the second level is optimized for capacity. Both levels of cache are pipelined and can support simultaneous dual-port accesses. A queuing structure is provided between the first and second levels of cache which is used to decouple the faster first-level cache from the slower second-level cache. The queuing structure is also dual ported. Both levels of cache support non-blocking behavior. When there is a cache miss at one level of cache, both caches can continue to process other cache hits and misses. The first-level cache is optimized for integer data. The second-level cache can store any data type including floating point. The novel two-level cache system of the present invention provides high performance with an emphasis on throughput.
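The decoupling queue can be pictured with the C sketch below. The queue depth, the FIFO discipline, and the two-requests-per-cycle enqueue are assumptions used only to illustrate how the faster first level keeps servicing requests while the slower second level drains misses at its own rate.

```c
#include <stdint.h>
#include <stdbool.h>

#define QDEPTH 8

struct miss_request { uint64_t addr; unsigned dest_reg; };

struct l1_l2_queue {
    struct miss_request q[QDEPTH];
    unsigned head, tail, count;
};

/* Dual-ported enqueue: up to two first-level misses accepted per cycle. */
bool enqueue_misses(struct l1_l2_queue *q,
                    const struct miss_request *a,
                    const struct miss_request *b)
{
    unsigned need = (a ? 1u : 0u) + (b ? 1u : 0u);
    if (q->count + need > QDEPTH)
        return false;                     /* queue full: retry later */
    if (a) { q->q[q->tail] = *a; q->tail = (q->tail + 1) % QDEPTH; q->count++; }
    if (b) { q->q[q->tail] = *b; q->tail = (q->tail + 1) % QDEPTH; q->count++; }
    return true;
}

/* The second level drains the queue at its own rate; the first level is
 * not blocked and can keep processing other hits and misses meanwhile. */
bool dequeue_miss(struct l1_l2_queue *q, struct miss_request *out)
{
    if (q->count == 0)
        return false;
    *out = q->q[q->head];
    q->head = (q->head + 1) % QDEPTH;
    q->count--;
    return true;
}
```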
Abstract:
An apparatus is described comprising a signal indicative of which of a plurality of data structures stored in a queue desire to issue from the queue. The apparatus also has a content addressable memory having a plurality of cells, where each of the cells is configured to store one of the data structures. The apparatus also has an output from at least one of the cells that is indicative of whether the data structure within the at least one of the cells has issued from the queue. The apparatus also has an input to the at least one of the cells coupled to the signal.
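A behavioural sketch of such a cell array is given below. The cell count, the field names, and the lowest-index-first issue choice are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCELLS 16

struct cam_cell {
    uint64_t entry;      /* the stored data structure */
    bool     valid;
    bool     request;    /* input: this entry desires to issue */
    bool     issued;     /* output: this entry has issued from the queue */
};

/* Build the signal indicating which stored entries desire to issue. */
uint32_t request_vector(const struct cam_cell c[NCELLS])
{
    uint32_t v = 0;
    for (unsigned i = 0; i < NCELLS; i++)
        if (c[i].valid && c[i].request && !c[i].issued)
            v |= 1u << i;
    return v;
}

/* Issue one requesting entry (lowest index here) and feed the result back
 * into that cell so its output reflects that it has issued. */
int issue_one(struct cam_cell c[NCELLS])
{
    uint32_t v = request_vector(c);
    for (unsigned i = 0; i < NCELLS; i++)
        if (v & (1u << i)) { c[i].issued = true; return (int)i; }
    return -1;   /* nothing desired to issue */
}
```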
Abstract:
A system and method are disclosed which provide a cache structure that allows early access to the cache structure's data. A cache design is disclosed that, in response to receiving a memory access request, begins an access to a cache level's data before a determination has been made as to whether a true hit has been achieved for such cache level. That is, a cache design is disclosed that enables cache data to be speculatively accessed before a determination is made as to whether a memory address required to satisfy a received memory access request is truly present in the cache. In a preferred embodiment, the cache is implemented to make a determination as to whether a memory address required to satisfy a received memory access request is truly present in the cache structure (i.e., whether a “true” cache hit is achieved). However, such a determination is not made before the cache data begins to be accessed. Rather, in a preferred embodiment, the determination of whether a true cache hit is achieved in the cache structure is performed in parallel with the access of the cache structure's data. Therefore, a preferred embodiment implements a parallel path by beginning the cache data access while the determination is being made as to whether a true cache hit has been achieved. Thus, the cache data is retrieved early from the cache structure and is available in a timely manner for use by a requesting execution unit.
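The parallel-path idea can be sketched in C as follows. The geometry is invented, and the two paths are written one after the other only because C is sequential; in the cache they proceed during the same cycle.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define SETS 128
#define LINE 64

struct cache {
    uint64_t tag[SETS];
    uint8_t  data[SETS][LINE];
};

bool speculative_load(const struct cache *c, uint64_t addr, uint32_t *out)
{
    unsigned set = (addr / LINE) % SETS;

    /* Path 1: start the data access immediately (speculative). */
    uint32_t speculative;
    uint64_t off = (addr % LINE) & ~(uint64_t)3;      /* word-align in line */
    memcpy(&speculative, &c->data[set][off], sizeof speculative);

    /* Path 2: in parallel, decide whether this is a true cache hit. */
    bool true_hit = (c->tag[set] == addr / (LINE * SETS));

    if (true_hit)
        *out = speculative;   /* the early data is already available */
    return true_hit;          /* on a miss the speculative data is dropped */
}
```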
Abstract:
An apparatus in a computer, called a pending access queue, for providing data for register load instructions after a cache miss. After a cache miss, when data is available for a register load instruction, the data is first directed to the pending access queue and is provided to an execution pipeline directly from the pending access queue, without requiring the data to be entered in the cache. Entries in the pending access queue include destination register identification, enabling injection of the data into the pipeline during intermediate pipeline phases. The pending access queue provides results to the requesting unit in any order needed, supporting out-of-order cache returns, and provides for arbitration when multiple sources have data ready to be processed. Each separate request to a single line is provided a separate entry, and each entry is provided with its appropriate part of the line as soon as the line is available, thereby rapidly providing data for multiple misses to a single line. The pending access queue may optionally include observability bits, enabling pending releases to complete execution before associated awaited data is present within the pending access queue. The pending access queue may optionally be used to include data to be stored by store instructions that have resulted in a cache miss.
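A sketch of a pending-access-queue entry and of the fill path is given below. Field names and sizes are assumptions, and arbitration, observability bits, and the store-miss case are omitted.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define PAQ_ENTRIES 8
#define LINE 64

struct paq_entry {
    bool     valid;
    bool     ready;          /* data present, waiting to issue to the pipeline */
    uint64_t line_addr;      /* cache line being awaited */
    unsigned offset, size;   /* the part of the line this request needs
                                (offset + size assumed to fit the buffer) */
    unsigned dest_reg;       /* destination register for pipeline injection */
    uint8_t  data[16];
};

static struct paq_entry paq[PAQ_ENTRIES];

/* A returning line satisfies every entry waiting on it, so multiple misses
 * to one line are filled at once, in whatever order lines come back, and
 * without the data having to pass through the cache first. */
void paq_fill(uint64_t line_addr, const uint8_t line[LINE])
{
    for (int i = 0; i < PAQ_ENTRIES; i++) {
        struct paq_entry *e = &paq[i];
        if (e->valid && !e->ready && e->line_addr == line_addr) {
            memcpy(e->data, line + e->offset, e->size);
            e->ready = true;   /* eligible to arbitrate for the pipeline,
                                  carrying dest_reg with it */
        }
    }
}
```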