摘要:
A number of data misalignment detection circuits are provided to select ones of an execution unit and memory order interface components, of an out-of-order (OOO) execution computer system, this aids the buffering and fault generation circuits of the memory order interface components, to buffer and dispatch load and store operations depending on if the misalignments are detected and their nature, resulting in load and store operations of misaligned data against a memory subsystem of the OOO execution computer system are available, the data addressed by the load and store operations are of the following types: chunk split, cache split, or page split misaligned.
摘要:
Register identification preservation in a microprocessor implementing register renaming. Multiplexing and control circuitry are implemented for manipulating data sources to be supplied to a microprocessor's functional units. The circuitry will generate zero extending for source data to an execution unit where a data source register specified is shorter than a general register size utilized by the microprocessor. Similarly, the multiplexing and control circuitry will shift bits of data from one location to another upon a source input to a functional unit in accordance with control signals designating such activity.
摘要:
The data cache unit includes a separate fill buffer and a separate write-back buffer. The fill buffer stores one or more cache lines for transference into data cache banks of the data cache unit. The write-back buffer stores a single cache line evicted from the data cache banks prior to write-back to main memory. Circuitry is provided for transferring a cache line from the fill buffer into the data cache banks while simultaneously transferring a victim cache line from the data cache banks into the write-back buffer. Such allows the overall replace operation to be performed in only a single clock cycle. In a particular implementation, the data cache unit is employed within a microprocessor capable of speculative and out-of-order processing of memory instructions. Moreover, the microprocessor is incorporated within a multiprocessor computer system wherein each microprocessor is capable of snooping the cache lines of data cache units of each other microprocessor. The data cache unit is also a non-blocking cache.
摘要:
A method and apparatus for allocating a number of vacant entries of a buffer resource and generating a set of enable vectors based thereon for a set of issued instructions. A deallocation vector of a reservation station is searched in order to locate, within one clock cycle, the vacancies within the reservation station for storage of instruction information associated with several issued operations. Vacancies are indicated by bits of the deallocation vector. A general static and dynamic approach are disclosed for performing the vacant entry identification at high speed within a single clock cycle. Alternate embodiments are disclosed, based on the general approach, that divide the deallocation vector into separate portions (consecutive bits or interleaved) and process each portion based on the general approaches. Rotating priority reference points within the deallocation vector may be used to vary the starting point for vacancy location. Further, the vacancy search can be limited to finding only consecutive vacancies. A superscalar microprocessor using the above may, within one clock cycle, schedule a group of instructions from the instruction decoder to the reservation station for subsequent execution.
摘要:
An apparatus for protecting the data in a local register cache during calls and returns that cross subsystem boundaries. The left most bit of a shift register is set to a 1 if a "set boundary bit" instruction is detected. A subsequent PUSH instruction shifts the shift register right one bit. A POPTOS1 instruction in the instruction flow signifies an intra-subsystem return and causes the leftmost bit of the shift register to be checked to see if it is a zero. A protection fault is signalled upon the condition that the leftmost bit is not equal to zero. The shift register is shifted left one bit upon the condition that a POPSTOS 2 instruction is detected. A POPSUB1 instruction detected in the instruction flow signifies an inter-subsystem return and causes the leftmost bit of the shift register to be checked to see if it is a one. A a protection fault is signalled upon the condition that the leftmost bit is not equal to zero. The register is shifted left one bit upon the condition that a POPSTOS 2 instruction is detected in the instruction flow.
摘要:
A number of identical matching circuits are integrated into the store address buffer, one matching circuit to each buffer slot, for generating a number of match signals, one for each detected match, using at most the entire source address of an instruction being fetched and the corresponding portions of the store destination addresses of the buffered store instructions. Additionally, a stall signal generator complimentary to the store address buffer is provided for generating a single stall signal for the bus controller, using the match signals, thereby stalling an instruction fetch from a source address that is potentially a store destination of one of the buffered store instructions with minimal performance cost.
摘要:
An instruction fetch unit in which an early instruction fetch is initiated to access a main memory simultaneously with checking a cache for the desired instruction. On a slow path to main memory is a large main translation lookaside buffer (TLB) that holds address translations. On a fast path is a smaller translation write buffer (TWB), a mini-TLB, that holds recently used address translations. A guess fetch access in initiated by presenting an address to the main memory in parallel with presenting the address to the cache. The address is compared with the contents of the TWB for a hit and with the contents of the cache for a hit. The guess access is allowed to proceed upon the condition that there is a hit in the TWB (the TWB is able to translate the logical address into a physical address) and a miss in the I-cache (the data are not available in the I-cache and hence the guess access of main memory is necessary to get the data). The guess access is canceled upon the condition that there is either a miss in the TWB (the TWB is unable to translate the logical address into a physical address) or a hit in the I-cache (the data are available in the I-cache and hence the guess access of main memory is not necessary). In this case a fetch access is reissued on the "slow" path that goes through the large main TLB.
摘要:
A scbok line is connected to a register file and other units, such as an execution unit and a multiply/divide unit, in a data processing system. A mem scbok line is connected to the register file and other units, such as an instruction unit and a memory interface unit. Each unit connected to the scbok line can pull the line to indicate that it is busy. Each unit connected to the mem scbok line can pull the line to indicate that it is busy. The scbok line indicates, when asserted, that a unit or a register in the register file that is busy with a previous instruction is not available to an instruction for a register file operation. The mem scbok line indicates, when asserted, that a unit or a register in the register file that is busy with a previous instruction is not available to an instruction for a memory operation. Registers are checked concurrently with the issuing of an instruction. An instruction lacking any needed unit or a register is stopped in response to the asserted scbok line and reissued in the next cycle. Registers to be used by a multi-cycle instruction are marked busy for an instruction that is able to be executed. When a result for the multi-cycle instruction returns the registers previously marked busy are marked as not busy.
摘要:
A system includes a non-volatile random access memory (NVRAM) device and controller logic that detects a bad block within the device, retires the bad block and replaces the bad block with a replacement block by assigning the address of the bad block to the replacement block.
摘要:
Embodiments of the invention describe a system main memory comprising two levels of memory that include cached subsets of system disk level storage. This main memory includes “near memory” comprising memory made of volatile memory, and “far memory” comprising volatile or nonvolatile memory storage that is larger and slower than the near memory.The far memory is presented as “main memory” to the host OS while the near memory is a cache for the far memory that is transparent to the OS, thus appearing to the OS the same as prior art main memory solutions. The management of the two-level memory may be done by a combination of logic and modules executed via the host CPU. Near memory may be coupled to the host system CPU via high bandwidth, low latency means for efficient processing. Far memory may be coupled to the CPU via low bandwidth, high latency means.