摘要:
A central processor uses virtual addresses to access data via cache logic including a DAT and ART, and the cache logic accesses data in the hierarchical storage subsystem using absolute addresses to access data, a part of the first level of the cache memory includes a translator for virtual or real addresses to absolute addresses. When requests are sent for a data fetch and the requested data are not resident in the first level of cache the request for data is delayed and may be forwarded to a lower level of said hierarchical memory, and a delayed request may result in cancellation of any process during a delayed request that has the ability to send back an exception. A delayed request may be rescinded if the central processor has reached an interruptible stage in its pipeline logic at which point, a false exception is forced clearing all I the wait states while the central processor ignores the false exception. Forcing of an exception occurs during dynamic address translation (DAT) or during access register translation (ART). A request for data signal to the storage subsystem cancellation is settable by the first hierarchical level of cache logic. A false exception signal to the first level cache is settable by the storage subsystem logic.
摘要:
A method for handling cache coherency includes allocating a tag when a cache line is not exclusive in a data cache for a store operation, and sending the tag and an exclusive fetch for the line to coherency logic. An invalidation request is sent within a minimum amount of time to an I-cache, preferably only if it has fetched to the line and has not been invalidated since, which request includes an address to be invalidated, the tag, and an indicator specifying the line is for a PSC operation. The method further includes comparing the request address against stored addresses of prefetched instructions, and in response to a match, sending a match indicator and the tag to an LSU, within a maximum amount of time. The match indicator is timed, relative to exclusive data return, such that the LSU can discard prefetched instructions following execution of the store operation that stores to a line subject to an exclusive data return, and for which the match is indicated.
摘要:
A method, system, and computer program product for cross-invalidation handling in a multi-level private cache are provided. The system includes a processor. The processor includes a fetch address register logic in communication with a level 1 data cache, a level 1 instruction cache, a level 2 cache, and a higher level cache. The processor also includes a set of cross-invalidate snapshot counter implemented in the fetch address register. Each cross-invalidate snapshot counter tracks an amount of pending higher level cross-invalidations received before new data for the corresponding cache miss is returned from the higher-level cache. The processor also includes logic executing on the fetch address register for handling level 1 data cache misses and interfacing with the level 2 cache. In response to the new data, and upon determining that older cross-invalidations are pending, the new data is prevented from being used by the processor.
摘要:
A new method and apparatus have been taught for storing error information used for debugging as generated by the initial and subsequent error occurrences. In this invention, a register with several bit ranges is used to store error information. The first bit-range is allocated to the initial error information. If the total number of the errors exceeds the capacity of the register, the last error is kept in a last bit-range. This way, precious initial error information (as well as the last error information) will be available for debugging.
摘要:
Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wherein the operand forwarding is performed by executing the first source instruction together with the first dependent instruction; and wherein the result forwarding is performed by executing the second source instruction together with the second dependent instruction.
摘要:
A system and method for asynchronous dynamic millicode entry prediction in a processor are provided. The system includes a branch target buffer (BTB) to hold branch information. The branch information includes: a branch type indicating that the branch represents a millicode entry (mcentry) instruction targeting a millicode subroutine, and an instruction length code (ILC) associated with the mcentry instruction. The system also includes search logic to perform a method. The method includes locating a branch address in the BTB for the mcentry instruction targeting the millicode subroutine, and determining a return address to return from the millicode subroutine as a function of the an instruction address of the mcentry instruction and the ILC. The system further includes instruction fetch controls to fetch instructions of the millicode subroutine asynchronous to the search logic. The search logic may also operate asynchronous with respect to an instruction decode unit.
摘要:
A processor including a microarchitecture adapted for invalidating mapping of at least one logical address to at least one absolute address, includes: at least one translation lookaside buffer (TLB) and a plurality of copies thereof; logic for independent indexing of each copy of the TLB; a plurality of comparators, each comparator associated with a respective output of each TLB set output for each TLB port, wherein each of the comparators is adapted for identifying mappings for invalidation; and logic for invalidating each identified mapping. A method and a computer program product are provided.
摘要:
Data is accessed in a multi-level hierarchical memory system. A request for data is received, including a virtual address for accessing the data. A translation buffer is queried to obtain an absolute address corresponding to the virtual address. Responsive to the translation buffer not containing an absolute address corresponding to the virtual address, the absolute address is obtained from a translation unit. A directory look-up is performed with the absolute address to determine whether a matching absolute address exists in the directory. A fetch request for the requested data is sent to a next level in the multi-level hierarchical memory system. Processing of the fetch request by the next level occurs in parallel with the directory lookup. The requested data is received in the primary cache to make the requested data available to be written to the primary cache.
摘要:
A method and system for instruction address parity comparison are provided. The method includes calculating an instruction address parity value for an instruction, and distributing the instruction address parity value to one or more functional units in processing circuitry. The method also includes receiving the distributed instruction address parity value from the one or more functional units, and calculating a completing instruction address (CIA) parity value associated with completing the instruction. The method further includes generating an error indicator in response to a mismatch between the received instruction address parity value and the CIA parity value.
摘要:
Embodiments of the invention include a method of synchronizing translation changes in a processor including a translation lookaside buffer, the method including setting a control bit to enable blocking of all fetch requests that miss the translation lookaside buffer without changing a translation state of the current process; if there is at least one pending translation, then waiting for completion of the at least one pending translation; and resetting the control bit. A processor and a computer program product are provided.