摘要:
Method and system for a multi-level virtual/real cache system with synonym resolution. An exemplary embodiment includes a multi-level cache hierarchy, including a set of L1 caches associated with one or more processor cores and a set of L2 caches, wherein the set of L1 caches are a subset of the set of L2 caches, wherein the set of L1 caches underneath a given L2 cache are associated with one or more of the processor cores.
摘要:
There is provided a method, system and computer program product for generating trace data related to a data processing system event. The method includes: receiving an instruction relating to the system event from a location in the system; determining a minimum number of trace segment records required to record instruction information; and creating a trace segment table including the number of trace segment records, the number of trace segment records including at least one instruction record.
摘要:
A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.
摘要:
A method for handling cache coherency includes allocating a tag when a cache line is not exclusive in a data cache for a store operation, and sending the tag and an exclusive fetch for the line to coherency logic. An invalidation request is sent within a minimum amount of time to an I-cache, preferably only if it has fetched to the line and has not been invalidated since, which request includes an address to be invalidated, the tag, and an indicator specifying the line is for a PSC operation. The method further includes comparing the request address against stored addresses of prefetched instructions, and in response to a match, sending a match indicator and the tag to an LSU, within a maximum amount of time. The match indicator is timed, relative to exclusive data return, such that the LSU can discard prefetched instructions following execution of the store operation that stores to a line subject to an exclusive data return, and for which the match is indicated.
摘要:
A method and system for implementing store buffer allocation for variable length store data operations are provided. The method includes receiving a store address request and at least one store data request and stepping through data operations for each of the store data requests and an address range for the store data requests to determine alignment and data steering information used to select a storage buffer destination for the data in the store data requests. The method further includes determining availability of the storage buffer by maintaining a reservation list for each storage buffer, maintaining a count of the number of available entries for each storage buffer, updating the reservation list to reflect a reservation acceptance for designated available entries, and clearing entries upon completion of the processing of store data operations. The method also includes reserving the selected storage buffer when the number of available entries meets or exceeds the number of entries required for the data.
摘要:
A central processor uses virtual addresses to access data via cache logic including a DAT and ART, and the cache logic accesses data in the hierarchical storage subsystem using absolute addresses to access data, a part of the first level of the cache memory includes a translator for virtual or real addresses to absolute addresses. When requests are sent for a data fetch and the requested data are not resident in the first level of cache the request for data is delayed and may be forwarded to a lower level of said hierarchical memory, and a delayed request may result in cancellation of any process during a delayed request that has the ability to send back an exception. A delayed request may be rescinded if the central processor has reached an interruptible stage in its pipeline logic at which point, a false exception is forced clearing all I the wait states while the central processor ignores the false exception. Forcing of an exception occurs during dynamic address translation (DAT) or during access register translation (ART). A request for data signal to the storage subsystem cancellation is settable by the first hierarchical level of cache logic. A false exception signal to the first level cache is settable by the storage subsystem logic.
摘要:
A pipelined processor including one or more units having storage locations not directly accessible by software instructions. The processor includes a load-store unit (LSU) in direct communication with the one or more units for accessing the storage locations in response to special instructions. The processor also includes a requesting unit for receiving a special instruction from a requestor and a mechanism for performing a method. The method includes broadcasting storage location information from the special instruction to one or more of the units to determine a corresponding unit having the storage location specified by the special instruction. Execution of the special instruction is initiated at the corresponding unit. If the unit executing the special instruction is not the LSU, the data is sent to the LSU. The data is received from the LSU as a result of the execution of the special instruction. The data is provided to the requester.
摘要:
A pipelined microprocessor includes circuitry for store forwarding by performing: for each store request, and while a write to one of a cache and a memory is pending; obtaining the most recent value for at least one complete block of data; merging store data from the store request with the complete block of data thus updating the block of data and forming a new most recent value and an updated complete block of data; and buffering the updated complete block of data into a store data queue; for each load request, where the load request may require at least one updated completed block of data: determining if store forwarding is appropriate for the load request on a block-by-block basis; if store forwarding is appropriate, selecting an appropriate block of data from the store data queue on a block-by-block basis; and forwarding the selected block of data to the load request.
摘要:
A system, method, and computer program product for enhancing timeliness of cache memory prefetching in a processing system are provided. The system includes a stride pattern detector to detect a stride pattern for a stride size in an amount of bytes as a difference between successive cache accesses. The system also includes a confidence counter. The system further includes eager prefetching control logic for performing a method when the stride size is less than a cache line size. The method includes adjusting the confidence counter in response to the stride pattern detector detecting the stride pattern, comparing the confidence counter to a confidence threshold, and requesting a cache prefetch in response to the confidence counter reaching the confidence threshold. The system may also include selection logic to select between the eager prefetching control logic and standard stride prefetching control logic.
摘要:
A method, system, and computer program product for storing result data from an external device. The method includes receiving the result data from the external device, the receiving at a system. The result data is stored into a store data buffer. The store data buffer is utilized by the system to contain store data normally generated by the system. A special store instruction is executed to store the result data into a memory on the system. The special store instruction includes a store address. The executing includes performing an address calculation of the store address based on provided instruction information, and updating a memory location at the store address with contents of the store data buffer utilizing a data path utilized by the system to store data normally generated by the system.