摘要:
A processing unit for a multiprocessor data processing system includes a store-through upper level cache, an instruction sequencing unit that fetches instructions for execution, at least one instruction execution unit that executes a store-conditional instruction to determine a store target address, a store queue that, following execution of the store-conditional instruction, buffers a corresponding store operation, sequencer logic associated with the store queue. The sequencer logic, responsive to receipt of a latency indication indicating that resolution of the store-conditional operation as passing or failing is subject to significant latency, invalidates, prior to resolution of the store-conditional operation, a cache line in the store-through upper level cache to which a load-reserve operation previously bound.
摘要:
A technique for triggering a system bus write command with user code includes identifying a specific store-type instruction in a user instruction sequence. The specific store-type instruction is converted into a specific request-type command, which is configured to include core permission controls (that are stored in core configuration registers of a processor core by a trusted kernel) and user created data (stored in a cache memory). Slave devices are configured through register space (that is only accessible by the trusted kernel) with respective slave permission controls. The specific request-type command is then transmitted from the cache memory, via a system bus. In this case, the slave devices that receive the specific request-type command (via the system bus) process the specific request-type command when the core permission controls are the same as the respective slave permission controls.
摘要:
A method and computer system for reducing the wiring congestion, required real estate, and access latency in a cache subsystem with a sectored and sliced lower cache by re-configuring sector-to-slice allocation and the lower cache addressing scheme. With this allocation, sectors having discontiguous addresses are placed within the same slice, and a reduced-wiring scheme is possible between two levels of lower caches based on this re-assignment of the addressable sectors within the cache slices. Additionally, the lower cache effective address tag is re-configured such that the address fields previously allocated to identifying the sector and the slice are switched relative to each other's location within the address tag. This re-allocation of the address bits enables direct slice addressing based on the indicated sector.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory and a second cache memory, and the second coherency domain includes a remote coherent cache memory. The first cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the memory block is possibly shared with the second cache memory in the first coherency domain and cached only within the first coherency domain.
摘要:
A method, system, and device for enabling intervention across same-level cache memories. In a preferred embodiment, responsive to a cache miss in a first cache memory a direct intervention request is sent from the first cache memory to a second cache memory requesting a direct intervention that satisfies the cache miss.
摘要:
A system and method for re-ordering store operations from a processor core to a store queue. When a store queue receives a new processor-issued store operation from the processor core, a store queue controller allocates a new entry in the store queue. In response to allocating the new entry in the store queue, the store queue controller determines whether or not the new entry is dependent on at least one other valid entry in the store queue. In response to determining the new entry is dependent on at least one other valid entry in the store queue, the store queue controller inhibits requesting of the new entry to the RC dispatch logic until each valid entry on which the new entry is dependent has been successfully dispatched to an RC machine by the RC dispatch logic.
摘要:
A technique for maintaining input/output (I/O) command ordering on a bus includes assigning a channel identifier to I/O commands of an I/O stream. In this case, the channel identifier indicates the I/O commands belong to the I/O stream. A command location indicator is assigned to each of the I/O commands. The command location indicator provides an indication of which one of the I/O commands is a start command in the I/O stream and which of the I/O commands are continue commands in the I/O stream. The I/O commands are issued in a desired completion order. When a first one of the I/O commands does not complete successfully, the I/O commands in the I/O stream are reissued on the bus starting at the first one of the I/O commands that did not complete successfully.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory and a second cache memory, and the second coherency domain includes a remote coherent cache memory. The first cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the memory block is possibly shared with the second cache memory in the first coherency domain and cached only within the first coherency domain.
摘要:
A processing unit for a multiprocessor data processing system includes a processor core including a store-through upper level cache, an instruction sequencing unit that fetches instructions for execution, a data register, and at least one instruction execution unit. The instruction execution unit, responsive to receipt of a load-reserve instruction from the instruction sequencing unit, executes the load-reserve instruction to determine a load target address. The processor core, responsive to the execution of the load-reserve instruction, performs a corresponding load-reserve operation by accessing the store-through upper level cache utilizing the load target address to cause data associated with the load target address to be loaded from the store-through upper level cache into the data register and by establishing a reservation for a reservation granule including the load target address.
摘要:
A technique for maintaining input/output (I/O) command ordering on a bus includes assigning a channel identifier to I/O commands of an I/O stream. In this case, the channel identifier indicates the I/O commands belong to the I/O stream. A command location indicator is assigned to each of the I/O commands. The command location indicator provides an indication of which one of the I/O commands is a start command in the I/O stream and which of the I/O commands are continue commands in the I/O stream. The I/O commands are issued in a desired completion order. When a first one of the I/O commands does not complete successfully, the I/O commands in the I/O stream are reissued on the bus starting at the first one of the I/O commands that did not complete successfully.