Abstract:
An improved I/O processor (IOP) delivers high I/O performance while maintaining inter-reference ordering among memory reference operations issued by an I/O device as specified by a consistency model in a shared memory multiprocessor system. The IOP comprises a retire controller which imposes inter-reference ordering among the operations based on receipt of a commit signal for each operation, wherein the commit signal for a memory reference operation indicates the apparent completion of the operation rather than its actual completion. In addition, the IOP comprises a prefetch controller coupled to an I/O cache for prefetching data into the cache without any ordering constraints (that is, out of order). The ordered retirement functions of the IOP are separated from its prefetching operations, which allows prefetching to be performed in an arbitrary order so as to improve the overall performance of the system.
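The separation of unordered prefetching from ordered retirement can be illustrated with a minimal sketch. The class `IOProcessor` and all of its method names are hypothetical and are not taken from the abstract; the model also assumes that an operation may retire only when its commit signal has arrived and its data is in the I/O cache.

```python
from collections import deque


class IOProcessor:
    """Toy IOP: prefetch in any order, retire strictly in issue order."""

    def __init__(self):
        self.cache = set()        # addresses already prefetched into the I/O cache
        self.pending = deque()    # (op_id, addr) in the order the device issued them
        self.committed = set()    # op ids whose commit signal has been received
        self.retired = []         # op ids in the order they retired

    def issue(self, op_id, addr):
        self.pending.append((op_id, addr))

    def prefetch(self, addr):
        # Prefetch controller: no ordering constraint on when data arrives.
        self.cache.add(addr)
        self._retire()

    def commit_signal(self, op_id):
        # Commit signal: apparent (not actual) completion of the operation.
        self.committed.add(op_id)
        self._retire()

    def _retire(self):
        # Retire controller: only the oldest pending operation may retire.
        while self.pending:
            op_id, addr = self.pending[0]
            if op_id not in self.committed or addr not in self.cache:
                break
            self.pending.popleft()
            self.retired.append(op_id)
```

In this sketch, commit signals and prefetched data for younger operations can arrive early, but nothing retires until the oldest operation is ready.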
Abstract:
A technique selectively imposes inter-reference ordering between memory reference operations issued by a processor of a multiprocessor system to addresses within a page pertaining to a page table entry (PTE) that is affected by a translation buffer (TB) miss flow routine. The TB miss flow is used to retrieve information contained in the PTE for mapping a virtual address to a physical address and, subsequently, to allow retrieval of data at the mapped physical address. The PTE that is retrieved in response to a memory reference (read) operation is not loaded into the TB until a commit-signal associated with that read operation is returned to the processor. Once the PTE and associated commit-signal are returned, the processor loads the PTE into the TB so that it can be used for a subsequent read operation directed to the data at the physical address.
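The rule that a fetched PTE is held back until its commit-signal arrives can be sketched as follows. `TranslationBuffer` and its methods are hypothetical names chosen for illustration, and pages are represented as plain integers.

```python
class TranslationBuffer:
    """Toy TB that holds back a fetched PTE until its commit-signal arrives."""

    def __init__(self):
        self.entries = {}         # virtual page -> physical page (usable mappings)
        self._fetched = {}        # virtual page -> PTE data returned by the read
        self._committed = set()   # virtual pages whose PTE read has a commit-signal

    def pte_returned(self, vpage, ppage):
        self._fetched[vpage] = ppage
        self._maybe_load(vpage)

    def commit_returned(self, vpage):
        self._committed.add(vpage)
        self._maybe_load(vpage)

    def _maybe_load(self, vpage):
        # Load the PTE only once both the data and the commit-signal are present.
        if vpage in self._fetched and vpage in self._committed:
            self.entries[vpage] = self._fetched.pop(vpage)
            self._committed.discard(vpage)

    def translate(self, vpage):
        # Returns None on a TB miss.
        return self.entries.get(vpage)
```

A subsequent read to the mapped page can use the translation only after `translate` returns a mapping, which in this model happens no earlier than the commit-signal.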
Abstract:
A mechanism optimizes the generation of a commit-signal by control logic of a multiprocessor system in response to a memory reference operation issued by a processor to a local node of the system, which has a hierarchical switch for interconnecting a plurality of nodes. The mechanism generally comprises a structure that indicates whether the memory reference operation affects other processors of other nodes of the multiprocessor system. An ordering point of the local node generates an optimized commit-signal when the structure indicates that the memory reference operation does not affect the other processors.
Abstract:
A technique reduces the latency of a memory barrier (MB) operation used to impose an inter-reference order between sets of memory reference operations issued by a processor to a multiprocessor system having a shared memory. The technique comprises issuing the MB operation immediately after issuing a first set of memory reference operations (i.e., the pre-MB operations) without waiting for responses to those pre-MB operations. Issuance of the MB operation to the system results in serialization of that operation and generation of an MB Acknowledgment (MB-Ack) command. The MB-Ack is loaded into a probe queue of the issuing processor and, according to the invention, functions to pull in all previously ordered invalidate and probe commands in that queue. By ensuring that the probes and invalidates are ordered before the MB-Ack is received at the issuing processor, the inventive technique provides the appearance that all pre-MB references have completed.
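The way an MB-Ack in a FIFO probe queue forces earlier probes and invalidates to be handled first can be sketched as below. The function name, the tuple encoding of commands, and the use of a Python set as the cache are all assumptions made for illustration.

```python
from collections import deque


def drain_until_mb_ack(probe_queue, cache):
    """Apply queued commands in FIFO order until the MB-Ack is reached.

    probe_queue holds (kind, addr) tuples; kind is "INVALIDATE", "PROBE",
    or "MB_ACK". Returns the commands applied before the MB-Ack.
    """
    applied = []
    while probe_queue:
        kind, addr = probe_queue.popleft()
        if kind == "MB_ACK":
            # Everything ordered ahead of the MB-Ack has now been applied,
            # so the pre-MB references appear to have completed.
            return applied
        if kind == "INVALIDATE":
            cache.discard(addr)
        applied.append((kind, addr))
    raise RuntimeError("MB-Ack has not arrived in the probe queue yet")
```

Commands queued behind the MB-Ack stay in the queue, since they are ordered after the barrier.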
Abstract:
A technique reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory that is distributed among a plurality of processors that share a cache. According to the technique, each processor sharing a cache inherits a commit-signal that is generated by control logic of the multiprocessor system in response to a memory reference operation issued by another processor sharing that cache. The commit-signal facilitates serialization among the processors and shared memory entities of the multiprocessor system by indicating the apparent completion of the memory reference operation to those entities of the system.
Abstract:
A mechanism reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory. The mechanism comprises a commit-signal that is generated by control logic of the multiprocessor system in response to an issued memory reference operation. The commit-signal facilitates inter-reference ordering; moreover, it indicates the apparent completion of the memory reference operation rather than its actual completion. The apparent completion of an operation occurs substantially sooner than its actual completion, thereby improving performance of the multiprocessor system.
Abstract:
A method for executing a load locked and a store conditional instruction in a processor achieves an atomic read-write operation to a memory block. First, the load locked instruction is executed to read the memory block. In response, the processor issues a read modify system command to read the block and take ownership of it, sets a lock flag for the address of the memory block, and writes the value of the memory block into its cache as a cache copy. The lock flag is reset if the processor receives any invalidate message for the cache copy of the memory block. If the processor receives an ownership request message after executing the load locked instruction, it waits for a selected time interval before surrendering ownership of the memory block. The processor then executes the store conditional instruction and, in response, tests the lock flag: if the lock flag is set, the processor writes to the cache copy of the memory block; if the lock flag has been reset, the processor ends the store conditional instruction without writing to the cache copy.
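The lock-flag behavior of the load locked / store conditional pair can be sketched as follows. The `Processor` class and its method names are hypothetical. The sketch writes through to a shared dictionary for simplicity, whereas the abstract describes a write to the cache copy only, and it omits the timed delay before surrendering ownership.

```python
class Processor:
    """Toy model of one processor's cache copy and lock flag."""

    def __init__(self, memory):
        self.memory = memory     # shared dict: address -> value
        self.cache = {}          # private cache copies: address -> value
        self.lock_flag = None    # address the lock flag is set for, or None

    def load_locked(self, addr):
        # Models the read modify command: read the block, take ownership,
        # set the lock flag, and keep a cache copy.
        self.cache[addr] = self.memory[addr]
        self.lock_flag = addr
        return self.cache[addr]

    def receive_invalidate(self, addr):
        # An invalidate for the locked block resets the lock flag.
        self.cache.pop(addr, None)
        if self.lock_flag == addr:
            self.lock_flag = None

    def store_conditional(self, addr, value):
        # Write only if the lock flag is still set for this address.
        if self.lock_flag != addr:
            return False
        self.cache[addr] = value
        self.memory[addr] = value   # simplification: write through to memory
        self.lock_flag = None
        return True
```

A caller would typically retry the whole load locked / store conditional sequence when `store_conditional` returns `False`.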
Abstract:
A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.
Abstract:
One disclosed embodiment may comprise a system that includes a home node that provides a transaction reference to a requester in response to a request from the requester. The requester provides an acknowledgement message to the home node in response to the transaction reference, the transaction reference enabling the requester to determine an order of requests at the home node relative to the request from the requester.
Abstract:
A system comprises a first node including data having an associated D-state and a second node operative to provide a source broadcast requesting the data. The first node is operative in response to the source broadcast to provide the data to the second node and transition the state associated with the data at the first node from the D-state to an O-state without concurrently updating memory. An S-state is associated with the data at the second node.
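The state transitions described here can be sketched as a small function. The `Node` class, the function name, and the dictionary standing in for memory are hypothetical; the state letters D, O, and S are used exactly as in the abstract, without assuming anything further about the protocol.

```python
class Node:
    """Toy node holding per-address data and a coherence state letter."""

    def __init__(self):
        self.state = {}   # address -> state letter, e.g. "D", "O", "S"
        self.data = {}    # address -> cached value


def source_broadcast(requester, owner, addr):
    """Requester broadcasts for addr; owner responds if it holds it in D-state."""
    if owner.state.get(addr) != "D":
        return False
    # Owner supplies the data directly to the requester.
    requester.data[addr] = owner.data[addr]
    requester.state[addr] = "S"
    # Owner moves from D to O; memory is deliberately left untouched here.
    owner.state[addr] = "O"
    return True
```

Because the function never touches memory, a memory copy of the block can stay stale after the transfer, which matches the "without concurrently updating memory" condition.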