摘要:
A data processing system includes one or more processing units, a memory subsystem, and one or more input/output channel controllers, wherein each of the input/output channel controllers include the capability of speculative input/output execution. The speculative I/O execution technique according to the present invention may include several options. The speculative execution in the IOCC begins after receiving a raw address even though the operation can still be remotely retried. The programmed I/O latency time is reduced significantly due to the early speculative commencement of the IOCC operation. The IOCC may have to abort the speculative operation if a remote flow control retry is received. If, however, no retry is received then significant time is saved because the speculative operation proceeds.
摘要:
A system and method dynamically changes the snoop comparison granularity between a sector and a page, depending upon the state (active or inactive) of a direct memory access (DMA) I/O device which is writing to a device on the system bus asynchronously when compared to the CPU clock. By using page address granularity, erroneous snoop hits will not occur, since potentially invalid sector addresses are not used during the snoop comparison. Sector memory addresses may be in a transition state at the time when the CPU clock determines a snoop comparison is to occur, because this sector address has been requested by a device operating asynchronously with the CPU clock. Once the asynchronous device becomes inactive the system dynamically returns to a page and sector address snoop comparison granularity.
摘要:
A cache for use with input/output devices attached to an input/output bus. Requests for access to system memory by an input/output device pass through the cache. Virtual memory addresses used by the input/output devices are translated into real addresses in the system memory. Virtual memory can be partitioned, with some virtual addresses being mapped to a second memory attached to the input/output bus.
摘要:
A technique for performing cache injection includes monitoring, at a host fabric interface, snoop responses to an address on a bus. When the snoop responses indicate a data block associated with the address is in a shared state, input/output data associated with the address on the bus is directed to a cache that includes the data block in the shared state and is located physically closer to the host fabric interface than one or more other caches that include the data block associated with the address in the shared state.
摘要:
A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. The set of helper thread binaries and the set of main thread binaries are partitioned according to common instruction boundaries. As a first partition in the set of main thread binaries executes within a first core, a second partition in the set of helper thread binaries executes within a second core, thus “warming up” the cache in the second core. When the first partition of the main completes execution, a second partition of the main core moves to the second core, and executes using the warmed up cache in the second core.
摘要:
A mechanism is provided for managing a process-to-process communication request. A call is received in an operating system from an application in the data processing system. The operating system passes the call to a host fabric interface controller in the data processing system without processing the call. The host fabric interface controller processes the call using state information associated with the call. The call is processed by the host fabric interface controller without intervention by the operating system.
摘要:
Mechanisms are provided for processing streaming data at high sustained data rates. These mechanisms receive a plurality of data elements over a plurality of non-sequential communication channels and write the plurality of data elements directly to the file system of the data processing system in an unassembled manner. The mechanisms determining whether to perform a data scrubbing operation or not based on history information indicative of whether data elements in the plurality of data elements are being received in a substantially sequential manner. The mechanisms perform a data scrubbing operation, in response to a determination to perform data scrubbing, to identify any missing data elements in the plurality of data elements written to the file system and assemble the plurality of data elements into a plurality of data streams in response to results of the data scrubbing indicating that there are no missing data elements.
摘要:
A method and a data processing system by which population count (popcount) operations are efficiently performed without incurring the latency and loss of critical processing cycles and bandwidth of real time processing. The method comprises: identifying data to be stored to memory for which a popcount may need to be determined; speculatively performing a popcount operation on the data as a background process of the processor while the data is being stored to memory; storing the data to a first memory location; and storing a value of the popcount generated by the popcount operation within a second memory location. The method further comprises: determining a size of data; determining a granular level at which the popcount operation on the data will be performed; and reserving a size of said second memory location that is sufficiently large to hold the value of the popcount.
摘要:
A mechanism is provided for managing a process-to-process intra-cluster communication request. A call from a first application is received in a first operating system in a first data processing system. The first operating system passes the call from the first operating system to a first host fabric interface controller in the first data processing system without processing the call. The first host fabric interface controller processes the call without intervention by the first operating system to determine a second data processing system in the plurality of data processing systems with which the call is associated. The first host fabric interface controller initiates an intra-cluster connection to a second host fabric interface controller in the second data processing system. The first host fabric interface controller then transfers the call to the second host fabric interface controller in the second data processing system via the intra-cluster connection.
摘要:
A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. If executing a portion of the set of helper thread binaries results in the retrieval of data needed by the set of main thread binaries, then that retrieved data is utilized by the set of main thread binaries.