Abstract:
A method and apparatus for accommodating the speed requirements of DMA read requests from PCI protocol I/O devices attached via DMA to a multiprocessor system mesh. A bridge between the device controller and the mesh is described that buffers data from memory in cache lines, from which the data is finally delivered to the I/O device. The system is adaptive in that the number of cache lines required by past reads is remembered and used to determine whether the number of cache lines should be reduced or increased.
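A minimal C sketch of the adaptive behavior this abstract describes: track how many cache lines past reads actually consumed and grow or shrink the number buffered accordingly. The structure, field names, and line-count limits are illustrative assumptions, not the patented hardware interface.

```c
/* Illustrative sketch only: field names and MIN/MAX limits are assumed,
 * not taken from the patent. */
#include <stdio.h>

#define MIN_LINES 1
#define MAX_LINES 8

struct dma_read_stream {
    unsigned prefetch_lines;   /* cache lines to buffer for the next read  */
    unsigned last_used_lines;  /* cache lines actually consumed last time  */
};

/* After a DMA read completes, compare how many buffered lines the device
 * actually consumed with how many were prefetched, and adapt the count. */
static void adapt_prefetch(struct dma_read_stream *s, unsigned used)
{
    s->last_used_lines = used;
    if (used >= s->prefetch_lines && s->prefetch_lines < MAX_LINES)
        s->prefetch_lines++;       /* device drained everything: fetch more */
    else if (used < s->prefetch_lines && s->prefetch_lines > MIN_LINES)
        s->prefetch_lines--;       /* lines went unused: fetch fewer        */
}

int main(void)
{
    struct dma_read_stream s = { .prefetch_lines = 2 };
    unsigned history[] = { 2, 3, 4, 4, 1, 1 };

    for (unsigned i = 0; i < sizeof history / sizeof history[0]; i++) {
        adapt_prefetch(&s, history[i]);
        printf("used=%u -> prefetch_lines=%u\n", history[i], s.prefetch_lines);
    }
    return 0;
}
```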
Abstract:
A method and system for completing pending I/O device reads by periodically stalling the issuance of I/O device accesses by a program in a multiple-processor computer system.
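A toy C sketch of the stated idea, i.e. periodically stalling the issuance of new I/O accesses so that pending device reads can complete. The stall interval, structure, and function names are assumptions made for illustration only.

```c
/* Illustrative model only: the interval and all names are assumed. */
#include <stdbool.h>
#include <stdio.h>

#define STALL_INTERVAL 4   /* assumed: stall after every 4 issued accesses */

struct io_issue_state {
    unsigned issued_since_stall;   /* accesses issued in the current interval */
    unsigned pending_reads;        /* device reads not yet completed          */
};

/* Returns true if another I/O access may be issued now; false means the
 * program should stall until the pending reads drain. */
static bool may_issue_io(struct io_issue_state *st)
{
    if (st->issued_since_stall >= STALL_INTERVAL) {
        if (st->pending_reads > 0)
            return false;              /* stall point: let pending reads finish */
        st->issued_since_stall = 0;    /* everything drained: new interval      */
    }
    st->issued_since_stall++;
    return true;
}

int main(void)
{
    struct io_issue_state st = { 0, 2 };   /* two reads still pending */

    for (int i = 0; i < 6; i++) {
        if (may_issue_io(&st))
            printf("access %d issued\n", i);
        else {
            printf("access %d stalled; completing a pending read\n", i);
            st.pending_reads--;            /* model a read completing */
        }
    }
    return 0;
}
```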
Abstract:
An anti-starvation interrupt protocol for use in avoiding livelock in a multiprocessor computer system is provided. At least one processor is configured to include first and second control status registers (CSRs). The first CSR buffers information, such as interrupts, received by the processor, while the second CSR keeps track of the priority level of the interrupts. When an interrupt controller receives an interrupt, it issues a write transaction to the first CSR at the processor. If the first CSR has room to accept the write transaction, the processor returns an acknowledgment, whereas if the first CSR is already full, the processor returns a no acknowledgment. In response to a no acknowledgment, the interrupt controller increments an interrupt starvation counter and checks whether the counter exceeds a threshold. If it does not, the interrupt controller waits a preset time and reposts the write transaction. If it does, the interrupt controller issues a higher-priority write transaction to the second CSR. In response, the processor copies all of the pending interrupts from the first CSR into the memory subsystem, thereby freeing up the first CSR to accept additional write transactions.
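A self-contained C model of the interrupt-controller flow sketched above: repost on a no acknowledgment, count starvation, and escalate to the higher-priority second CSR once the threshold is exceeded. The CSR capacity, threshold value, and all names are illustrative assumptions, not the actual hardware interface.

```c
/* Toy model only: CSR sizes, the threshold, and all names are assumed. */
#include <stdio.h>
#include <stdbool.h>

#define FIRST_CSR_SLOTS      2   /* assumed capacity of the first CSR */
#define STARVATION_THRESHOLD 3

struct cpu_model {
    unsigned first_csr[FIRST_CSR_SLOTS];
    unsigned first_csr_count;
    unsigned spilled_to_memory;  /* interrupts copied into the memory subsystem */
};

/* Low-priority write to the first CSR: ACK if there is room, else NACK. */
static bool post_first_csr(struct cpu_model *c, unsigned intr)
{
    if (c->first_csr_count == FIRST_CSR_SLOTS)
        return false;                                /* NACK: CSR full */
    c->first_csr[c->first_csr_count++] = intr;
    return true;                                     /* ACK */
}

/* Higher-priority write to the second CSR: the processor reacts by copying
 * all pending interrupts from the first CSR into memory, freeing the CSR. */
static void post_second_csr(struct cpu_model *c)
{
    c->spilled_to_memory += c->first_csr_count;
    c->first_csr_count = 0;
}

/* Interrupt-controller side: repost on NACK, escalate past the threshold. */
static void deliver_interrupt(struct cpu_model *c, unsigned intr)
{
    unsigned starvation = 0;
    while (!post_first_csr(c, intr)) {
        if (++starvation > STARVATION_THRESHOLD) {
            post_second_csr(c);      /* frees the first CSR */
            starvation = 0;
        }
        /* else: wait a preset time and repost (timing elided in this model) */
    }
}

int main(void)
{
    struct cpu_model c = { { 0 }, 0, 0 };
    for (unsigned i = 0; i < 6; i++)
        deliver_interrupt(&c, i);
    printf("pending in CSR: %u, spilled to memory: %u\n",
           c.first_csr_count, c.spilled_to_memory);
    return 0;
}
```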
Abstract:
A system that supports a high-performance, scalable, and efficient I/O port protocol to connect to I/O devices is disclosed. A distributed multiprocessing computer system contains a number of processors each coupled to an I/O bridge ASIC implementing the I/O port protocol. One or more I/O devices are coupled to the I/O bridge ASIC, each I/O device capable of accessing machine resources in the computer system by transmitting and receiving message packets. Machine resources in the computer system include data blocks, registers and interrupt queues. Each processor in the computer system is coupled to a memory module capable of storing data blocks shared between the processors. Coherence of the shared data blocks in this shared memory system is maintained using a directory-based coherence protocol. Coherence of data blocks transferred during I/O device read and write accesses is maintained using the same coherence protocol as for the memory system. Data blocks transferred during an I/O device read or write access may be buffered in a cache by the I/O bridge ASIC only if the I/O bridge ASIC has exclusive copies of the data blocks. The I/O bridge ASIC includes a DMA device that supports both in-order and out-of-order DMA read and write streams of data blocks. An in-order stream of reads of data blocks performed by the DMA device always results in the DMA device receiving coherent data blocks that do not have to be written back to the memory module.
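A small C sketch of the abstract's most distinctive rule, namely that the I/O bridge ASIC may buffer a block in its cache only while it holds an exclusive copy. The directory states and function names are assumptions for illustration, not the actual coherence protocol encoding.

```c
/* Illustrative sketch only: states and names are assumed. */
#include <stdbool.h>
#include <stdio.h>

enum dir_state { INVALID, SHARED, EXCLUSIVE };

struct dma_block {
    unsigned long addr;
    enum dir_state state;   /* ownership granted to the I/O bridge ASIC */
};

/* The bridge may keep a block in its cache only with exclusive ownership;
 * a shared or invalid copy must not be buffered. */
static bool bridge_may_cache(const struct dma_block *b)
{
    return b->state == EXCLUSIVE;
}

int main(void)
{
    struct dma_block shared_blk = { 0x1000, SHARED };
    struct dma_block excl_blk   = { 0x2000, EXCLUSIVE };

    printf("shared block cacheable: %d\n", bridge_may_cache(&shared_blk));
    printf("exclusive block cacheable: %d\n", bridge_may_cache(&excl_blk));
    return 0;
}
```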
Abstract:
A system and method are provided for moving information between cache coherent memory systems of a partitioned multiprocessor computer system while containing faults to a single partition. The multiprocessor computer system includes a plurality of processors, memory subsystems and input/output (I/O) subsystems that can be divided into a plurality of partitions. Each I/O subsystem includes at least one I/O bridge for interfacing between one or more I/O devices and the multiprocessor system. The I/O bridge has a data mover configured to retrieve information from a “source” partition and to store that information within its own “destination” partition. When activated, the data mover issues a request to the source partition for a non-coherent copy of the information. The home memory subsystem in the source partition preferably responds to the request by sending the data mover a “valid” but non-coherent copy of the information, e.g., a “snapshot” of the information as of the time of the request. Upon receiving the information, the data mover may copy it into the memory subsystem of the destination partition.
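A toy C model of the data-mover flow described above: request a non-coherent “snapshot” of a block from the source partition's home memory, then store it into the destination partition's memory. The block size, structures, and names are illustrative assumptions only.

```c
/* Illustrative model only: sizes and names are assumed. */
#include <string.h>
#include <stdio.h>

#define BLOCK_BYTES 64

struct partition_mem {
    unsigned char mem[4 * BLOCK_BYTES];
};

/* Home memory of the source partition returns a valid but non-coherent copy,
 * i.e. the block's contents as of the time of the request. */
static void snapshot_read(const struct partition_mem *src, unsigned long off,
                          unsigned char *out)
{
    memcpy(out, src->mem + off, BLOCK_BYTES);
}

/* The data mover stores the snapshot into its own (destination) partition,
 * where it becomes coherent under that partition's normal protocol. */
static void data_mover_copy(const struct partition_mem *src_part,
                            struct partition_mem *dst_part,
                            unsigned long src_off, unsigned long dst_off)
{
    unsigned char buf[BLOCK_BYTES];
    snapshot_read(src_part, src_off, buf);
    memcpy(dst_part->mem + dst_off, buf, BLOCK_BYTES);
}

int main(void)
{
    struct partition_mem src = { { 0 } }, dst = { { 0 } };
    memset(src.mem, 0xAB, BLOCK_BYTES);

    data_mover_copy(&src, &dst, 0, 2 * BLOCK_BYTES);
    printf("copied byte: 0x%02X\n", (unsigned) dst.mem[2 * BLOCK_BYTES]);
    return 0;
}
```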
Abstract:
A system and method avoids passive release of interrupts in a computer system. The computer system includes a plurality of processors, a plurality of input/output (I/O) devices each capable of issuing interrupts, and an I/O bridge interfacing between the I/O devices and the processors. Interrupts, such as level sensitive interrupts (LSIs), asserted by an I/O device coupled to a specific port of the I/O bridge are sent to a processor for servicing by an interrupt controller, which also sets an interrupt pending flag. Upon dispatching the respective interrupt service routine, the processor generates two ordered messages. The first ordered message is sent to the I/O device that triggered the interrupt, informing it that the interrupt has been serviced. The second ordered message directs the interrupt controller to clear the respective interrupt pending flag. Both messages are sent, in order, to the particular I/O bridge port to which the subject I/O device is coupled. After forwarding the first message to the I/O device, the bridge port forwards the second message to the interrupt controller so that the interrupt can be deasserted before the interrupt pending flag is cleared.
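A toy C model of the ordering argument above. The bridge port is modeled as a FIFO, so the first message (interrupt serviced) always reaches the I/O device before the second message (clear the pending flag) takes effect, which is what prevents passive release. The message types and names are illustrative assumptions.

```c
/* Illustrative model only: message types and names are assumed. */
#include <stdbool.h>
#include <stdio.h>

enum msg_type { MSG_INT_SERVICED, MSG_CLEAR_PENDING };

struct port_fifo { enum msg_type q[8]; unsigned head, tail; };

static void port_push(struct port_fifo *p, enum msg_type m)
{
    p->q[p->tail++ % 8] = m;
}

static bool port_pop(struct port_fifo *p, enum msg_type *m)
{
    if (p->head == p->tail)
        return false;
    *m = p->q[p->head++ % 8];
    return true;
}

struct io_state { bool intr_asserted, intr_pending_flag; };

int main(void)
{
    struct port_fifo port = { { 0 } };
    struct io_state s = { true, true };     /* interrupt asserted and pending */

    /* Processor finishes the service routine: two ordered messages. */
    port_push(&port, MSG_INT_SERVICED);     /* first: tell the I/O device  */
    port_push(&port, MSG_CLEAR_PENDING);    /* second: tell the controller */

    /* The bridge port drains in order, so the interrupt is deasserted before
     * the pending flag is cleared. */
    enum msg_type m;
    while (port_pop(&port, &m)) {
        if (m == MSG_INT_SERVICED)  s.intr_asserted = false;
        if (m == MSG_CLEAR_PENDING) s.intr_pending_flag = false;
        printf("asserted=%d pending=%d\n", s.intr_asserted, s.intr_pending_flag);
    }
    return 0;
}
```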
Abstract:
A system and method avoids “livelock” and “starvation” among two or more input/output (I/O) devices of a symmetrical multiprocessor (SMP) computer system competing for the same data. The SMP computer system includes a plurality of interconnected processors, one or more memories that are shared by the processors, and a plurality of I/O bridges to which the I/O devices are coupled. A cache coherency protocol is executed by the I/O bridges, which requires the I/O bridges to obtain “exclusive” (not shared) ownership of all data stored by the bridges. In response to a request for data currently stored by an I/O bridge, the bridge first copies at least a portion of that data to a non-coherent buffer before invalidating the data. The bridge then takes the largest amount of the data saved in its non-coherent buffer that it knows to be coherent, releases only that known coherent amount to the I/O device, and then discards all of the saved data.
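A C sketch of the buffering step described above: on an invalidation, the bridge saves the data it holds into a non-coherent buffer, releases to the I/O device only the prefix it knows to be coherent, and discards the rest. The line sizes, counts, and names are illustrative assumptions.

```c
/* Illustrative sketch only: sizes and names are assumed. */
#include <string.h>
#include <stdio.h>

#define LINE_BYTES 64
#define MAX_LINES  8

struct bridge_stream {
    unsigned char cache[MAX_LINES][LINE_BYTES];   /* exclusively owned data */
    unsigned lines_held;                          /* lines currently cached */
    unsigned lines_known_coherent;                /* prefix known coherent  */
};

/* Called when another agent requests data this bridge holds. Returns the
 * number of bytes released to the I/O device before the data is discarded. */
static unsigned handle_invalidate(struct bridge_stream *s,
                                  unsigned char *to_device, unsigned cap)
{
    unsigned char noncoherent[MAX_LINES][LINE_BYTES];
    unsigned bytes = s->lines_known_coherent * LINE_BYTES;

    /* 1. Save a copy into the non-coherent buffer, then give up ownership. */
    memcpy(noncoherent, s->cache, s->lines_held * LINE_BYTES);
    s->lines_held = 0;                      /* data invalidated */

    /* 2. Release only the prefix known to be coherent; 3. discard the rest. */
    if (bytes > cap)
        bytes = cap;
    memcpy(to_device, noncoherent, bytes);
    s->lines_known_coherent = 0;
    return bytes;
}

int main(void)
{
    struct bridge_stream s = { .lines_held = 4, .lines_known_coherent = 2 };
    unsigned char out[MAX_LINES * LINE_BYTES];

    printf("released %u bytes to the device\n",
           handle_invalidate(&s, out, sizeof out));
    return 0;
}
```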
Abstract:
A time slice group (TSG) is a grouping of different streams of work (referred to herein as “channels”) that share the same context information. The channels belonging to a TSG are processed in a predetermined order. However, when a channel stalls while processing, the next channel with independent work can be switched to so that the parallel processing unit remains fully loaded. Importantly, because each channel in the TSG shares the same context information, a context switch operation is not needed when the processing of a particular channel in the TSG stops and the processing of the next channel in the TSG begins. Therefore, multiple independent streams of work are allowed to run concurrently within a single context, increasing utilization of parallel processing units.
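A toy C model of the scheduling behavior this abstract describes: the channels of a TSG run in a fixed order, a stalled channel causes a switch to the next runnable channel, and because every channel shares the same context no context switch is modeled. All names and sizes are illustrative assumptions.

```c
/* Illustrative model only: structures and names are assumed. */
#include <stdbool.h>
#include <stdio.h>

#define CHANNELS 3

struct channel { bool stalled; unsigned work_done; };

struct tsg {
    int context_id;                 /* one context shared by all channels */
    struct channel ch[CHANNELS];
};

/* Pick the next runnable channel in the predetermined order, starting after
 * the current one; returns -1 if every channel is stalled. */
static int next_runnable(const struct tsg *g, int cur)
{
    for (int i = 1; i <= CHANNELS; i++) {
        int c = (cur + i) % CHANNELS;
        if (!g->ch[c].stalled)
            return c;               /* switch here without a context switch */
    }
    return -1;
}

int main(void)
{
    struct tsg g = { .context_id = 7,
                     .ch = { { false, 0 }, { true, 0 }, { false, 0 } } };
    int cur = 0;

    for (int step = 0; step < 4 && cur >= 0; step++) {
        g.ch[cur].work_done++;
        printf("context %d: ran channel %d\n", g.context_id, cur);
        cur = next_runnable(&g, cur);
    }
    return 0;
}
```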