摘要:
A method, program and system for associating memory windows with memory regions in an infiniband data storage system are provided. The invention comprises registering a Memory Region, wherein the Memory Region is a set of virtually contiguous memory addresses defined by a virtual address and length. The system then establishes and maintains a Window Reference Count (WRC) for the Memory Region, which tracks the number of Memory Windows which are bound to the Memory Region. When the system binds a Memory Window to the Memory Region, the value of the WRC is incremented. When a Memory Window is unbound from the Memory Region, the value of the WRC is decremented. If no Memory Windows are bound to the Memory Region, the value of the WRC is zero. The Memory Region is not deregistered unless the value of the WRC equals zero.
摘要:
A method, computer program product, and data processing system for sharing memory protection tables and address translation tables among multiple Host Channel Adapters are disclosed. The protection and address translation tables for a shared memory region are written in memory of the host. The Host Channel Adapters are registered with the memory region so that each adapter stores an address pointer to the tables. In this way, the tables need not be duplicated for each adapter.
摘要:
A method, apparatus, and computer program product are disclosed in a data processing system for migrating data pages subject to DMA access by temporarily disabling selected DMA operations within a physical I/O adapter. A determination is made as to whether to disable data access DMA capabilities of the physical I/O adapter. An operating mode of the physical I/O adapter is set to a particular mode utilizing a mode bit according to the determination of whether to disable data access DMA capabilities. Only data access DMA capabilities of the physical I/O adapter are disabled when the mode bit is set. Administrative services operations continue to be performed by the physical I/O adapter when the data access DMA capabilities of the physical I/O adapter are disabled.
摘要:
A method, system, and product in a data processing system are disclosed for managing data transmitted from a first end node to a second end node included in the data processing system. A logical connection is established between the first end node and the second end node prior to transmitting data between the end nodes. An instance number is associated with this particular logical connection. The instance number is included in each packet transmitted between the end nodes while this logical connection remains established. The instance number remains constant during this logical connection. The instance number is altered, such as by incrementing it, each time a logical connection between these end nodes is reestablished. Thus, each packet is associated with a particular instance of the logical connection. When a packet is received, the instance number included in the packet may be used to determine whether the packet is a stale packet transmitted during a previous logical connection between these end nodes.
摘要:
An apparatus and method for managing work and completion queues using head and tail circular pointers. With the apparatus and method, queue head and tail pointers are maintained in the channel interface and the host channel adapter. The head and tail pointers in the host channel adapter include a queue pointer table index and a queue page index for identifying a position within the queue. For work queues, the tail pointer in the channel interface is used to identify a next position where a work queue entry may be written. The head pointer in the channel interface is used only to determine whether the work queue is full or not. The head pointer in the host channel adapter is used to identify a next work queue entry for processing by the host channel adapter. The tail pointer in the host channel adapter is used by the host channel adapter to determine if the queue is empty. For completion queues, the head pointer in the channel interface is used to identify a next completion queue entry to be processed. The tail pointer in the host channel adapter is used to identify a next position in the completion queue to which the host channel adapter may post a completion queue entry.
摘要:
A mechanism for allowing a single physical IB node to virtualize a plurality of host channel adapters is provided. This includes providing the appearance of both a router and multiple virtual HCA's residing behind that router, to the external REAL subnet components. Each virtual host channel adapter will have unique access control levels. One or more InfiniBand subnets are virtualized in such a way that nodes residing both within the virtual subnets and in separate physical subnets are completely unaware of the virtualization. This virtualization of InfiniBand subnets significantly increases the horizontal scaling capabilities of a single InfiniBand physical component, while at the same time provides “native” network throughput for all the virtual hosts.
摘要:
An apparatus and method for swapping out real memory by inhibiting input/output (I/O) operations to a memory region are provided. The apparatus and method provide a mechanism in which a quiesce indicator is provided in a field containing the current outstanding I/O count associated with the memory region whose real memory is to be swapped out. The current I/O field and the quiesce indicator are used as a means for communicating between a shared resource arbitrator and a guest consumer. When the quiesce indicator is set, the guest consumer is informed that it should not send any further I/O operations to that memory region. When the number of pending I/O operations against the memory region is zero, a valid bit in a protection table is set to invalid, and the real memory associated with the memory region may be swapped out. Thereafter, when the memory region is swapped back in, an address translation table is updated, the valid bit is reset, and the quiesce indicator is reset so that further I/O operations to the memory region may occur.
摘要:
An apparatus and method for managing reliable datagram work queues, and associated completion queues, using head and tail pointers with end-to-end context error cache are provided. Reliable datagram (RD) queue head and tail pointers are maintained in the channel interface and the host channel adapter. The head and tail pointers in the host channel adapter include a RD queue page table index and a RD queue page index for identifying a position within the RD queue. For RD work queues, in the channel interface, the tail pointer is used to identify a next position where a work queue entry may be written and the head pointer is used only to determine whether the work queue is full. In the host channel adapter, the head pointer is used to identify a next work queue entry for processing and the tail pointer is used to determine if the queue is empty.
摘要:
A distributed computing system having (host and I/O) end nodes, switches, routers, and links interconnecting these components is provided. The end nodes use send and receive queue pairs to transmit and receive messages. The end nodes use completion queues to inform the end user when a message has been completely sent or received and whether an error occurred during the message transmission or reception process. A mechanism implements these queue pairs and completion queues in hardware. A mechanism for controlling the transfer of work requests from the consumer to the CA hardware and work completions from the CA hardware to the consumer using head and tail pointers that reference circular buffers is also provided. The QPs and CQs do not contain Work Queue Entries and Completion Queue Entries respectively, but instead contain references to these entries. This allows them to be efficient and constant in size, while the Work Queue Entries and Completion Queue Entries themselves can vary in size, for example to include a variable number of data segments. Additionally, several mechanisms are provided to improve the overall efficiency of this process under different memory configurations.
摘要:
Defines and handles segments in messages to place pauses and interruptions within the communication of a message between transmitted segments of the message. A common link switch is used in a network to connect links to all nodes, the segment structures in each message is preserved when packets of each message are passed within the switch to a switch transmitter connected to the destination node indicated in each packet of the message for transmitting each of the message segments. Each transmitter stores the source identifier of the first packet it transmits for a segment and then gives priority to transmitting packets which contain source and destination identifiers which match the current transmitter stored source identifier and match the destination node connected to the transmitter. This priority enables each switch transmitter to interleaves segments of concurrent messages while preserving the segmentation of transmitted packets to maintaining a maximum network communication rate for the messages. When an unexpected wait occurs within a transmitting segment, which exceeds a predetermined time-out period, a transmission of any other waiting segment is started, which improves the message transmission efficiency in the network.