Collective communication system and methods

    Publication No.: US12177039B2

    Publication date: 2024-12-24

    Application No.: US18513565

    Filing date: 2023-11-19

    Abstract: A method includes providing a plurality of processes interconnected by a network, each of the plurality of processes being configured to hold a block of data destined for others of the plurality of processes. A set of data for all-to-all data exchange is received from one or more of the processes. The set of data is configured as a plurality of blocks of data in a matrix as matrix data, the matrix being distributed among the plurality of processes. The matrix data is transposed by changing the position of selected blocks of data relative to the other blocks of data, without changing the structure of each of the blocks of data. The transposed matrix data is conveyed over the network and is then received, repacked, and conveyed to destination processes.
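
The block transpose described in this abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function name `all_to_all_by_transpose` and the list-of-lists layout are assumptions made for illustration, and the network transfer and repacking steps are not modeled. Row `s` holds the blocks owned by source process `s`, and column `d` holds the blocks destined for process `d`.

```python
def all_to_all_by_transpose(send):
    """Model an all-to-all exchange as a transpose of a block matrix.

    send[s][d] is the block that source process s holds for destination d.
    The result recv[d][s] is the block that destination d receives from s.
    Only block positions change; each block object is passed through
    untouched (hypothetical layout, for illustration only).
    """
    n = len(send)
    return [[send[src][dst] for src in range(n)] for dst in range(n)]


# Three processes, each holding one opaque block per destination.
send = [[f"{s}->{d}".encode() for d in range(3)] for s in range(3)]
recv = all_to_all_by_transpose(send)
```

After the call, `recv[2][0]` is the block that process 0 held for process 2, and it is the same object as `send[0][2]`, which mirrors the requirement that the internal structure of each block stays unchanged.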

    Single-step collective operations
    Document type: Invention publication

    Publication No.: US20240095106A1

    Publication date: 2024-03-21

    Application No.: US18105846

    Filing date: 2023-02-05

    Inventor: Richard Graham

    CPC classification number: G06F9/546

    Abstract: A method for collective communications includes invoking a collective operation over a group of computing processes in which the processes concurrently transmit and receive data to and from other processes in the group via a communication medium. Messages are composed for transmission by source processes including metadata indicating how the data to be transmitted by the source processes in the collective operation are to be handled by destination processes that are to receive the data and also including in at least some of the messages the data to be transmitted by one or more of the source processes to one or more of the destination processes. The composed messages are transmitted concurrently from the source processes to the destination processes in the group over the communication medium. The data are processed by the destination processes in response to the metadata included in the messages received by the destination processes.

    Selective aggregation of messages in collective operations

    Publication No.: US20240086265A1

    Publication date: 2024-03-14

    Application No.: US18074563

    Filing date: 2022-12-05

    Inventor: Richard Graham

    CPC classification number: G06F9/546

    Abstract: A method for collective communications includes invoking a collective operation over a group of computing processes in which the processes in the group concurrently transmit and receive data messages to and from other processes in the group via a communication medium. The processes detect respective sizes of the data messages and transmit the data messages for which the respective sizes are greater than a predefined threshold to respective destination processes in the group without aggregation. The data messages for which the respective sizes are less than the predefined threshold are aggregated, and the aggregated data messages are transmitted to the respective destination processes.
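
The size-based split in this abstract can be sketched as follows. This is a hedged illustration, not the claimed method: the function name `plan_sends`, grouping the small messages per destination, and treating a message whose size equals the threshold as small are all assumptions, since the abstract does not specify them.

```python
from collections import defaultdict


def plan_sends(messages, threshold):
    """Split outgoing messages into direct sends and aggregated bundles.

    messages  : list of (destination, payload_bytes) pairs
    threshold : size in bytes; larger payloads are sent without aggregation
    Returns (direct, aggregated), where direct is a list of
    (destination, payload) and aggregated maps destination -> list of
    small payloads to be sent together (assumed grouping).
    """
    direct = []
    pending = defaultdict(list)
    for dest, payload in messages:
        if len(payload) > threshold:
            direct.append((dest, payload))   # large: send as-is
        else:
            pending[dest].append(payload)    # small: hold for aggregation
    return direct, dict(pending)


direct, aggregated = plan_sends(
    [(1, b"x" * 10), (1, b"ab"), (2, b"c"), (1, b"de")], threshold=4
)
```

Here the 10-byte message goes out directly, while the two small messages for destination 1 are bundled together and the single small message for destination 2 forms its own bundle.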

    Efficient scatter-gather over an uplink

    Publication No.: US10887252B2

    Publication date: 2021-01-05

    Application No.: US16181376

    Filing date: 2018-11-06

    Abstract: A network interface device is connected by an uplink to a host computer having a memory controller and a scatter-gather offload engine linked to the memory controller. The network interface device prepares a descriptor including a plurality of specified memory locations in the host computer, incorporates the descriptor in exactly one upload packet, transmits the upload packet to the scatter-gather offload engine via the uplink, invokes the scatter-gather offload engine to perform memory access operations cooperatively with the memory controller at the specified memory locations of the descriptor, and accepts results of the memory access operations.

    Mechanism for Distributing MPI Tag Matching
    Document type: Invention application

    Publication No.: US20180219804A1

    Publication date: 2018-08-02

    Application No.: US15881844

    Filing date: 2018-01-29

    CPC classification number: H04L49/901 G06F9/54 H04L45/306 H04L47/34

    Abstract: Network communication is carried out by transmitting messages in accordance with a predefined data exchange protocol among nodes that include a master domain and a plurality of client domains. A list of expected messages has a tail portion in the master domain and respective head portions in the client domains. A search is conducted for a match between the tag of a received message and the tags in a list of unexpected messages that is maintained in the master domain. Upon a failure to find the match, the receive is added to the list of expected messages. If a match is found, the data in the message is written into a data buffer.
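
The matching flow in this abstract can be illustrated with a single-process sketch. It deliberately does not model the split of the expected list between master and client domains, which is the subject of the patent. The class name `TagMatcher` and its methods are hypothetical, and `on_arrival` is the conventional MPI-style counterpart to posting a receive, added here as an assumption so the example is complete.

```python
from collections import deque


class TagMatcher:
    """Toy tag matcher with one unexpected list and one expected list."""

    def __init__(self):
        self.unexpected = deque()  # (tag, data) that arrived before a receive
        self.expected = deque()    # (tag, buffer) receives awaiting a message

    def post_receive(self, tag, buffer):
        """Search the unexpected list; on a miss, queue the receive."""
        for i, (msg_tag, data) in enumerate(self.unexpected):
            if msg_tag == tag:
                del self.unexpected[i]
                buffer[:len(data)] = data      # write data into the buffer
                return True
        self.expected.append((tag, buffer))    # no match: receive is expected
        return False

    def on_arrival(self, tag, data):
        """Assumed counterpart: match an arriving message to a posted receive."""
        for i, (recv_tag, buffer) in enumerate(self.expected):
            if recv_tag == tag:
                del self.expected[i]
                buffer[:len(data)] = data
                return True
        self.unexpected.append((tag, data))    # no receive yet: keep message
        return False


matcher = TagMatcher()
buf = bytearray(4)
posted_matched = matcher.post_receive(7, buf)      # nothing arrived yet
arrival_matched = matcher.on_arrival(7, b"abcd")   # fills buf
```

In this run the receive for tag 7 is posted first, so it is added to the expected list, and the later arrival with the same tag is written into `buf`. Buffers are assumed to be `bytearray` objects at least as long as the incoming data.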
