Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Samantray Biplab Raut"

1.

Invention application
FAST BLOCK-BASED PARALLEL MESSAGE PASSING INTERFACE TRANSPOSE (In force)

Publication number: US20220121506A1

Publication date: 2022-04-21

Application number: US17071876

Filing date: 2020-10-15

Applicant: Advanced Micro Devices, Inc.

Inventor: Samantray Biplab Raut

IPC: G06F9/54, G06F9/52

Abstract: Computer-implemented techniques for fast block-based parallel message passing interface (MPI) transpose are disclosed. The techniques achieve an in-place parallel matrix transpose of an input matrix in a distributed-memory multiprocessor environment with reduced consumption of computer processing time and storage media resources. An in-memory copy of the input matrix or a submatrix thereof to use as the send buffer for MPI send operations is not needed. Instead, by dividing the input matrix in-place into data blocks having up to at most a predetermined size and sending the corresponding data block(s) for a given submatrix using an MPI API before receiving any data block(s) for the given submatrix using an MPI API in the place of the sent data block(s), making the in-memory copy to use a send buffer can be avoided and yet the input matrix can be transposed in-place.

2.

Granted invention patent
Fast block-based parallel message passing interface transpose (In force)

Publication number: US11836549B2

Publication date: 2023-12-05

Application number: US17071876

Filing date: 2020-10-15

Applicant: Advanced Micro Devices, Inc.

Inventor: Samantray Biplab Raut

IPC: G06F9/54, G06F9/52

CPC classification number: G06F9/546, G06F9/52

Abstract: Computer-implemented techniques for fast block-based parallel message passing interface (MPI) transpose are disclosed. The techniques achieve an in-place parallel matrix transpose of an input matrix in a distributed-memory multiprocessor environment with reduced consumption of computer processing time and storage media resources. An in-memory copy of the input matrix or a submatrix thereof to use as the send buffer for MPI send operations is not needed. Instead, by dividing the input matrix in-place into data blocks having up to at most a predetermined size and sending the corresponding data block(s) for a given submatrix using an MPI API before receiving any data block(s) for the given submatrix using an MPI API in the place of the sent data block(s), making the in-memory copy to use a send buffer can be avoided and yet the input matrix can be transposed in-place.
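
Illustrative sketch (not taken from the patent): the send-before-receive idea in the abstract can be pictured with a small single-process simulation. Everything here is an assumption made for illustration: the function name simulated_block_transpose, the row-slab layout with one slab per simulated rank, the mailbox dictionary standing in for the MPI transport, and the requirement that the rank count divides the matrix size. It uses no real MPI calls and should not be read as the claimed method.

```python
from collections import deque


def simulated_block_transpose(slabs, n, max_block):
    """Transpose an n x n matrix in place across simulated ranks.

    slabs[r] holds the rows owned by hypothetical rank r as a list of
    row lists. Assumes len(slabs) divides n and max_block >= 1.
    """
    p = len(slabs)
    b = n // p  # rows per rank, and the side length of each tile
    mailbox = {}  # (src, dst) -> queue of in-flight blocks (stand-in for MPI)

    def send(src, dst, block):
        # The copy models the transport taking ownership of the data,
        # so the sender may overwrite the block once send returns.
        mailbox.setdefault((src, dst), deque()).append(list(block))

    def recv(src, dst):
        return mailbox[(src, dst)].popleft()

    # Step 1: each pair of ranks swaps its off-diagonal tiles block by block.
    for r in range(p):
        for c in range(r + 1, p):
            for i in range(b):
                for j0 in range(0, b, max_block):
                    j1 = min(j0 + max_block, b)  # block has at most max_block items
                    r_lo, r_hi = c * b + j0, c * b + j1  # block position on rank r
                    c_lo, c_hi = r * b + j0, r * b + j1  # block position on rank c
                    # Both ranks send their block first...
                    send(r, c, slabs[r][i][r_lo:r_hi])
                    send(c, r, slabs[c][i][c_lo:c_hi])
                    # ...then receive into the place of the block just sent.
                    slabs[r][i][r_lo:r_hi] = recv(c, r)
                    slabs[c][i][c_lo:c_hi] = recv(r, c)

    # Step 2: every rank transposes each b x b tile of its slab locally.
    for r in range(p):
        rows = slabs[r]
        for t in range(p):
            off = t * b
            for i in range(b):
                for j in range(i + 1, b):
                    rows[i][off + j], rows[j][off + i] = (
                        rows[j][off + i],
                        rows[i][off + j],
                    )
    return slabs
```

In this sketch at most one block per rank is in flight at any time, so no rank needs a second copy of a whole submatrix. A real distributed version would replace the mailbox with actual MPI send and receive calls and would have to account for message buffering and ordering, which this simulation ignores.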
