-
1.
Publication No.: US20220374961A1
Publication date: 2022-11-24
Application No.: US17325116
Filing date: 2021-05-19
Applicant: NVIDIA CORPORATION
Inventor: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
IPC: G06Q30/06, G06F16/901, G06F17/16
Abstract: One embodiment sets forth a technique for performing matrix operations. The technique includes traversing a tree structure to access one or more non-empty regions within a matrix. The tree structure includes a first plurality of nodes and a second plurality of nodes corresponding to non-empty regions in the matrix. The first plurality of nodes includes a first node representing a first region and one or more second nodes that are children of the first node and represent second region(s) with an equal size formed within the first region. The second plurality of nodes includes a third node representing a third region and one or more fourth nodes that are children of the third node and represent fourth region(s) with substantially equal numbers of non-zero matrix values formed within the third region. The technique also includes performing matrix operation(s) based on the non-empty region(s) to generate a matrix operation result.
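The traversal described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Region` class, the `nonempty_leaves` and `spmv` functions, the choice of matrix-vector multiplication as the "matrix operation", and the assumption of a square matrix are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Region:
    # Hypothetical node type: leaves hold (row, col, value) nonzeros,
    # inner nodes hold only their non-empty child regions.
    entries: list = field(default_factory=list)
    children: list = field(default_factory=list)

def nonempty_leaves(node):
    """Depth-first traversal that yields every leaf region of the tree."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from nonempty_leaves(child)

def spmv(root, x):
    """Multiply the matrix stored in the tree by vector x (square matrix assumed)."""
    y = [0.0] * len(x)
    for leaf in nonempty_leaves(root):
        for r, c, v in leaf.entries:
            y[r] += v * x[c]
    return y

# 4x4 example. The top-left quadrant is split into two sub-regions holding
# two nonzeros each; the bottom-right quadrant is a single leaf. The two
# empty quadrants have no nodes at all, so the traversal never visits them.
top_left = Region(children=[
    Region(entries=[(0, 0, 1.0), (0, 1, 2.0)]),
    Region(entries=[(1, 0, 3.0), (1, 1, 4.0)]),
])
bottom_right = Region(entries=[(3, 3, 5.0)])
root = Region(children=[top_left, bottom_right])

result = spmv(root, [1.0, 1.0, 1.0, 1.0])
```

Because empty regions are simply absent from the tree, the work done by `spmv` in this sketch scales with the number of stored nonzeros rather than with the full matrix size.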
-
2.
Publication No.: US11709812B2
Publication date: 2023-07-25
Application No.: US17325133
Filing date: 2021-05-19
Applicant: NVIDIA CORPORATION
Inventor: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
IPC: G06F16/22, G06F17/16, G06F18/2134
CPC classification number: G06F16/2237, G06F16/2246, G06F16/2272, G06F17/16, G06F18/21345
Abstract: One embodiment sets forth a technique for generating a tree structure within a computer memory for storing sparse data. The technique includes dividing a matrix into a first plurality of equally sized regions. The technique also includes dividing at least one region in the first plurality of regions into a second plurality of regions, where the second plurality of regions includes a first region and one or more second regions that have a substantially equal number of nonzero matrix values and are formed within the first region. The technique further includes creating the tree structure within the computer memory by generating a first plurality of nodes representing the first plurality of regions, generating a second plurality of nodes representing the second plurality of regions, and grouping, under a first node representing the first region, one or more second nodes representing the one or more second regions.
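The two-stage division in this abstract can be sketched as follows. This is an illustrative reading, not the claimed method: the `Node` class, `build_tree`, `split_balanced`, the `leaf_size` threshold, the use of four quadrants as the equally sized regions, and the row-median cut used to balance nonzero counts are all assumptions introduced here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Half-open bounds [r0, r1) x [c0, c1) of the region this node covers.
    r0: int
    r1: int
    c0: int
    c1: int
    kind: str  # "uniform", "balanced", or "leaf" (hypothetical labels)
    entries: list = field(default_factory=list)
    children: list = field(default_factory=list)

def split_balanced(entries, r0, r1, c0, c1, leaf_size):
    """Split a region at the median nonzero row so both halves hold
    roughly the same number of nonzeros."""
    leaf = Node(r0, r1, c0, c1, "leaf", entries=sorted(entries))
    if len(entries) <= leaf_size:
        return leaf
    rows = sorted(e[0] for e in entries)
    cut = rows[len(rows) // 2]
    top = [e for e in entries if e[0] < cut]
    bottom = [e for e in entries if e[0] >= cut]
    if not top:  # every nonzero sits at or below the cut row: stop splitting
        return leaf
    return Node(r0, r1, c0, c1, "balanced", children=[
        split_balanced(top, r0, cut, c0, c1, leaf_size),
        split_balanced(bottom, cut, r1, c0, c1, leaf_size),
    ])

def build_tree(entries, n, leaf_size=2):
    """entries: (row, col, value) nonzeros of an n x n matrix."""
    root = Node(0, n, 0, n, "uniform")
    half = n // 2
    # Stage 1: four equally sized quadrants; empty ones get no node.
    for r0, r1 in ((0, half), (half, n)):
        for c0, c1 in ((0, half), (half, n)):
            inside = [e for e in entries
                      if r0 <= e[0] < r1 and c0 <= e[1] < c1]
            if inside:
                # Stage 2: balance nonzero counts inside the quadrant.
                root.children.append(
                    split_balanced(inside, r0, r1, c0, c1, leaf_size))
    return root

nonzeros = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0), (3, 3, 5.0)]
tree = build_tree(nonzeros, 4)
```

In this example the dense top-left quadrant becomes a "balanced" node with two leaves of two nonzeros each, while the bottom-right quadrant, holding a single nonzero, stays a leaf.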
-
3.
Publication No.: US12211080B2
Publication date: 2025-01-28
Application No.: US17325116
Filing date: 2021-05-19
Applicant: NVIDIA CORPORATION
Inventor: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
IPC: G06F16/901, G06F17/16, G06Q30/0601
Abstract: One embodiment sets forth a technique for performing matrix operations. The technique includes traversing a tree structure to access one or more non-empty regions within a matrix. The tree structure includes a first plurality of nodes and a second plurality of nodes corresponding to non-empty regions in the matrix. The first plurality of nodes includes a first node representing a first region and one or more second nodes that are children of the first node and represent second region(s) with an equal size formed within the first region. The second plurality of nodes includes a third node representing a third region and one or more fourth nodes that are children of the third node and represent fourth region(s) with substantially equal numbers of non-zero matrix values formed within the third region. The technique also includes performing matrix operation(s) based on the non-empty region(s) to generate a matrix operation result.
-
4.
Publication No.: US12141229B2
Publication date: 2024-11-12
Application No.: US17325120
Filing date: 2021-05-19
Applicant: NVIDIA CORPORATION
Inventor: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
IPC: G06F17/16, G06F9/50, G06F16/901
Abstract: One embodiment sets forth a technique for performing one or more matrix multiplication operations based on a first matrix and a second matrix. The technique includes receiving data associated with the first matrix from a first traversal engine that accesses nonzero elements included in the first matrix via a first tree structure. The technique also includes performing one or more computations on the data associated with the first matrix and the data associated with the second matrix to produce a plurality of partial results. The technique further includes combining the plurality of partial results into one or more intermediate results and storing the one or more intermediate results in a first buffer memory.
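The data flow in this abstract (nonzeros streamed from a traversal, partial products, accumulation in a buffer) can be sketched in software. The abstract describes hardware-style components, so everything below is a loose analogy under stated assumptions: `traverse_nonzeros` stands in for the traversal engine, nested Python lists stand in for the tree structure, a dictionary stands in for the buffer memory, and the row-indexed layout of the second matrix is hypothetical.

```python
from collections import defaultdict

def traverse_nonzeros(tree):
    """Stand-in for a traversal engine: walks nested lists and yields
    the (row, col, value) tuples stored at the leaves."""
    for item in tree:
        if isinstance(item, list):
            yield from traverse_nonzeros(item)
        else:
            yield item

def multiply(a_tree, b_rows):
    """Compute A x B, where A is streamed from a tree and B is given as
    {row: {col: value}} (an assumed layout, not taken from the source)."""
    buffer = defaultdict(float)  # stands in for the buffer of intermediate results
    for i, k, a in traverse_nonzeros(a_tree):
        for j, b in b_rows.get(k, {}).items():
            partial = a * b            # one partial result
            buffer[(i, j)] += partial  # combine partials that share an output cell
    return dict(buffer)

# A = [[1, 2], [0, 3]] stored as a two-level tree of nonzeros.
a_tree = [[(0, 0, 1.0), (0, 1, 2.0)], [(1, 1, 3.0)]]
# B = [[4, 0], [5, 6]] stored row by row, nonzeros only.
b_rows = {0: {0: 4.0}, 1: {0: 5.0, 1: 6.0}}

product = multiply(a_tree, b_rows)
```

Keying the buffer by output coordinate is one simple way to merge partial results; a hardware design would likely use a fixed-size accumulator instead, which this sketch does not attempt to model.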
-