Patent search ap:("NVIDIA Corporation") AND inv:"Ming Y. Siu" Page 1

1.

Invention publication
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION [Under examination, published]

Publication (announcement) number: US20240248718A1

Publication (announcement) date: 2024-07-25

Application number: US18625903

Application date: 2024-04-03

Applicant: NVIDIA Corporation

Inventor: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman

IPC: G06F9/30, G06N20/00

CPC classification number: G06F9/30043, G06F9/3001, G06F9/30021, G06F9/30094, G06F9/30145, G06F9/30098, G06N20/00

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
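The idea in this abstract can be illustrated in software with a minimal sketch. It is not the patented circuit: the function names `load_with_inspection` and `inspected_dot`, the list standing in for memory, and the use of the indication to skip multiplications are all assumptions made for illustration. The sketch only shows a load that returns, alongside the data, a flag saying whether the loaded value exceeds a threshold.

```python
from typing import List, Tuple


def load_with_inspection(memory: List[float], address: int,
                         threshold: float) -> Tuple[float, bool]:
    """Hypothetical load: return the value and whether it exceeds the threshold."""
    value = memory[address]
    exceeds = value > threshold  # the stored "indication" from the abstract
    return value, exceeds


def inspected_dot(weights: List[float], activations: List[float],
                  threshold: float = 0.0) -> Tuple[float, int]:
    """Dot product that skips terms whose activation does not exceed the threshold.

    Skipping work this way is an assumed use of the indication, not a claim
    about what the patent does with it.
    """
    total = 0.0
    skipped = 0
    for i in range(len(activations)):
        act, significant = load_with_inspection(activations, i, threshold)
        if not significant:
            skipped += 1  # no multiply issued for this element
            continue
        total += weights[i] * act
    return total, skipped


result = inspected_dot([1.0, 0.5, 4.0, 2.0], [0.0, 2.0, 0.0, 3.0])
```

Here `result` holds the accumulated sum together with the number of elements that were skipped because their loaded value did not exceed the threshold.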

2.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations [In force]

Publication (announcement) number: US11797302B2

Publication (announcement) date: 2023-10-24

Application number: US17351161

Application date: 2021-06-17

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
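The three dot-product steps named in this abstract (generate partial products, align them by exponent, accumulate) can be sketched in software as follows. This is an illustration under stated assumptions, not the patented datapath: the mantissa width `MANT_BITS`, the helper `decompose`, and the choice to align every partial product to the smallest exponent (so that integer accumulation stays exact) are all hypothetical.

```python
import math
from typing import List, Tuple

MANT_BITS = 11  # hypothetical mantissa width, chosen only for the example


def decompose(x: float) -> Tuple[int, int]:
    """Split x into an integer mantissa and exponent with x ~= mant * 2**exp."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp, with 0.5 <= |frac| < 1
    mant = int(round(frac * (1 << MANT_BITS)))
    return mant, exp - MANT_BITS


def aligned_dot(a: List[float], b: List[float]) -> float:
    # Step 1: one partial product per element pair
    # (mantissas multiply, exponents add).
    partials = []
    for x, y in zip(a, b):
        mx, ex = decompose(x)
        my, ey = decompose(y)
        partials.append((mx * my, ex + ey))
    if not partials:
        return 0.0

    # Step 2: align all partial products to a common exponent.
    # Assumption: align to the smallest exponent so no bits are dropped.
    common = min(e for _, e in partials)
    aligned = [m << (e - common) for m, e in partials]

    # Step 3: accumulate the aligned partial products with integer adds.
    acc = sum(aligned)
    return math.ldexp(acc, common)


value = aligned_dot([1.5, -2.0, 0.25], [4.0, 0.5, 8.0])
```

Because every input in the example is exactly representable in `MANT_BITS` bits, the integer accumulation reproduces the ordinary dot product; inputs that need more mantissa bits would be rounded by `decompose` first.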

3.

Granted invention patent
Decompression techniques for processing compressed data suitable for artificial neural networks [In force]

Publication (announcement) number: US11379420B2

Publication (announcement) date: 2022-07-05

Application number: US16359787

Application date: 2019-03-20

Applicant: NVIDIA Corporation

Inventor: Jorge Albericio Latorre, Jack H. Choquette, Manan Maheshkumar Patel, Jeffrey Pool, Ming Y. Siu, Ronny Meir Krashinsky, Ganesh Venkatesh

IPC: G06F16/174, G06F16/901, G06N3/08, H03M7/30, G06F16/14

Abstract: Compressed data is oftentimes beneficial for reducing the computing resources required, for example, to transmit and store data. The compression of data is particularly useful when dealing with sparse data (data that includes numerous zeros or near-zero values) and only non-zero values above a certain threshold have significance. When dealing with compressed data, oftentimes the data needs to be decompressed for processing (e.g., by deep learning networks or other applications configured to operate on sparse, or other uncompressed data). Instructions are disclosed for supporting the decompression of compressed data by a processing unit such as a CPU and GPU.
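To make the sparse-data case concrete, here is a minimal sketch of one common way to compress and decompress sparse vectors. The abstract does not specify a storage format, so the bitmask-plus-values layout and the function names `compress` and `decompress` are assumptions for illustration only, not the instructions disclosed in the patent.

```python
from typing import List, Tuple


def compress(dense: List[float],
             threshold: float = 0.0) -> Tuple[List[float], int, int]:
    """Keep only values whose magnitude exceeds the threshold.

    Returns (values, mask, length), where bit i of mask is set when
    element i was kept. This layout is a hypothetical example format.
    """
    values = []
    mask = 0
    for i, x in enumerate(dense):
        if abs(x) > threshold:
            values.append(x)
            mask |= 1 << i
    return values, mask, len(dense)


def decompress(values: List[float], mask: int, length: int) -> List[float]:
    """Expand (values, mask) back to a dense list, filling zeros elsewhere."""
    dense = [0.0] * length
    kept = iter(values)
    for i in range(length):
        if (mask >> i) & 1:
            dense[i] = next(kept)
    return dense


original = [0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 0.0, 3.0]
values, mask, length = compress(original)
restored = decompress(values, mask, length)
```

With a threshold of zero the round trip is lossless; a larger threshold would drop near-zero values, which matches the abstract's remark that only values above a certain threshold have significance.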

4.

Granted invention patent
Providing hints to an execution unit to prepare for predicted subsequent arithmetic operations [In force]

Publication (announcement) number: US11150721B2

Publication (announcement) date: 2021-10-19

Application number: US13671485

Application date: 2012-11-07

Applicant: NVIDIA Corporation

Inventor: David Conrad Tannenbaum, Ming Y. Siu, Stuart F Oberman, Colin Sprinkle, Srinivasan Iyer, Ian Chi Yan Kwong

IPC: G06F1/3287, G06F9/38, G06F8/41

Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.

5.

Invention application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS [In force]

Publication (announcement) number: US20210311733A1

Publication (announcement) date: 2021-10-07

Application number: US17351161

Application date: 2021-06-17

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

6.

Invention application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS [In force]

Publication (announcement) number: US20210303302A1

Publication (announcement) date: 2021-09-30

Application number: US17141082

Application date: 2021-01-04

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

7.

Invention application
GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS [Under examination, published]

Publication (announcement) number: US20190324747A1

Publication (announcement) date: 2019-10-24

Application number: US16459191

Application date: 2019-07-01

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

8.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations [In force]

Publication (announcement) number: US11816481B2

Publication (announcement) date: 2023-11-14

Application number: US17890540

Application date: 2022-08-18

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06F9/38, G06T1/20

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

9.

Granted invention patent
Generalized acceleration of matrix multiply accumulate operations [In force]

Publication (announcement) number: US11797301B2

Publication (announcement) date: 2023-10-24

Application number: US17141082

Application date: 2021-01-04

Applicant: NVIDIA Corporation

Inventor: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

IPC: G06F9/30, G06T1/20, G06F9/38

CPC classification number: G06F9/30014, G06F9/3001, G06F9/3012, G06F9/30036, G06F9/3851, G06T1/20

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

10.

Invention publication
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION [Under examination, published]

Publication (announcement) number: US20230221957A1

Publication (announcement) date: 2023-07-13

Application number: US18112923

Application date: 2023-02-22

Applicant: NVIDIA Corporation

Inventor: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman

IPC: G06F9/30

CPC classification number: G06F9/30043, G06F9/30021, G06F9/30145

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
