-
Publication number: US11983566B2
Publication date: 2024-05-14
Application number: US17374361
Filing date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
CPC classification number: G06F9/5027 , G06F9/4881 , G06N3/063
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit and dependency information that indicates a layer dependency condition for that layer. Using the state information and the dependency information, the scheduler circuit determines that the layer dependency condition is satisfied and enables the respective fixed-function circuit to process the current layer.
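The dependency-condition check the abstract describes can be sketched in software: a layer runs on its engine only once every layer it depends on has completed. This is a minimal illustrative model; the `Layer` class, `schedule` function, and engine names are assumptions for the sketch, not taken from the patent:

```python
# Sketch of dependency-tracked layer scheduling across fixed-function
# engines. Names (Layer, schedule) are illustrative, not from the patent.
from collections import deque

class Layer:
    def __init__(self, name, engine, deps):
        self.name = name          # layer identifier
        self.engine = engine      # which fixed-function circuit runs it
        self.deps = set(deps)     # layers that must complete first

def schedule(layers):
    """Return an execution order in which each layer is enabled only
    after its dependency condition is satisfied."""
    done, order = set(), []
    pending = deque(layers)
    while pending:
        layer = pending.popleft()
        if layer.deps <= done:        # dependency condition satisfied
            order.append((layer.engine, layer.name))
            done.add(layer.name)
        else:
            pending.append(layer)     # re-check after state updates
    return order
```

In hardware this check is event-driven rather than a polling loop, but the invariant is the same: an engine is enabled for a layer only when the scheduler has observed completion state for every dependency.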
-
Publication number: US10997492B2
Publication date: 2021-05-04
Application number: US15838273
Filing date: 2017-12-11
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz , Hao Wu , Dilip Sequeira , Ujval Kapasi , Maxim Milakov , Slawomir Kierat , Zacky Zhou , Yilin Zhang , Alex Fit-Florea
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced-precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., the least divergence from the original higher-precision values). The representation can then be used during inference to perform computations on the resulting output data.
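The "candidate conversions scored by divergence" idea can be sketched as threshold calibration: try a grid of saturation thresholds, round-trip the data through an INT8-style quantization at each, and keep the threshold whose quantized distribution diverges least (here, by KL divergence) from the original. The candidate grid, bin counts, and smoothing are illustrative choices for the sketch, not parameters from the patent:

```python
# Sketch of divergence-based calibration for reduced-precision (INT8)
# conversion. All constants here are illustrative assumptions.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy between two (unnormalized) histograms."""
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def calibrate_threshold(activations, num_candidates=32, bins=64):
    """Try candidate saturation thresholds; return the one whose INT8
    round-trip distribution diverges least from the original."""
    x = np.abs(activations).astype(np.float64)
    hi = x.max()
    ref_hist, _ = np.histogram(x, bins=bins, range=(0.0, hi))
    best_t, best_div = hi, float("inf")
    for t in np.linspace(hi / num_candidates, hi, num_candidates):
        scale = t / 127.0
        q = np.clip(np.round(x / scale), 0, 127) * scale  # INT8 round trip
        q_hist, _ = np.histogram(q, bins=bins, range=(0.0, hi))
        div = kl_divergence(ref_hist + 1.0, q_hist + 1.0)  # +1 smoothing
        if div < best_div:
            best_t, best_div = t, div
    return best_t
```

A threshold below the absolute maximum trades saturation of rare outliers for finer resolution over the bulk of the distribution, which is exactly the accuracy trade-off the relative quality measure arbitrates.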
-
Publication number: US11726857B2
Publication date: 2023-08-15
Application number: US17374592
Filing date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
CPC classification number: G06F11/079 , G06F7/5443 , G06F11/0724 , G06F11/0751 , G06N3/065
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
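The scan scheme described above can be modeled simply: one spare unit, identical to the homogeneous units, shadows each unit in turn and flags a mismatch as a fault. This is a behavioral sketch only; the `FaultScanner` class, the `mac` reference operation, and the per-cycle cursor are assumptions for illustration, not structures from the patent:

```python
# Sketch of in-sequence fault scanning with one additional homogeneous
# unit. Names and the MAC example operation are illustrative assumptions.

def mac(a, b, acc=0):
    """Reference multiply-accumulate: the operation every unit performs."""
    return acc + a * b

class FaultScanner:
    def __init__(self, units):
        self.units = units   # list of callables with mac's signature
        self.cursor = 0      # which unit the spare shadows this cycle

    def step(self, a, b):
        """One cycle: all units compute; the spare recomputes the scanned
        unit's result and reports a functional fault on mismatch."""
        outputs = [u(a, b) for u in self.units]
        expected = mac(a, b)                  # spare (known-good) unit
        scanned = self.cursor
        faulty = outputs[scanned] != expected
        self.cursor = (self.cursor + 1) % len(self.units)  # next in sequence
        return outputs, scanned, faulty
```

Because the spare is homogeneous with the scanned units, the scan adds one unit of area rather than duplicating the whole set, at the cost of detection latency of one full sweep.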
-
Publication number: US20220374298A1
Publication date: 2022-11-24
Application number: US17374592
Filing date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
-
Publication number: US20180211152A1
Publication date: 2018-07-26
Application number: US15838273
Filing date: 2017-12-11
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz , Hao Wu , Dilip Sequeira , Ujval Kapasi , Maxim Milakov , Slawomir Kierat , Zacky Zhou , Yilin Zhang , Alex Fit-Florea
CPC classification number: G06N3/04 , G06N3/0454 , G06N3/08 , G06N7/00
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced-precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., the least divergence from the original higher-precision values). The representation can then be used during inference to perform computations on the resulting output data.
-
Publication number: US20220382592A1
Publication date: 2022-12-01
Application number: US17374361
Filing date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit and dependency information that indicates a layer dependency condition for that layer. Using the state information and the dependency information, the scheduler circuit determines that the layer dependency condition is satisfied and enables the respective fixed-function circuit to process the current layer.
-
Publication number: US20210256348A1
Publication date: 2021-08-19
Application number: US17306171
Filing date: 2021-05-03
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz , Hao Wu , Dilip Sequeira , Ujval Kapasi , Maxim Milakov , Slawomir Kierat , Zacky Zhou , Yilin Zhang , Alex Fit-Florea
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced-precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., the least divergence from the original higher-precision values). The representation can then be used during inference to perform computations on the resulting output data.
-
Publication number: US20220413752A1
Publication date: 2022-12-29
Application number: US17446257
Filing date: 2021-08-27
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Yan Zhou , Qifei Fan
Abstract: Techniques for providing an overlap data buffer to store portions of tiles between passes of chained layers of a neural network are described. One accelerator circuit includes one or more processing units to execute instructions corresponding to the chained layers in multiple passes. In a first pass, the processing unit(s) receives a first input tile of an input feature map from a primary buffer and performs a first operation on the first input tile to obtain a first output tile. The processing unit stores the first output tile in the primary buffer and identifies a portion of the first output tile as corresponding to overlap data between tiles of the input feature map. The processing unit stores that portion in a secondary buffer. In a second pass, the processing unit retrieves the portion from the secondary buffer, avoiding both re-fetching the overlapping data and recomputing it.
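The overlap-reuse idea can be sketched in one dimension: a 3-tap filter needs two samples of halo from the previous tile, so each pass stashes its trailing samples in a secondary buffer and the next pass prepends them instead of re-fetching or recomputing. The tile size, halo width, and running-sum filter are illustrative assumptions, not parameters from the patent:

```python
# Sketch of tiled processing with a secondary overlap buffer.
# A 1-D 3-tap "convolution" stands in for the chained-layer operation.

def conv3(window):
    """3-tap filter output for one position (valid mode)."""
    return window[0] + window[1] + window[2]

def process_in_tiles(x, tile_size=4, halo=2):
    """Valid-mode 3-tap filter over x, tile by tile. The trailing `halo`
    samples of each tile go to a secondary buffer and are prepended to
    the next tile, so overlap data is never fetched or computed twice."""
    secondary = []          # overlap buffer carried between passes
    out = []
    for start in range(0, len(x), tile_size):
        tile = secondary + x[start:start + tile_size]  # reuse + primary fetch
        out += [conv3(tile[i:i + 3]) for i in range(len(tile) - 2)]
        secondary = tile[-halo:]   # stash overlap for the next pass
    return out

def conv_reference(x):
    """Untiled reference: same filter over the whole input at once."""
    return [conv3(x[i:i + 3]) for i in range(len(x) - 2)]
```

The tiled result matches the untiled reference exactly; the secondary buffer only changes where the overlap data lives between passes, not what is computed.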
-