-
Publication Number: US11726857B2
Publication Date: 2023-08-15
Application Number: US17374592
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
CPC classification number: G06F11/079 , G06F7/5443 , G06F11/0724 , G06F11/0751 , G06N3/065
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
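The scanning scheme in this abstract can be illustrated with a small behavioral sketch. This is not the patented circuit; the MAC operation, the stuck-at fault model, and all names are illustrative assumptions. The idea shown is only that one spare homogeneous unit shadows one unit of the set per cycle, recomputes its result, flags a mismatch as a functional fault, and advances to the next unit in sequence.

```python
# Hedged sketch (illustrative, not NVIDIA's circuit): a set of homogeneous
# MAC units plus one spare "fault scanner" unit of the same type.

class MacUnit:
    """A homogeneous processing unit; here, a multiply-accumulate unit."""
    def __init__(self, stuck_fault=False):
        self.stuck_fault = stuck_fault  # assumed fault model: stuck-at-1 on LSB

    def mac(self, a, b, acc=0):
        out = acc + a * b
        return out | 1 if self.stuck_fault else out


class FaultScanner:
    """Spare homogeneous unit that scans the set, one unit per cycle."""
    def __init__(self, units):
        self.units = units
        self.spare = MacUnit()  # known-good reference unit
        self.idx = 0            # which unit is under scan this cycle

    def scan_step(self, a, b, acc=0):
        i = self.idx
        # The scanner redundantly computes the same operation and compares.
        faulty = self.units[i].mac(a, b, acc) != self.spare.mac(a, b, acc)
        self.idx = (i + 1) % len(self.units)  # advance in sequence
        return i, faulty
```

Because the scanner is itself a copy of the homogeneous unit, it adds only one extra unit of area regardless of how many units it checks, at the cost of detecting a fault only when the rotation reaches the faulty unit.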
-
Publication Number: US11983566B2
Publication Date: 2024-05-14
Application Number: US17374361
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
CPC classification number: G06F9/5027 , G06F9/4881 , G06N3/063
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit, along with dependency information that indicates a layer dependency condition for that layer. When the scheduler circuit determines, using the state information and the dependency information, that the layer dependency condition is satisfied, it enables the respective fixed-function circuit to process the current layer.
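The dependency-gated dispatch described above can be sketched in software. This is a minimal model under stated assumptions, not the patented scheduler: the engine names, the `Layer` record, and the "state information" being a simple set of completed layers are all illustrative.

```python
# Hedged sketch: a scheduler fires a layer on its fixed-function engine
# only once every producer layer it depends on has completed.
from collections import deque


class Layer:
    def __init__(self, name, engine, deps):
        self.name = name        # layer identifier
        self.engine = engine    # which fixed-function circuit runs it
        self.deps = deps        # names of layers that must finish first


def schedule(layers):
    done, order = set(), []     # "state information": completed layers
    pending = deque(layers)
    stalls = 0
    while pending:
        layer = pending.popleft()
        if all(d in done for d in layer.deps):  # layer dependency condition
            order.append((layer.engine, layer.name))  # enable the engine
            done.add(layer.name)
            stalls = 0
        else:
            pending.append(layer)  # re-check once state advances
            stalls += 1
            if stalls > len(pending):  # no layer can make progress
                raise ValueError("cyclic or missing dependency")
    return order
```

In the hardware described by the abstract the "state" would be per-engine status signals rather than a Python set, but the gating logic is the same: a layer is dispatched only when its dependency condition evaluates true.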
-
Publication Number: US20220382592A1
Publication Date: 2022-12-01
Application Number: US17374361
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit, along with dependency information that indicates a layer dependency condition for that layer. When the scheduler circuit determines, using the state information and the dependency information, that the layer dependency condition is satisfied, it enables the respective fixed-function circuit to process the current layer.
-
Publication Number: US20220374298A1
Publication Date: 2022-11-24
Application Number: US17374592
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
-
Publication Number: US20220413752A1
Publication Date: 2022-12-29
Application Number: US17446257
Filing Date: 2021-08-27
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Yan Zhou , Qifei Fan
Abstract: Techniques for providing an overlap data buffer that stores portions of tiles between passes of chained layers of a neural network are described. One accelerator circuit includes one or more processing units that execute instructions corresponding to the chained layers in multiple passes. In a first pass, a processing unit receives a first input tile of an input feature map from a primary buffer and performs a first operation on it to obtain a first output tile. The processing unit stores the first output tile in the primary buffer, identifies a portion of the first output tile as overlap data shared between tiles of the input feature map, and stores that portion in a secondary buffer. In a second pass, the processing unit retrieves the portion from the secondary buffer, avoiding both refetching the overlapping input and recomputing the overlap data.
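The overlap-buffer idea can be made concrete with a toy example. This is a hedged sketch, not the patented implementation: the 3-tap "layer", the two-element halo, and all names are assumptions. Two chained layers are run tile by tile; the last layer-A outputs of each pass, which the next pass's layer B would otherwise need layer A to recompute from refetched input, are parked in a secondary buffer and reused instead.

```python
# Hedged sketch: chained 1-D layers processed in tiles, with the
# inter-tile overlap of layer A's output kept in a secondary buffer.

def conv3(xs):
    """Toy 3-tap layer: y[i] = x[i] + x[i+1] + x[i+2]."""
    return [xs[i] + xs[i + 1] + xs[i + 2] for i in range(len(xs) - 2)]


def chained_tiles(feature_map, tile_size):
    out = []
    secondary = []  # secondary buffer: overlap data carried between passes
    pos = 0
    while pos < len(feature_map):
        # Layer A: consume one input tile (plus halo) from the primary buffer.
        a_tile = conv3(feature_map[pos:pos + tile_size + 2])
        # Prepend the saved overlap instead of recomputing layer A on it.
        a_full = secondary + a_tile
        # Layer B: consume layer A's output for this pass.
        out += conv3(a_full)
        # Park the last two layer-A outputs: the overlap the next pass needs.
        secondary = a_full[-2:]
        pos += tile_size
    return out
```

The result matches running both layers over the whole feature map at once, while each pass touches only one tile of input plus a fixed-size secondary buffer, which is the memory-traffic saving the abstract describes.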