-
Publication Number: US11726857B2
Publication Date: 2023-08-15
Application Number: US17374592
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
CPC classification number: G06F11/079 , G06F7/5443 , G06F11/0724 , G06F11/0751 , G06N3/065
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
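The scanning scheme in this abstract can be illustrated with a small behavioral sketch. This is not the patented circuit; the MAC operation, the stuck-at fault model, and all names are illustrative assumptions. The idea shown is only that one spare homogeneous unit shadows one unit of the set per cycle, recomputes its result, flags a mismatch as a functional fault, and advances to the next unit in sequence.

```python
# Hedged sketch (illustrative, not NVIDIA's circuit): a set of homogeneous
# MAC units plus one spare "fault scanner" unit of the same type.

class MacUnit:
    """A homogeneous processing unit; here, a multiply-accumulate unit."""
    def __init__(self, stuck_fault=False):
        self.stuck_fault = stuck_fault  # assumed fault model: stuck-at-1 on LSB

    def mac(self, a, b, acc=0):
        out = acc + a * b
        return out | 1 if self.stuck_fault else out


class FaultScanner:
    """Spare homogeneous unit that scans the set, one unit per cycle."""
    def __init__(self, units):
        self.units = units
        self.spare = MacUnit()  # known-good reference unit
        self.idx = 0            # which unit is under scan this cycle

    def scan_step(self, a, b, acc=0):
        i = self.idx
        # The scanner redundantly computes the same operation and compares.
        faulty = self.units[i].mac(a, b, acc) != self.spare.mac(a, b, acc)
        self.idx = (i + 1) % len(self.units)  # advance in sequence
        return i, faulty
```

Because the scanner is itself a copy of the homogeneous unit, it adds only one extra unit of area regardless of how many units it checks, at the cost of detecting a fault only when the rotation reaches the faulty unit.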
-
Publication Number: US11983566B2
Publication Date: 2024-05-14
Application Number: US17374361
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
CPC classification number: G06F9/5027 , G06F9/4881 , G06N3/063
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit, along with dependency information that indicates a layer dependency condition for that layer. When the scheduler circuit determines, using the state information and the dependency information, that the layer dependency condition is satisfied, it enables the respective fixed-function circuit to process the current layer.
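The dependency-gated dispatch described above can be sketched in software. This is a minimal model under stated assumptions, not the patented scheduler: the engine names, the `Layer` record, and the "state information" being a simple set of completed layers are all illustrative.

```python
# Hedged sketch: a scheduler fires a layer on its fixed-function engine
# only once every producer layer it depends on has completed.
from collections import deque


class Layer:
    def __init__(self, name, engine, deps):
        self.name = name        # layer identifier
        self.engine = engine    # which fixed-function circuit runs it
        self.deps = deps        # names of layers that must finish first


def schedule(layers):
    done, order = set(), []     # "state information": completed layers
    pending = deque(layers)
    stalls = 0
    while pending:
        layer = pending.popleft()
        if all(d in done for d in layer.deps):  # layer dependency condition
            order.append((layer.engine, layer.name))  # enable the engine
            done.add(layer.name)
            stalls = 0
        else:
            pending.append(layer)  # re-check once state advances
            stalls += 1
            if stalls > len(pending):  # no layer can make progress
                raise ValueError("cyclic or missing dependency")
    return order
```

In the hardware described by the abstract the "state" would be per-engine status signals rather than a Python set, but the gating logic is the same: a layer is dispatched only when its dependency condition evaluates true.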
-
Publication Number: US20220382592A1
Publication Date: 2022-12-01
Application Number: US17374361
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Geng Chen , Yan Zhou , Qifei Fan , Prashant Gaikwad
Abstract: Apparatuses, systems, and techniques for scheduling deep learning tasks in hardware are described. One accelerator circuit includes multiple fixed-function circuits, each of which processes a different layer type of a neural network. A scheduler circuit receives state information associated with a respective layer being processed by a respective fixed-function circuit, along with dependency information that indicates a layer dependency condition for that layer. When the scheduler circuit determines, using the state information and the dependency information, that the layer dependency condition is satisfied, it enables the respective fixed-function circuit to process the current layer.
-
Publication Number: US20220374298A1
Publication Date: 2022-11-24
Application Number: US17374592
Filing Date: 2021-07-13
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Shangang Zhang , Yan Zhou , Qifei Fan
Abstract: Apparatuses, systems, and techniques to detect faults in processing pipelines are described. One accelerator circuit includes a fixed-function circuit that performs an operation corresponding to a layer of a neural network. The fixed-function circuit includes a set of homogeneous processing units and a fault scanner circuit. The fault scanner circuit includes an additional homogeneous processing unit to scan each processing unit of the set for functional faults in a sequence.
-
Publication Number: US20220413752A1
Publication Date: 2022-12-29
Application Number: US17446257
Filing Date: 2021-08-27
Applicant: NVIDIA Corporation
Inventor: Yilin Zhang , Yan Zhou , Qifei Fan
Abstract: Techniques for providing an overlap data buffer that stores portions of tiles between passes of chained layers of a neural network are described. One accelerator circuit includes one or more processing units that execute instructions corresponding to the chained layers in multiple passes. In a first pass, a processing unit receives a first input tile of an input feature map from a primary buffer and performs a first operation on it to obtain a first output tile. The processing unit stores the first output tile in the primary buffer, identifies a portion of the first output tile as overlap data shared between tiles of the input feature map, and stores that portion in a secondary buffer. In a second pass, the processing unit retrieves the portion from the secondary buffer, avoiding both refetching the overlapping input and recomputing the overlap data.
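The overlap-buffer idea can be made concrete with a toy example. This is a hedged sketch, not the patented implementation: the 3-tap "layer", the two-element halo, and all names are assumptions. Two chained layers are run tile by tile; the last layer-A outputs of each pass, which the next pass's layer B would otherwise need layer A to recompute from refetched input, are parked in a secondary buffer and reused instead.

```python
# Hedged sketch: chained 1-D layers processed in tiles, with the
# inter-tile overlap of layer A's output kept in a secondary buffer.

def conv3(xs):
    """Toy 3-tap layer: y[i] = x[i] + x[i+1] + x[i+2]."""
    return [xs[i] + xs[i + 1] + xs[i + 2] for i in range(len(xs) - 2)]


def chained_tiles(feature_map, tile_size):
    out = []
    secondary = []  # secondary buffer: overlap data carried between passes
    pos = 0
    while pos < len(feature_map):
        # Layer A: consume one input tile (plus halo) from the primary buffer.
        a_tile = conv3(feature_map[pos:pos + tile_size + 2])
        # Prepend the saved overlap instead of recomputing layer A on it.
        a_full = secondary + a_tile
        # Layer B: consume layer A's output for this pass.
        out += conv3(a_full)
        # Park the last two layer-A outputs: the overlap the next pass needs.
        secondary = a_full[-2:]
        pos += tile_size
    return out
```

The result matches running both layers over the whole feature map at once, while each pass touches only one tile of input plus a fixed-size secondary buffer, which is the memory-traffic saving the abstract describes.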