Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Richard John Heaton"

11.

Granted invention patent
Reducing computation in neural networks using self-modifying code [Active]

Publication (announcement) No.: US12073199B2

Publication (announcement) date: 2024-08-27

Application No.: US16433786

Application date: 2019-06-06

Applicant: Amazon Technologies, Inc.

Inventor: Vignesh Vivekraja, Randy Renfu Huang, Yu Zhou, Ron Diamant, Richard John Heaton

IPC: G06N3/10, G06F8/41, G06N3/04

CPC classification number: G06F8/4441, G06N3/04, G06N3/10

Abstract: In various implementations, provided are systems and methods for reducing neural network processing. A compiler may generate instructions from source code for a neural network having a repeatable set of operations. The instructions may include a plurality of blocks. The compiler may add an overwrite instruction to the plurality of blocks that, when executed by one or more execution engines, triggers an overwrite action. The overwrite action causes the instructions of subsequent blocks to be overwritten with NOP instructions. The overwrite action is triggered only when a condition is satisfied.
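The overwrite mechanism in this abstract can be illustrated with a toy interpreter. This is a minimal sketch, not the patented implementation: the instruction names `ADD1` and `OVERWRITE_IF_DONE`, the `threshold` condition, and the `execute` function are all hypothetical stand-ins for the repeatable operations, the overwrite instruction, and the trigger condition.

```python
NOP = "NOP"

def execute(program, threshold):
    """Run blocks in order; return (final value, number of non-NOP instructions run)."""
    program = [list(block) for block in program]  # copy, because the program modifies itself
    value, executed = 0, 0
    for index, block in enumerate(program):
        for instr in block:
            if instr == NOP:
                continue  # overwritten instructions cost no computation
            executed += 1
            if instr == "ADD1":
                value += 1  # stand-in for one iteration of the repeatable operations
            elif instr == "OVERWRITE_IF_DONE" and value >= threshold:
                # Overwrite action: replace every instruction of the later blocks with NOPs.
                for later in program[index + 1:]:
                    later[:] = [NOP] * len(later)
    return value, executed

# Five identical blocks, each ending with the (hypothetical) overwrite instruction.
program = [["ADD1", "OVERWRITE_IF_DONE"] for _ in range(5)]
early_exit = execute(program, threshold=2)
full_run = execute(program, threshold=10)
```

With `threshold=2` the condition is met in the second block, so the remaining three blocks run as NOPs; with `threshold=10` the condition is never met and all ten instructions execute.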

12.

Granted invention patent
Dynamic processing element array expansion [Active]

Publication (announcement) No.: US11868895B2

Publication (announcement) date: 2024-01-09

Application No.: US18154576

Application date: 2023-01-13

Applicant: Amazon Technologies, Inc.

Inventor: Randy Renfu Huang, Ron Diamant, Richard John Heaton

IPC: G06E1/00, G06E3/00, G06T7/00, G06N3/08, G06N3/04

CPC classification number: G06N3/08, G06N3/04

Abstract: A computer-implemented method includes receiving a neural network model that includes a tensor operation, dividing the tensor operation into a set of sub-operations, and generating instructions for performing a plurality of sub-operations of the set of sub-operations on respective computing engines of a plurality of computing engines on a same integrated circuit device or on different integrated circuit devices. Each sub-operation of the set of sub-operations generates a portion of a final output of the tensor operation. An inference is made based on a result of a sub-operation of the plurality of sub-operations, or based on results of the plurality of sub-operations.
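As a rough illustration of dividing a tensor operation so that each sub-operation produces a portion of the final output, the sketch below splits a matrix multiplication by output columns. The column-wise split, the helper names, and the sequential loop standing in for separate computing engines are assumptions made for the example; the abstract does not specify how the operation is divided.

```python
def matmul(a, b):
    """Plain matrix multiplication on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(b, parts):
    """Split matrix b into `parts` contiguous column slices."""
    n = len(b[0])
    bounds = [n * i // parts for i in range(parts + 1)]
    return [[row[lo:hi] for row in b] for lo, hi in zip(bounds, bounds[1:])]

def run_on_engines(a, b, engines):
    """Each (simulated) engine computes one column slice of the output."""
    partials = [matmul(a, piece) for piece in split_columns(b, engines)]
    # Concatenate the per-engine slices row by row to rebuild the full output.
    return [sum((p[r] for p in partials), []) for r in range(len(a))]

a = [[1, 2], [3, 4]]
b = [[1, 0, 2, 1], [0, 1, 1, 3]]
combined = run_on_engines(a, b, engines=2)
```

Because the column slices are independent, `combined` equals the result of the undivided `matmul(a, b)`.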

13.

Granted invention patent
Neural network processing based on subgraph recognition [Active]

Publication (announcement) No.: US11714992B1

Publication (announcement) date: 2023-08-01

Application No.: US16219760

Application date: 2018-12-13

Applicant: Amazon Technologies, Inc.

Inventor: Richard John Heaton, Randy Renfu Huang, Ron Diamant

IPC: G06F16/00, G06N3/04, G06F9/30, G06F16/901, G06F9/48

CPC classification number: G06N3/04, G06F9/4881, G06F9/30003, G06F16/9024

Abstract: Systems and methods for providing executable instructions to a neural network processor are provided. In one example, a system comprises a database that stores a plurality of executable instructions and a plurality of subgraph identifiers, each subgraph identifier of the plurality of subgraph identifiers being associated with a subset of instructions of the plurality of executable instructions. The system further includes a compiler configured to: identify a computational subgraph from a computational graph of a neural network model; compute a subgraph identifier for the computational subgraph; based on whether the subgraph identifier is included in the plurality of subgraph identifiers, either obtain, from the database, first instructions associated with the subgraph identifier, or generate second instructions representing the computational subgraph; and provide the first instructions or the second instructions for execution by a neural network processor to perform computation operations for the neural network model.
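The lookup-or-compile flow in this abstract resembles a keyed instruction cache. The following is a minimal sketch under several assumptions: the identifier is computed as a SHA-256 hash of the operator names, the database is a plain dictionary, newly generated instructions are stored back into it, and `fake_compile` is a hypothetical placeholder for a real compiler. None of these details come from the abstract.

```python
import hashlib

def subgraph_id(ops):
    """Hypothetical identifier: SHA-256 over the ordered operator names."""
    return hashlib.sha256("|".join(ops).encode()).hexdigest()

def instructions_for(ops, database, compile_fn):
    """Return (instructions, origin) for a subgraph, reusing stored instructions when possible."""
    key = subgraph_id(ops)
    if key in database:
        return database[key], "database"  # first instructions: fetched from the database
    instrs = compile_fn(ops)              # second instructions: generated by the compiler
    database[key] = instrs                # assumption: the result is stored for later reuse
    return instrs, "compiled"

compile_calls = []

def fake_compile(ops):
    """Placeholder compiler that records how often it is invoked."""
    compile_calls.append(tuple(ops))
    return [f"EXEC {op}" for op in ops]

db = {}
first = instructions_for(["conv", "relu"], db, fake_compile)
second = instructions_for(["conv", "relu"], db, fake_compile)
```

The second request for the same subgraph is served from `db`, so `fake_compile` runs only once.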

14.

Granted invention patent
Low latency neural network model loading [Active]

Publication (announcement) No.: US11182314B1

Publication (announcement) date: 2021-11-23

Application No.: US16698761

Application date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Drazen Borkovic, Ilya Minkin, Vignesh Vivekraja, Richard John Heaton, Randy Renfu Huang

IPC: G06F13/20, G06F13/10, G06N3/04

Abstract: An integrated circuit device implementing a neural network accelerator may have a peripheral bus interface to interface with a host memory, and neural network models can be loaded from the host memory onto the state buffer of the neural network accelerator for execution by the array of processing elements. The neural network accelerator may also have a memory interface to interface with a local memory. The local memory may store neural network models from the host memory, and the models can be loaded from the local memory into the state buffer with reduced latency as compared to loading from the host memory. In systems with multiple accelerators, the models in the local memory can also be shared amongst different accelerators.
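The two loading paths described here (host memory versus local memory) can be modeled as a small tiered lookup. This is only a software analogy: the `Accelerator` class, the dictionaries standing in for the memories, and the returned source labels are hypothetical, and real latency is not simulated.

```python
class Accelerator:
    """Toy model of an accelerator with a state buffer and access to two memories."""

    def __init__(self, host_memory, local_memory):
        self.host_memory = host_memory    # reached over the peripheral bus (slower path)
        self.local_memory = local_memory  # reached over the memory interface; may be shared
        self.state_buffer = None

    def load(self, name):
        """Load a model into the state buffer and report which memory supplied it."""
        if name in self.local_memory:
            source = "local"
        else:
            # First use: copy the model from host memory into local memory.
            self.local_memory[name] = self.host_memory[name]
            source = "host"
        self.state_buffer = self.local_memory[name]
        return source

host = {"model_a": b"weights"}
shared_local = {}
acc0 = Accelerator(host, shared_local)
acc1 = Accelerator(host, shared_local)
sources = [acc0.load("model_a"), acc1.load("model_a")]
```

After `acc0` pulls the model from host memory, `acc1` finds it in the shared local memory and avoids the slower path.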

15.

Granted invention patent
Hierarchical partitioning of operators [Active]

Publication (announcement) No.: US12182688B2

Publication (announcement) date: 2024-12-31

Application No.: US16698236

Application date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang

IPC: G06N3/063, G06N3/04

Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
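The three-way placement described in this abstract can be sketched as a fallback chain. The operator names and the two support sets below are invented for illustration, and a real partitioner would work on a graph rather than a flat list.

```python
def partition_operators(operators, accelerator_ops, host_ops):
    """Assign each operator to the first tier that supports it."""
    placement = []
    for op in operators:
        if op in accelerator_ops:
            target = "accelerator"  # compiled for the acceleration engine
        elif op in host_ops:
            target = "host"         # compiled for the host processor
        else:
            target = "framework"    # left to the machine learning framework
        placement.append((op, target))
    return placement

# Hypothetical support sets; "my_custom_op" represents a newly developed framework operator.
plan = partition_operators(
    ["conv2d", "topk", "my_custom_op"],
    accelerator_ops={"conv2d", "relu"},
    host_ops={"topk"},
)
```

An operator that neither compiler supports still runs, because it falls through to the framework tier.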

16.

Published invention application
Neural network training in a distributed system [Under examination, published]

Publication (announcement) No.: US20240232630A1

Publication (announcement) date: 2024-07-11

Application No.: US18221454

Application date: 2023-07-13

Applicant: Amazon Technologies, Inc.

Inventor: Vignesh Vivekraja, Thiam Khean Hah, Randy Renfu Huang, Ron Diamant, Richard John Heaton

IPC: G06N3/084, G06N3/045, G06N3/063, G06N3/10

CPC classification number: G06N3/084, G06N3/045, G06N3/063, G06N3/10

Abstract: Methods and systems for performing a training operation of a neural network are provided. In one example, a method comprises: performing backward propagation computations for a second layer of a neural network to generate second weight gradients; splitting the second weight gradients into portions; causing a hardware interface to exchange a first portion of the second weight gradients with the second computer system; performing backward propagation computations for a first layer of the neural network to generate first weight gradients when the exchange of the first portion of the second weight gradients is underway, the first layer being a lower layer than the second layer in the neural network; causing the hardware interface to transmit the first weight gradients to the second computer system; and causing the hardware interface to transmit the remaining portions of the second weight gradients to the second computer system.
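The transmission order described in this abstract can be written down as a simple schedule. The sketch below only captures the ordering of what is sent; it does not model the overlap between communication and computation, and the function names and the even split are assumptions.

```python
def split(gradients, parts):
    """Split a gradient list into `parts` contiguous portions."""
    n = len(gradients)
    bounds = [n * i // parts for i in range(parts + 1)]
    return [gradients[lo:hi] for lo, hi in zip(bounds, bounds[1:])]

def transmission_order(layer2_grads, layer1_grads, parts):
    """Order in which gradients go to the other computer system."""
    portions = split(layer2_grads, parts)
    order = [("layer2", portions[0])]      # exchanged while layer 1 backward propagation runs
    order.append(("layer1", layer1_grads)) # lower-layer gradients go next
    order.extend(("layer2", p) for p in portions[1:])  # remaining layer 2 portions last
    return order

order = transmission_order([1, 2, 3, 4], [9, 9], parts=2)
```

Sending only the first portion of the second layer's gradients up front lets the first layer's gradients, which the next forward pass needs earliest, follow without waiting for the whole second-layer exchange.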

17.

Granted invention patent
Neural network training in a distributed system [Active]

Publication (announcement) No.: US11941528B2

Publication (announcement) date: 2024-03-26

Application No.: US16588603

Application date: 2019-09-30

Applicant: Amazon Technologies, Inc.

Inventor: Vignesh Vivekraja, Thiam Khean Hah, Randy Renfu Huang, Ron Diamant, Richard John Heaton

IPC: G06N3/084, G06N3/063, G06N3/045, G06N3/10

CPC classification number: G06N3/084, G06N3/045, G06N3/063, G06N3/10

Abstract: Methods and systems for performing a training operation of a neural network are provided. In one example, a method comprises: performing backward propagation computations for a second layer of a neural network to generate second weight gradients; splitting the second weight gradients into portions; causing a hardware interface to exchange a first portion of the second weight gradients with the second computer system; performing backward propagation computations for a first layer of the neural network to generate first weight gradients when the exchange of the first portion of the second weight gradients is underway, the first layer being a lower layer than the second layer in the neural network; causing the hardware interface to transmit the first weight gradients to the second computer system; and causing the hardware interface to transmit the remaining portions of the second weight gradients to the second computer system.

18.

Published invention application
Improper neural network input detection and handling [Under examination, published]

Publication (announcement) No.: US20240020514A1

Publication (announcement) date: 2024-01-18

Application No.: US18143970

Application date: 2023-05-05

Applicant: Amazon Technologies, Inc.

Inventor: Randy Renfu Huang, Richard John Heaton, Andrea Olgiati, Ron Diamant

IPC: G06N3/045, G06N3/04, G06N3/08, G06F18/214

CPC classification number: G06N3/045, G06N3/04, G06N3/08, G06F18/214

Abstract: Systems and methods for performing improper input data detection are described. In one example, a system comprises: hardware circuits configured to receive input data and to perform computations of a neural network based on the input data to generate computation outputs; and an improper input detection circuit configured to: determine a relationship between the computation outputs of the hardware circuits and reference outputs; determine that the input data are improper based on the relationship; and perform an action based on determining that the input data are improper.
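One way to picture the detection step is a distance check between computation outputs and reference outputs. The abstract does not say which relationship is used, so the Euclidean distance, the `max_distance` threshold, and the "rejected"/"accepted" actions below are assumptions chosen for the example; the patent describes a hardware circuit, not software.

```python
import math

def is_improper(outputs, reference, max_distance):
    """Flag the input when the outputs lie too far from the reference outputs."""
    distance = math.sqrt(sum((o - r) ** 2 for o, r in zip(outputs, reference)))
    return distance > max_distance

def handle(outputs, reference, max_distance=1.0):
    """Perform an action based on the detection result."""
    if is_improper(outputs, reference, max_distance):
        return "rejected"
    return "accepted"

reference = [1.0, 0.0]
typical = handle([0.9, 0.1], reference)
outlier = handle([5.0, 5.0], reference)
```

Outputs close to the reference pass through, while outputs far from it trigger the handling action.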

19.

Granted invention patent
Performing hardware operator fusion [Active]

Publication (announcement) No.: US11809981B1

Publication (announcement) date: 2023-11-07

Application No.: US16698753

Application date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang

IPC: G06N3/063, G06F9/30, G06F9/54

CPC classification number: G06N3/063, G06F9/30007, G06F9/545

Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
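The fusion step can be sketched with instructions represented as tuples. Everything about the encoding is hypothetical: the `write_virtual`, `read_virtual`, and `move` opcodes, the buffer names, and the rule that writes and reads are paired in order are assumptions used to show how virtual-node accesses might be replaced by direct data transfers.

```python
def fuse(first_kernel, second_kernel):
    """Build a fused instruction list from two kernels that communicate via a virtual data node."""
    writes = [i for i in first_kernel if i[0] == "write_virtual"]
    reads = [i for i in second_kernel if i[0] == "read_virtual"]
    # Assumed mapping: the n-th write feeds the n-th read, giving one transfer each.
    transfers = [("move", w[1], r[1]) for w, r in zip(writes, reads)]
    fused = [i for i in first_kernel if i[0] != "write_virtual"]
    fused += transfers
    fused += [i for i in second_kernel if i[0] != "read_virtual"]
    return fused

conv_kernel = [("compute", "conv"), ("write_virtual", "conv_out")]
relu_kernel = [("read_virtual", "relu_in"), ("compute", "relu")]
fused_kernel = fuse(conv_kernel, relu_kernel)
```

The fused list keeps both operators' own instructions and replaces the virtual-node round trip with a single `move` from `conv_out` to `relu_in`.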

20.

Granted invention patent
Dynamic processing element array expansion [Active]

Publication (announcement) No.: US11568238B2

Publication (announcement) date: 2023-01-31

Application No.: US16456414

Application date: 2019-06-28

Applicant: Amazon Technologies, Inc.

Inventor: Randy Renfu Huang, Ron Diamant, Richard John Heaton

IPC: G06E1/00, G06E3/00, G06G7/00, G06N3/08, G06N3/04

Abstract: A computer-implemented method includes receiving a neural network model that includes a tensor operation, and dividing the tensor operation into sub-operations. The sub-operations include at least two sub-operations that have no data dependency between them. The computer-implemented method further includes assigning a first sub-operation in the two sub-operations to a first computing engine, assigning a second sub-operation in the two sub-operations to a second computing engine, and generating instructions for performing, in parallel, the first sub-operation by the first computing engine and the second sub-operation by the second computing engine. An inference is then made based on a result of the first sub-operation, a result of the second sub-operation, or both. The first computing engine and the second computing engine are in a same integrated circuit device or in two different integrated circuit devices.
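Two sub-operations with no data dependency can be dispatched concurrently, which the sketch below imitates with two worker threads standing in for the two computing engines. The row-wise split of a matrix-vector product and the use of `ThreadPoolExecutor` are assumptions made for illustration; threads here only model the scheduling, not hardware parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

def matvec(rows, vector):
    """Multiply a block of weight rows by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in rows]

def parallel_matvec(weights, vector):
    """Run the two independent halves on two workers and join the results."""
    half = len(weights) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(matvec, weights[:half], vector)   # first computing engine
        second = pool.submit(matvec, weights[half:], vector)  # second computing engine
        return first.result() + second.result()

weights = [[1, 0], [0, 1], [2, 2], [1, 3]]
scores = parallel_matvec(weights, [3, 4])
predicted_class = scores.index(max(scores))  # inference based on both results
```

Each half reads the same input vector but writes disjoint output rows, which is why neither worker has to wait for the other.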
