MULTI-MEMORY ON-CHIP COMPUTATIONAL NETWORK

    Publication Number: US20210019600A1

    Publication Date: 2021-01-21

    Application Number: US17033573

    Filing Date: 2020-09-25

    Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all the weight values for a neural network, where the weight values are stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.
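    The staged dataflow this abstract describes (compute an intermediate result on the first engine array, copy it into the second array's memory banks, then compute the final result on the second array) might be sketched as follows. This is an illustrative model only; the class and function names are invented here and do not come from the patent, and a matmul-plus-ReLU stands in for whatever per-layer task the hardware actually performs.

```python
import numpy as np

class EngineArray:
    """Stands in for one array of processing engines with its memory banks."""
    def __init__(self, weights):
        self.banks = weights  # weights preloaded before any input data arrives

    def compute(self, x):
        # matmul + ReLU as a placeholder for the layer's actual computation
        return np.maximum(x @ self.banks, 0.0)

def run_task(x, array1, array2):
    intermediate = array1.compute(x)   # first array produces the intermediate result
    copied = intermediate.copy()       # copy into the second set of memory banks
    return array2.compute(copied)      # second array produces the final result
```

    In this sketch the copy step is the only data movement between the two stages, which mirrors the abstract's point that both stages' weights are resident before inference begins.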

    ACCELERATED QUANTIZED MULTIPLY-AND-ADD OPERATIONS

    Publication Number: US20200293284A1

    Publication Date: 2020-09-17

    Application Number: US16891010

    Filing Date: 2020-06-02

    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
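    The three circuits map onto the standard arithmetic of asymmetric quantization: subtract the zero point, accumulate integer products, then rescale into the high-precision format. A minimal numerical sketch, with function names invented here for illustration:

```python
import numpy as np

def quantize(x, scale, zero_point):
    # asymmetric quantization: real values -> integers in the uint8 range
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.int32)

def quantized_dot(qa, qb, za, zb, sa, sb):
    # first circuit: subtract the zero point from each quantized value
    da = qa - za
    db = qb - zb
    # second circuit: integer sum of products on the difference values
    acc = np.sum(da * db)
    # third circuit: scale the accumulator back to the high-precision format
    return acc * (sa * sb)
```

    Because the zero points are removed before multiplication, the integer accumulator needs no cross terms, which is the arithmetic simplification this family of patents exploits.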

    SCHEDULING NETWORK COMPUTATIONS
    Invention Application

    Publication Number: US20190294959A1

    Publication Date: 2019-09-26

    Application Number: US15933225

    Filing Date: 2018-03-22

    Abstract: Disclosed herein are techniques for scheduling and executing multi-layer neural network computations for multiple contexts. In one embodiment, a method comprises determining a set of computation tasks to be executed, the set of computation tasks including a first computation task and a second computation task, as well as a third computation task and a fourth computation task to provide input data for the first and second computation tasks; determining a first execution batch comprising the first and second computation tasks; determining a second execution batch comprising at least the third computation task to be executed before the first execution batch; determining whether to include the fourth computation task in the second execution batch based on whether the memory device has sufficient capacity to hold input data and output data of both the third and fourth computation tasks; and executing the second execution batch followed by the first execution batch.
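    The batching rule in this abstract (add a prerequisite task to the earlier batch only while on-chip memory can hold the input and output data of every task in that batch) can be sketched as a simple greedy check. The function name and the task-tuple layout are assumptions made for this sketch, not details from the patent:

```python
def build_prerequisite_batch(tasks, memory_capacity):
    """tasks: list of (name, input_size, output_size) tuples.

    Returns the names of the tasks that fit together in one execution
    batch under the given memory capacity; remaining tasks would be
    deferred to a later batch.
    """
    batch, used = [], 0
    for name, in_size, out_size in tasks:
        footprint = in_size + out_size  # both input and output must be resident
        if used + footprint <= memory_capacity:
            batch.append(name)
            used += footprint
        else:
            break  # this and subsequent tasks wait for a later batch
    return batch
```

    For example, with a capacity of 16 units both a third and a fourth task of footprint 8 fit in the second batch, while a capacity of 10 admits only the third.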

    ACCELERATED QUANTIZED MULTIPLY-AND-ADD OPERATIONS

    Publication Number: US20190294413A1

    Publication Date: 2019-09-26

    Application Number: US15934681

    Filing Date: 2018-03-23

    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.

    ON-CHIP COMPUTATIONAL NETWORK
    Invention Application

    Publication Number: US20190180183A1

    Publication Date: 2019-06-13

    Application Number: US15839017

    Filing Date: 2017-12-12

    Abstract: Provided are systems, methods, and integrated circuits for neural network processing. In various implementations, an integrated circuit for neural network processing can include a plurality of memory banks storing weight values for a neural network. The memory banks can be on the same chip as an array of processing engines. Upon receiving input data, the circuit can be configured to use the stored weight values to perform a task defined for the neural network. Performing the task can include reading weight values from the memory banks, inputting the weight values into the array of processing engines, and computing a result using the array of processing engines, where the result corresponds to an outcome of performing the task.

    MULTI-MEMORY ON-CHIP COMPUTATIONAL NETWORK
    Invention Application

    Publication Number: US20190180170A1

    Publication Date: 2019-06-13

    Application Number: US15839301

    Filing Date: 2017-12-12

    Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all the weight values for a neural network, where the weight values are stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.

    FAST CONTEXT SWITCHING FOR COMPUTATIONAL NETWORKS

    Publication Number: US20190179795A1

    Publication Date: 2019-06-13

    Application Number: US15839157

    Filing Date: 2017-12-12

    Abstract: Provided are systems, methods, and integrated circuits for a neural network processor that can execute a fast context switch between one neural network and another. In various implementations, a neural network processor can include a plurality of memory banks storing a first set of weight values for a first neural network. When the neural network processor receives first input data, the neural network processor can compute a first result using the first set of weight values and the first input data. While computing the first result, the neural network processor can store, in the memory banks, a second set of weight values for a second neural network. When the neural network processor receives second input data, the neural network processor can compute a second result using the second set of weight values and the second input data, where the computation occurs upon completion of computation of the first result.
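    The context switch described here is essentially double buffering of weights: while one network's result is being computed from the active banks, the next network's weights are written into the spare banks so the second computation can start without a load stall. A minimal sketch, with the class name and bank layout invented here for illustration:

```python
import numpy as np

class ContextSwitchingProcessor:
    """Toy model: two weight banks, one active while the other is loaded."""
    def __init__(self, num_banks=2):
        self.banks = [None] * num_banks
        self.active = 0

    def load_weights(self, weights):
        # write into an inactive bank; in hardware this overlaps with compute
        spare = (self.active + 1) % len(self.banks)
        self.banks[spare] = weights
        return spare

    def compute(self, x, bank):
        self.active = bank                 # switch contexts
        return x @ self.banks[bank]        # matmul stands in for inference
```

    In the real processor the load and the compute proceed concurrently; this sequential sketch only shows that the two contexts never contend for the same bank.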
