-
Publication Number: US11960997B1
Publication Date: 2024-04-16
Application Number: US17570673
Application Date: 2022-01-07
Applicant: Amazon Technologies, Inc.
Inventor: Randy Huang , Ron Diamant
Abstract: Disclosed herein are techniques for classifying data with a data processing circuit. In one embodiment, the data processing circuit includes a probabilistic circuit configurable to generate a decision at a pre-determined probability, and an output generation circuit including an output node and configured to receive input data and a weight, and generate output data at the output node for approximating a product of the input data and the weight. The generation of the output data includes propagating the weight to the output node according to a first decision of the probabilistic circuit. The probabilistic circuit is configured to generate the first decision at a probability determined based on the input data.
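The abstract describes the idea at the circuit level; the following is a minimal software sketch of the underlying stochastic-computing trick, assuming inputs normalized to [0, 1] (the function name and trial count are illustrative, not from the patent).

```python
import random

def stochastic_multiply(x, w, trials=1024):
    """Approximate x * w by propagating w with probability x.

    Assumes x is normalized to [0, 1]; over many trials the mean of
    the propagated values converges to x * w.
    """
    acc = 0.0
    for _ in range(trials):
        # The "probabilistic circuit": a decision generated at a
        # probability determined by the input data.
        if random.random() < x:
            acc += w  # propagate the weight to the output node
    return acc / trials

print(stochastic_multiply(0.25, 0.8))  # ~= 0.2
```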
-
Publication Number: US11263517B1
Publication Date: 2022-03-01
Application Number: US15908080
Application Date: 2018-02-28
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant , Randy Huang
Abstract: Disclosed herein are techniques for obtaining weights for neural network computations. In one embodiment, an integrated circuit may include an arithmetic circuit configured to perform arithmetic operations for a neural network. The integrated circuit may also include a weight processing circuit configured to: acquire data from a memory device; receive configuration information indicating a size of each quantized weight of a set of quantized weights; extract the set of quantized weights from the data based on the size indicated by the configuration information; perform de-quantization processing on the set of quantized weights to generate a set of de-quantized weights; and provide the set of de-quantized weights to the arithmetic circuit to enable the arithmetic circuit to perform the arithmetic operations. The memory device may be part of or external to the integrated circuit.
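As a rough software analogue of the weight processing circuit, the sketch below unpacks fixed-width quantized weights from raw bytes and de-quantizes them. The bit width stands in for the configuration information, and the affine scale/zero-point scheme is an assumption; the abstract does not specify one.

```python
import numpy as np

def extract_and_dequantize(data: bytes, bits: int, scale: float, zero_point: int):
    """Unpack fixed-width quantized weights from raw memory bytes and
    de-quantize them (illustrative affine scheme)."""
    raw = np.frombuffer(data, dtype=np.uint8)
    stream = np.unpackbits(raw)                 # bit stream, MSB first
    n = stream.size // bits
    q = stream[: n * bits].reshape(n, bits)
    # Re-assemble each group of `bits` bits into an unsigned integer.
    weights_q = q.dot(1 << np.arange(bits - 1, -1, -1))
    return scale * (weights_q.astype(np.float32) - zero_point)

# Two bytes hold four 4-bit quantized weights.
print(extract_and_dequantize(b"\x12\x34", bits=4, scale=0.1, zero_point=8))
```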
-
Publication Number: US10983754B2
Publication Date: 2021-04-20
Application Number: US16891010
Application Date: 2020-06-02
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Randy Huang , Ron Diamant , Thomas Elmer , Sundeep Amirineni
Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
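In software terms, the three circuits map onto three steps: form difference values by subtracting the value representing zero, accumulate an integer sum of products, then scale back to the high-precision format. The sketch below is a minimal illustration under assumed uint8/int32 types and invented names.

```python
import numpy as np

def quantized_dot(x_q, w_q, x_zero, w_zero, x_scale, w_scale):
    """Dot product of asymmetrically quantized vectors.

    First circuit:  subtract the zero-representing values.
    Second circuit: accumulate the sum of products in integers.
    Third circuit:  scale the result back to the second format.
    """
    dx = x_q.astype(np.int32) - x_zero       # difference values
    dw = w_q.astype(np.int32) - w_zero
    acc = np.dot(dx, dw)                     # sum of products, integer domain
    return float(acc) * (x_scale * w_scale)  # convert to the second format

x_q = np.array([130, 140, 125], dtype=np.uint8)
w_q = np.array([120, 135, 128], dtype=np.uint8)
print(quantized_dot(x_q, w_q, x_zero=128, w_zero=128, x_scale=0.02, w_scale=0.05))
```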
-
Publication Number: US10678508B2
Publication Date: 2020-06-09
Application Number: US15934681
Application Date: 2018-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Randy Huang , Ron Diamant , Thomas Elmer , Sundeep Amirineni
Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
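This entry is the method-claim counterpart of US10983754B2 above. In standard asymmetric-quantization notation (an assumption; the abstract names no symbols), a value is quantized as q = round(x/s) + z and recovered as x ≈ s·(q − z), so a product becomes x·w ≈ s_x·s_w·(q_x − z_x)(q_w − z_w): the (q − z) terms are the method's difference values, and s_x·s_w is its scaling factor.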
-
Publication Number: US12067492B2
Publication Date: 2024-08-20
Application Number: US18144129
Application Date: 2023-05-05
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Ron Diamant , Thomas A. Volpe , Randy Huang
CPC classification number: G06N3/082 , G06F3/0604 , G06F3/0644 , G06F3/0673 , G06N3/045
Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.
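One way to picture the reconfiguration sequence is the toy schedule below, where a single engine is reconfigured per layer and two contexts share it. All names are illustrative; the patent describes a hardware computing engine, not this Python model.

```python
def configure(layer):
    print(f"engine configured for layer {layer}")

def run(layer, data, context):
    print(f"  layer {layer} processing context {context}")
    return f"ctx{context}-L{layer}-out"

configure(2)                              # first configuration: layer 2
c1_l2 = run(2, "ctx1-L1-out", context=1)  # first context, second layer

configure(1)                              # switch to second configuration
c2_l1 = run(1, "ctx2-input", context=2)   # second context, first layer

configure(3)                              # third configuration: layer 3
result_1 = run(3, c1_l2, context=1)       # first processing result
result_2 = run(3, c2_l1, context=2)       # second processing result
```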
-
Publication Number: US11868878B1
Publication Date: 2024-01-09
Application Number: US15934523
Application Date: 2018-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Randy Huang , Ron Diamant
IPC: G06N3/08 , G06N5/046 , G06F18/2413 , G06F18/2431
CPC classification number: G06N3/08 , G06F18/2413 , G06F18/2431 , G06N5/046
Abstract: Disclosed herein are techniques for implementing a large fully-connected layer in an artificial neural network. The large fully-connected layer is grouped into multiple fully-connected subnetworks. Each fully-connected subnetwork is configured to classify an object into an unknown class or a class in a subset of target classes. If the object is classified as the unknown class by a fully-connected subnetwork, a next fully-connected subnetwork may be used to further classify the object. In some embodiments, the fully-connected layer is grouped based on a ranking of target classes.
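A minimal software sketch of the grouped fully-connected idea follows. It assumes the subnetworks are ordered by a ranking of the target classes, and it models the explicit unknown-class output with a confidence threshold, which is a simplification of the patent's description; all names are invented.

```python
import numpy as np

def grouped_fc_classify(x, subnetworks, threshold=0.5):
    """Classify with a chain of small fully-connected subnetworks,
    each covering a subset of the target classes."""
    for weights, classes in subnetworks:
        scores = weights @ x                      # small FC layer
        probs = np.exp(scores) / np.exp(scores).sum()
        best = int(np.argmax(probs))
        if probs[best] >= threshold:              # confident: stop early
            return classes[best]
        # Otherwise treat as "unknown" and fall through to the next
        # subnetwork in the ranking.
    return "unknown"                              # exhausted all subsets

rng = np.random.default_rng(0)
subnets = [(rng.normal(size=(3, 8)), ["cat", "dog", "bird"]),
           (rng.normal(size=(3, 8)), ["car", "bus", "bike"])]
print(grouped_fc_classify(rng.normal(size=8), subnets))
```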
-
Publication Number: US11797853B2
Publication Date: 2023-10-24
Application Number: US17951084
Application Date: 2022-09-22
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Ron Diamant , Thomas A. Volpe , Randy Huang
CPC classification number: G06N3/082 , G06F3/0604 , G06F3/0644 , G06F3/0673 , G06N3/045
Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.
-
Publication Number: US11461631B2
Publication Date: 2022-10-04
Application Number: US15933225
Application Date: 2018-03-22
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Ron Diamant , Thomas A. Volpe , Randy Huang
IPC: G06N3/08
Abstract: Disclosed herein are techniques for scheduling and executing multi-layer neural network computations for multiple contexts. In one embodiment, a method comprises: determining a set of computation tasks to be executed, the set including a first computation task and a second computation task, as well as a third computation task and a fourth computation task that provide input data for the first and second computation tasks; determining a first execution batch comprising the first and second computation tasks; determining a second execution batch comprising at least the third computation task, to be executed before the first execution batch; determining whether to include the fourth computation task in the second execution batch based on whether a memory device has sufficient capacity to hold the input data and output data of both the third and fourth computation tasks; and executing the second execution batch followed by the first execution batch.
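The capacity decision for the fourth task can be pictured with the greedy toy scheduler below; the task names, byte footprints, and the policy of deferring an over-budget producer are all assumptions layered on the abstract.

```python
def build_batches(tasks, deps, memory_capacity):
    """Greedy sketch of the batching decision in the abstract.

    `tasks` maps a task name to its combined input+output footprint in
    bytes; `deps` maps a consumer task to the producers that feed it.
    """
    first_batch = ["task1", "task2"]                 # consumers
    producers = sorted({p for t in first_batch for p in deps[t]})
    second_batch, used = [], 0
    for p in producers:
        # Include a producer only if memory can also hold its input
        # and output data alongside producers already scheduled.
        if used + tasks[p] <= memory_capacity:
            second_batch.append(p)
            used += tasks[p]
    return second_batch, first_batch                 # execution order

tasks = {"task1": 0, "task2": 0, "task3": 600, "task4": 600}
deps = {"task1": ["task3"], "task2": ["task4"]}
print(build_batches(tasks, deps, memory_capacity=1000))
# -> (['task3'], ['task1', 'task2']): task4 does not fit next to task3,
#    so a real scheduler would place it in a further batch.
```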
-
Publication Number: US10803379B2
Publication Date: 2020-10-13
Application Number: US15839301
Application Date: 2017-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Randy Huang , Ron Diamant
Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all of the weight values for a neural network, with the weight values being stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.
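Functionally, the flow reduces to the sketch below: both stages' weights sit in their memory banks before any input arrives, the first array produces an intermediate result, and that result is copied into the second set of banks for the final stage. Shapes, names, and the use of plain matrix products are illustrative.

```python
import numpy as np

banks_1 = {"w": np.random.randn(16, 8)}   # first set of memory banks
banks_2 = {"w": np.random.randn(4, 16)}   # second set, loaded before any input

def run_task(x):
    intermediate = banks_1["w"] @ x        # first array of processing engines
    banks_2["x"] = intermediate.copy()     # copy result to second set of banks
    return banks_2["w"] @ banks_2["x"]     # second array computes final result

print(run_task(np.random.randn(8)).shape)  # (4,)
```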
-
Publication Number: US20230351186A1
Publication Date: 2023-11-02
Application Number: US18144129
Application Date: 2023-05-05
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease , Ron Diamant , Thomas A. Volpe , Randy Huang
CPC classification number: G06N3/082 , G06F3/0604 , G06F3/0644 , G06F3/0673 , G06N3/045
Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.
-