-
Publication number: US11681902B2
Publication date: 2023-06-20
Application number: US16586764
Application date: 2019-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey T Huynh , Vignesh Vivekraja
CPC classification number: G06N3/063 , G06F7/50 , G06F7/523 , G06F7/5443 , G06F7/78 , G06F9/5027 , G06F17/153
Abstract: In one example, a neural network accelerator can execute a set of instructions to: load a first weight data element from a memory into a systolic array, the first weight data element having first coordinates; extract, from the instructions, information indicating a first subset of input data elements to be obtained from the memory, the first subset being based on a stride of a transposed convolution operation and on second coordinates of the first weight data element in a rotated array of weight data elements; based on the information, obtain the first subset of input data elements from the memory; load the first subset of input data elements into the systolic array; and control the systolic array to perform first computations based on the first weight data element and the first subset of input data elements to generate output data elements of an array of output data elements.
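The transposed convolution the abstract refers to can be illustrated with a minimal sketch. This is not the patented instruction flow (which selects input subsets per weight element in a rotated kernel); it is the standard scatter formulation of transposed convolution, where each input element writes a stride-spaced weighted copy of the kernel into the output. All names here are illustrative.

```python
# Minimal sketch of a 2-D transposed convolution (scatter form).
# Each input element inp[i][j] contributes inp[i][j] * weight[ki][kj]
# to output position (i*stride + ki, j*stride + kj).
def transposed_conv2d(inp, weight, stride):
    h, w = len(inp), len(inp[0])
    kh, kw = len(weight), len(weight[0])
    # Output size for a transposed convolution with no padding.
    oh = (h - 1) * stride + kh
    ow = (w - 1) * stride + kw
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(h):
        for j in range(w):
            for ki in range(kh):
                for kj in range(kw):
                    out[i * stride + ki][j * stride + kj] += inp[i][j] * weight[ki][kj]
    return out
```

The gather formulation used in the patent (iterating over weight elements of a rotated kernel and selecting a stride-dependent subset of inputs per element) produces the same output array; it is better suited to a systolic array because each weight element stays resident while inputs stream past it.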
-
Publication number: US11182314B1
Publication date: 2021-11-23
Application number: US16698761
Application date: 2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Drazen Borkovic , Ilya Minkin , Vignesh Vivekraja , Richard John Heaton , Randy Renfu Huang
Abstract: An integrated circuit device implementing a neural network accelerator may have a peripheral bus interface to interface with a host memory, and neural network models can be loaded from the host memory onto the state buffer of the neural network accelerator for execution by the array of processing elements. The neural network accelerator may also have a memory interface to interface with a local memory. The local memory may store neural network models from the host memory, and the models can be loaded from the local memory into the state buffer with reduced latency as compared to loading from the host memory. In systems with multiple accelerators, the models in the local memory can also be shared amongst different accelerators.
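The two-level loading scheme described above amounts to caching models in local memory so that repeated loads avoid the slower host-memory path. A minimal sketch, with all class and method names invented for illustration (they do not come from the patent):

```python
# Hypothetical sketch of the two-level model-loading scheme: models are
# fetched from host memory over the peripheral bus once, cached in local
# memory, and later loads into the state buffer hit the faster cache.
class ModelLoader:
    def __init__(self, host_memory):
        self.host_memory = host_memory   # model_id -> weights (slow path)
        self.local_memory = {}           # cache, shareable by accelerators

    def load_to_state_buffer(self, model_id):
        if model_id not in self.local_memory:
            # Slow path: fetch the model over the peripheral bus interface.
            self.local_memory[model_id] = self.host_memory[model_id]
        # Fast path: the model is served from local memory.
        return self.local_memory[model_id]
```

In a multi-accelerator system, the `local_memory` dictionary stands in for the shared local memory: once any accelerator has pulled a model from the host, the others can load it without touching the peripheral bus.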
-