-
Publication Number: US20210158131A1
Publication Date: 2021-05-27
Application Number: US16698236
Filing Date: 2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Animesh Jain , Yizhi Liu , Hongbin Zheng , Jeffrey T. Huynh , Haichen Li , Drazen Borkovic , Jindrich Zejda , Richard John Heaton , Randy Renfu Huang , Zhi Chen , Yida Wang
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can outpace the compiler's ability to map them onto the acceleration engine. To enable such neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
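The three-tier split the abstract describes can be illustrated with a short sketch. Everything below is a hypothetical stand-in: the operator names, the two support sets, and the `partition` function are invented for illustration, since the abstract does not disclose the compiler's actual data structures.

```python
# A minimal sketch of hierarchical operator partitioning. The support
# sets and operator names are assumptions, not from the patent.

ACCEL_SUPPORTED = {"conv2d", "matmul", "relu"}   # mappable onto the acceleration engine
HOST_COMPILABLE = {"softmax", "transpose"}       # compilable for the host processor

def partition(operators):
    """Assign each operator to the lowest tier able to execute it."""
    tiers = {"accelerator": [], "host": [], "framework": []}
    for op in operators:
        if op in ACCEL_SUPPORTED:
            tiers["accelerator"].append(op)
        elif op in HOST_COMPILABLE:
            tiers["host"].append(op)
        else:
            # Fall back to the machine learning framework for any operator
            # the compiler cannot map onto the engine or the host.
            tiers["framework"].append(op)
    return tiers

print(partition(["conv2d", "softmax", "new_custom_op", "relu"]))
```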
-
Publication Number: US11016775B2
Publication Date: 2021-05-25
Application Number: US16453478
Filing Date: 2019-06-26
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey T. Huynh , Drazen Borkovic , Jindrich Zejda , Randy Renfu Huang , Ron Diamant
Abstract: Techniques are disclosed for reordering operations of a neural network to improve runtime efficiency. In some examples, a compiler receives a description of the neural network comprising a plurality of operations. The compiler may determine which execution engine of a plurality of execution engines is to perform each of the plurality of operations. The compiler may determine an order of performance associated with the plurality of operations. The compiler may identify a runtime inefficiency based on the order of performance and a hardware usage for each of the plurality of operations. An operation may be reordered to reduce the runtime inefficiency. Instructions may be compiled based on the plurality of operations, which include the reordered operation.
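The reordering idea can be sketched with a toy scheduler. The tuple representation, engine names, and greedy hoisting heuristic below are assumptions for illustration; the abstract does not specify the compiler's representation or cost model.

```python
# A minimal sketch: hoist an operation that targets an idle engine ahead
# of a back-to-back run on a busy engine, when dependencies allow.

def reorder(ops):
    """ops: list of (name, engine, deps) tuples in program order."""
    schedule = list(ops)
    for i in range(len(schedule) - 1):
        if schedule[i][1] == schedule[i + 1][1]:          # same engine twice in a row
            done = {name for name, _, _ in schedule[:i + 1]}
            for j in range(i + 2, len(schedule)):
                name, engine, deps = schedule[j]
                if engine != schedule[i][1] and deps <= done:
                    schedule.insert(i + 1, schedule.pop(j))  # fill the idle engine
                    break
    return schedule

ops = [
    ("conv1", "pe_array", set()),
    ("conv2", "pe_array", set()),
    ("pool1", "pooling", {"conv1"}),   # independent of conv2, so it can move up
]
print(reorder(ops))   # conv1, pool1, conv2
```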
-
Publication Number: US10922146B1
Publication Date: 2021-02-16
Application Number: US16219530
Filing Date: 2018-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Ilya Minkin , Ron Diamant , Drazen Borkovic , Jindrich Zejda , Dana Michelle Vantrease
Abstract: Systems and methods are provided for synchronizing execution of program code for an integrated circuit device having multiple concurrently operating execution engines, where the operation of one execution engine may be dependent on the operation of another execution engine. Data or resource dependencies may be accommodated with a Set instruction to cause a first execution engine to set a register value and a Wait instruction to cause a second execution engine to wait for a condition associated with the register value. Concurrent operation of the execution engines may thus be synchronized.
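The Set/Wait pairing can be sketched with host threads standing in for the two execution engines. `threading.Event` is only an analogy for the shared register; the actual hardware registers and instruction encodings are not described in the abstract.

```python
# A minimal sketch of Set/Wait synchronization between two engines,
# modeled with Python threads. The Event stands in for the register.

import threading

register = threading.Event()    # shared register value

def engine_a():
    print("engine A: producing data")
    register.set()              # Set instruction: signal the condition

def engine_b():
    register.wait()             # Wait instruction: block until the condition holds
    print("engine B: consuming the data")

a = threading.Thread(target=engine_a)
b = threading.Thread(target=engine_b)
b.start(); a.start()
a.join(); b.join()
```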
-
Publication Number: US11567778B2
Publication Date: 2023-01-31
Application Number: US17243415
Filing Date: 2021-04-28
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey T. Huynh , Drazen Borkovic , Jindrich Zejda , Randy Renfu Huang , Ron Diamant
Abstract: Techniques are disclosed for reordering operations of a neural network to improve runtime efficiency. In some examples, a compiler receives a description of the neural network comprising a plurality of operations. The compiler may determine which execution engine of a plurality of execution engines is to perform each of the plurality of operations. The compiler may determine an order of performance associated with the plurality of operations. The compiler may identify a runtime inefficiency based on the order of performance and a hardware usage for each of the plurality of operations. An operation may be reordered to reduce the runtime inefficiency. Instructions may be compiled based on the plurality of operations, which include the reordered operation.
-
Publication Number: US11561833B1
Publication Date: 2023-01-24
Application Number: US16021866
Filing Date: 2018-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Richard John Heaton , Randy Renfu Huang , Drazen Borkovic , Jindrich Zejda
Abstract: Techniques for operating a computing system to perform neural network operations are disclosed. In one example, a method comprises receiving a neural network model, determining a sequence of neural network operations based on data dependency in the neural network model, and determining a set of instructions to map the sequence of neural network operations to the processing resources of a neural network processor. The method further comprises determining, based on a set of memory access operations included in the set of instructions, a first set of memory references associated with a first location of an external memory to store the input data and a second set of memory references associated with a second location of the external memory to store the output data, and generating an instruction file including the set of instructions, the first set of memory references, and the second set of memory references.
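A rough sketch of assembling such an instruction file follows. The JSON layout, field names, and addresses are invented for illustration; the patent's actual file format is not given in the abstract.

```python
# A minimal sketch: bundle the instructions with the input and output
# memory references into one instruction file. All values are placeholders.

import json

instructions = [
    {"op": "load", "ref": "in0"},
    {"op": "matmul"},
    {"op": "store", "ref": "out0"},
]

# Memory references tie instruction operands to locations in external memory.
input_refs = {"in0": {"base": 0x1000, "size": 4096}}    # first location: input data
output_refs = {"out0": {"base": 0x2000, "size": 1024}}  # second location: output data

with open("program.json", "w") as f:
    json.dump(
        {"instructions": instructions,
         "input_refs": input_refs,
         "output_refs": output_refs},
        f, indent=2,
    )
```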
-
Publication Number: US20190179795A1
Publication Date: 2019-06-13
Application Number: US15839157
Filing Date: 2017-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Randy Huang , Ron Diamant , Jindrich Zejda , Drazen Borkovic
Abstract: Provided are systems, methods, and integrated circuits for a neural network processor that can execute a fast context switch between one neural network and another. In various implementations, a neural network processor can include a plurality of memory banks storing a first set of weight values for a first neural network. When the neural network processor receives first input data, the neural network processor can compute a first result using the first set of weight values and the first input data. While computing the first result, the neural network processor can store, in the memory banks, a second set of weight values for a second neural network. When the neural network processor receives second input data, the neural network processor can compute a second result using the second set of weight values and the second input data, where the computation occurs upon completion of computation of the first result.
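The double-buffering idea behind the fast context switch can be sketched as follows. The thread-based loader and the bank dictionary are stand-ins for the on-chip memory banks the abstract describes, not the hardware mechanism itself.

```python
# A minimal sketch: compute with one weight bank while the second
# network's weights stream into another bank, then switch contexts.

import threading

banks = {0: "weights_net1", 1: None}

def load_weights(bank, weights):
    banks[bank] = weights                 # overlaps with the ongoing computation

def compute(bank, input_data):
    return f"result({banks[bank]}, {input_data})"

# Compute the first result while the second network's weights load.
loader = threading.Thread(target=load_weights, args=(1, "weights_net2"))
loader.start()
first = compute(0, "input1")
loader.join()                             # second set of weights is now in place
second = compute(1, "input2")             # begins after the first result completes
print(first, second)
```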
-