-
Publication No.: US11537853B1
Publication Date: 2022-12-27
Application No.: US16455258
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
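The following is a minimal sketch of how the two decompression schemes named in the abstract could chain into a pipeline, with one subunit's partial result feeding the next. The encodings (a nonzero bitmask for zero-value compression, a codebook of indices for shared-value compression) and all function names are illustrative assumptions, not the patented formats.

```python
# Illustrative sketch only: assumed encodings, not the patented formats.
from typing import List


def shared_value_decompress(indices: List[int], codebook: List[int]) -> List[int]:
    """Replace each index with its shared value from a small codebook."""
    return [codebook[i] for i in indices]


def zero_value_decompress(bitmask: List[int], nonzeros: List[int]) -> List[int]:
    """Re-expand a zero-value-compressed stream.

    Assumed encoding: bitmask[i] is 1 where the original element was
    nonzero; `nonzeros` holds the surviving values in order.
    """
    it = iter(nonzeros)
    return [next(it) if bit else 0 for bit in bitmask]


def decompress_pipeline(bitmask, indices, codebook):
    """Chain the subunits: the shared-value subunit's partially
    decompressed output is input to the zero-value subunit."""
    partially_decompressed = shared_value_decompress(indices, codebook)
    return zero_value_decompress(bitmask, partially_decompressed)


codebook = [0, 3, 7, 12]            # shared values (e.g., cluster centers)
indices = [1, 3, 2]                 # codebook indices for the nonzero elements
bitmask = [0, 1, 0, 0, 1, 1, 0]     # positions of nonzero elements
print(decompress_pipeline(bitmask, indices, codebook))
# -> [0, 3, 0, 0, 12, 7, 0]
```

If only one compression scheme was applied to the incoming data, only the corresponding subunit would run, which is the configurability the abstract describes.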
-
Publication No.: US12169786B1
Publication Date: 2024-12-17
Application No.: US16455334
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani, Shiva Navab
Abstract: Described herein is a neural network accelerator (NNA) with reconfigurable memory resources for forming a set of local memory buffers comprising at least one activation buffer, at least one weight buffer, and at least one output buffer. The NNA supports a plurality of predefined memory configurations that are optimized for maximizing throughput and reducing overall power consumption in different types of neural networks. The memory configurations differ with respect to at least one of: the total amount of activation, weight, or output buffer memory, or the total number of activation, weight, or output buffers. A memory configuration can thus be selected according to the type of neural network being executed and that network's memory behavior.
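Below is an assumed sketch of what selecting among predefined buffer configurations might look like. The configuration names, buffer sizes, and the selection heuristic are all hypothetical; the patent does not publish concrete values.

```python
# Illustrative sketch only: sizes and heuristic are assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    name: str
    activation_kb: int      # total activation-buffer memory
    weight_kb: int          # total weight-buffer memory
    output_kb: int          # total output-buffer memory
    num_weight_buffers: int # total number of weight buffers


# Hypothetical predefined configurations; a real NNA would bake these
# into hardware and expose only a selector.
CONFIGS = {
    "conv_heavy": MemoryConfig("conv_heavy", 64, 32, 32, 1),  # large activations (CNNs)
    "fc_heavy":   MemoryConfig("fc_heavy", 16, 96, 16, 2),    # large weights (MLPs/RNNs)
    "balanced":   MemoryConfig("balanced", 48, 48, 32, 1),
}


def select_config(weight_bytes: int, activation_bytes: int) -> MemoryConfig:
    """Pick a predefined configuration from the network's memory behavior."""
    if weight_bytes > 2 * activation_bytes:
        return CONFIGS["fc_heavy"]
    if activation_bytes > 2 * weight_bytes:
        return CONFIGS["conv_heavy"]
    return CONFIGS["balanced"]


print(select_config(weight_bytes=4_000_000, activation_bytes=500_000).name)
# -> fc_heavy
```

The point of distinguishing configurations this way is that fully connected networks are typically weight-bound while convolutional networks are activation-bound, so trading buffer capacity between the two avoids provisioning worst-case memory for both.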
-
Publication No.: US11868867B1
Publication Date: 2024-01-09
Application No.: US17989340
Filing Date: 2022-11-17
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
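This continuation shares its abstract with US11537853B1; the closing sentence also covers a compression unit mirroring the decompression pipeline. Here is an assumed sketch of that forward (compression) direction, the inverse of the decompression example above. Again, the encodings are illustrative, not the patented formats.

```python
# Illustrative sketch only: the inverse of the earlier decompression example.
from typing import List, Tuple


def zero_value_compress(values: List[int]) -> Tuple[List[int], List[int]]:
    """Split a stream into a nonzero bitmask and the surviving values."""
    bitmask = [1 if v != 0 else 0 for v in values]
    nonzeros = [v for v in values if v != 0]
    return bitmask, nonzeros


def shared_value_compress(values: List[int], codebook: List[int]) -> List[int]:
    """Map each value to the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(codebook[i] - v))
            for v in values]


# Compose the two schemes: drop zeros first, then index the survivors,
# which is the order the decompression pipeline undoes.
codebook = [0, 3, 7, 12]
data = [0, 3, 0, 0, 12, 7, 0]
bitmask, nonzeros = zero_value_compress(data)
indices = shared_value_compress(nonzeros, codebook)
print(bitmask, indices)   # -> [0, 1, 0, 0, 1, 1, 0] [1, 3, 2]
```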
-
Publication No.: US11520561B1
Publication Date: 2022-12-06
Application No.: US16455551
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal
Abstract: Described herein is a neural network accelerator with a set of neural processing units and an instruction set for execution on the neural processing units. The instruction set is compact, comprising various compute and data-move instructions for implementing a neural network. Among the compute instructions are an instruction for performing a fused operation comprising sequential computations, one of which involves matrix multiplication, and an instruction for performing an elementwise vector operation. The instructions in the instruction set are highly configurable and can handle data elements of variable size. The instructions also implement a synchronization mechanism that allows asynchronous execution of data-move and compute operations across different components of the neural network accelerator, as well as between multiple instances of the neural network accelerator.
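A minimal sketch of the two ideas the abstract names follows: a fused compute operation (matrix multiplication, bias add, and activation in one instruction, with no intermediate written back to buffer memory), and semaphore-style synchronization between asynchronous data-move and compute engines. The instruction shape, the ReLU choice, and the semaphore semantics are assumptions for illustration, not the patented instruction set.

```python
# Illustrative sketch only: assumed fused-op and sync semantics.
import threading


def fused_matmul_relu(a, b, bias):
    """One 'fused operation' instruction: matmul, bias add, then ReLU,
    with the intermediate kept in the accumulator rather than memory."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = sum(a[i][k] * b[k][j] for k in range(inner)) + bias[j]
            out[i][j] = max(acc, 0)  # ReLU fused into the same instruction
    return out


# A semaphore lets the compute 'engine' start only after the data-move
# 'engine' has filled the weight buffer -- a stand-in for the
# instruction-level sync mechanism the abstract describes.
weights_ready = threading.Semaphore(0)
buffers = {}


def data_move_engine():
    buffers["w"] = [[1, 0], [0, 1]]  # pretend DMA into the weight buffer
    weights_ready.release()          # signal: weights loaded


def compute_engine():
    weights_ready.acquire()          # block until the data-move completes
    buffers["out"] = fused_matmul_relu([[2, -3]], buffers["w"], [0, 0])


t1 = threading.Thread(target=data_move_engine)
t2 = threading.Thread(target=compute_engine)
t2.start(); t1.start(); t1.join(); t2.join()
print(buffers["out"])   # -> [[2, 0]]
```

Decoupling the two engines this way is what lets data movement for the next tile overlap with compute on the current one; the semaphore only enforces ordering where a true dependency exists.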
-