-
Publication No.: US11537853B1
Publication Date: 2022-12-27
Application No.: US16455258
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
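The following is a minimal sketch of how the two decompression schemes named in the abstract could chain into a pipeline, with one subunit's partial result feeding the next. The encodings (a nonzero bitmask for zero-value compression, a codebook of indices for shared-value compression) and all function names are illustrative assumptions, not the patented formats.

```python
# Illustrative sketch only: assumed encodings, not the patented formats.
from typing import List


def shared_value_decompress(indices: List[int], codebook: List[int]) -> List[int]:
    """Replace each index with its shared value from a small codebook."""
    return [codebook[i] for i in indices]


def zero_value_decompress(bitmask: List[int], nonzeros: List[int]) -> List[int]:
    """Re-expand a zero-value-compressed stream.

    Assumed encoding: bitmask[i] is 1 where the original element was
    nonzero; `nonzeros` holds the surviving values in order.
    """
    it = iter(nonzeros)
    return [next(it) if bit else 0 for bit in bitmask]


def decompress_pipeline(bitmask, indices, codebook):
    """Chain the subunits: the shared-value subunit's partially
    decompressed output is input to the zero-value subunit."""
    partially_decompressed = shared_value_decompress(indices, codebook)
    return zero_value_decompress(bitmask, partially_decompressed)


codebook = [0, 3, 7, 12]            # shared values (e.g., cluster centers)
indices = [1, 3, 2]                 # codebook indices for the nonzero elements
bitmask = [0, 1, 0, 0, 1, 1, 0]     # positions of nonzero elements
print(decompress_pipeline(bitmask, indices, codebook))
# -> [0, 3, 0, 0, 12, 7, 0]
```

If only one compression scheme was applied to the incoming data, only the corresponding subunit would run, which is the configurability the abstract describes.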
-
Publication No.: US12169786B1
Publication Date: 2024-12-17
Application No.: US16455334
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani, Shiva Navab
Abstract: Described herein is a neural network accelerator (NNA) with reconfigurable memory resources for forming a set of local memory buffers comprising at least one activation buffer, at least one weight buffer, and at least one output buffer. The NNA supports a plurality of predefined memory configurations that are optimized for maximizing throughput and reducing overall power consumption in different types of neural networks. The memory configurations differ with respect to at least one of: the total amount of activation, weight, or output buffer memory, or the total number of activation, weight, or output buffers. A memory configuration can thus be selected according to the type of neural network being executed and that network's memory behavior.
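Below is an assumed sketch of what selecting among predefined buffer configurations might look like. The configuration names, buffer sizes, and the selection heuristic are all hypothetical; the patent does not publish concrete values.

```python
# Illustrative sketch only: sizes and heuristic are assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    name: str
    activation_kb: int      # total activation-buffer memory
    weight_kb: int          # total weight-buffer memory
    output_kb: int          # total output-buffer memory
    num_weight_buffers: int # total number of weight buffers


# Hypothetical predefined configurations; a real NNA would bake these
# into hardware and expose only a selector.
CONFIGS = {
    "conv_heavy": MemoryConfig("conv_heavy", 64, 32, 32, 1),  # large activations (CNNs)
    "fc_heavy":   MemoryConfig("fc_heavy", 16, 96, 16, 2),    # large weights (MLPs/RNNs)
    "balanced":   MemoryConfig("balanced", 48, 48, 32, 1),
}


def select_config(weight_bytes: int, activation_bytes: int) -> MemoryConfig:
    """Pick a predefined configuration from the network's memory behavior."""
    if weight_bytes > 2 * activation_bytes:
        return CONFIGS["fc_heavy"]
    if activation_bytes > 2 * weight_bytes:
        return CONFIGS["conv_heavy"]
    return CONFIGS["balanced"]


print(select_config(weight_bytes=4_000_000, activation_bytes=500_000).name)
# -> fc_heavy
```

The point of distinguishing configurations this way is that fully connected networks are typically weight-bound while convolutional networks are activation-bound, so trading buffer capacity between the two avoids provisioning worst-case memory for both.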
-
Publication No.: US11868867B1
Publication Date: 2024-01-09
Application No.: US17989340
Filing Date: 2022-11-17
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal, Arvind Mandhani
Abstract: Described herein is a neural network accelerator (NNA) with a decompression unit that can be configured to perform multiple types of decompression. The decompression unit may include a separate subunit for each decompression type. The subunits can be coupled to form a pipeline in which partially decompressed results generated by one subunit are input for further decompression by another subunit. Depending on which types of compression were applied to the incoming data, any number of the subunits may be used to produce a decompressed output. In some embodiments, the decompression unit is configured to decompress data that has been compressed using a zero value compression scheme, a shared value compression scheme, or both. The NNA can also include a compression unit implemented in a manner similar to that of the decompression unit.
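This continuation shares its abstract with US11537853B1; the closing sentence also covers a compression unit mirroring the decompression pipeline. Here is an assumed sketch of that forward (compression) direction, the inverse of the decompression example above. Again, the encodings are illustrative, not the patented formats.

```python
# Illustrative sketch only: the inverse of the earlier decompression example.
from typing import List, Tuple


def zero_value_compress(values: List[int]) -> Tuple[List[int], List[int]]:
    """Split a stream into a nonzero bitmask and the surviving values."""
    bitmask = [1 if v != 0 else 0 for v in values]
    nonzeros = [v for v in values if v != 0]
    return bitmask, nonzeros


def shared_value_compress(values: List[int], codebook: List[int]) -> List[int]:
    """Map each value to the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(codebook[i] - v))
            for v in values]


# Compose the two schemes: drop zeros first, then index the survivors,
# which is the order the decompression pipeline undoes.
codebook = [0, 3, 7, 12]
data = [0, 3, 0, 0, 12, 7, 0]
bitmask, nonzeros = zero_value_compress(data)
indices = shared_value_compress(nonzeros, codebook)
print(bitmask, indices)   # -> [0, 1, 0, 0, 1, 1, 0] [1, 3, 2]
```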
-
Publication No.: US11520561B1
Publication Date: 2022-12-06
Application No.: US16455551
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Tariq Afzal
Abstract: Described herein is a neural network accelerator with a set of neural processing units and an instruction set for execution on the neural processing units. The instruction set is compact, comprising various compute and data-move instructions for implementing a neural network. Among the compute instructions are an instruction for performing a fused operation comprising sequential computations, one of which involves matrix multiplication, and an instruction for performing an elementwise vector operation. The instructions in the instruction set are highly configurable and can handle data elements of variable size. The instructions also implement a synchronization mechanism that allows asynchronous execution of data-move and compute operations across different components of the neural network accelerator, as well as between multiple instances of the neural network accelerator.
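A minimal sketch of the two ideas the abstract names follows: a fused compute operation (matrix multiplication, bias add, and activation in one instruction, with no intermediate written back to buffer memory), and semaphore-style synchronization between asynchronous data-move and compute engines. The instruction shape, the ReLU choice, and the semaphore semantics are assumptions for illustration, not the patented instruction set.

```python
# Illustrative sketch only: assumed fused-op and sync semantics.
import threading


def fused_matmul_relu(a, b, bias):
    """One 'fused operation' instruction: matmul, bias add, then ReLU,
    with the intermediate kept in the accumulator rather than memory."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = sum(a[i][k] * b[k][j] for k in range(inner)) + bias[j]
            out[i][j] = max(acc, 0)  # ReLU fused into the same instruction
    return out


# A semaphore lets the compute 'engine' start only after the data-move
# 'engine' has filled the weight buffer -- a stand-in for the
# instruction-level sync mechanism the abstract describes.
weights_ready = threading.Semaphore(0)
buffers = {}


def data_move_engine():
    buffers["w"] = [[1, 0], [0, 1]]  # pretend DMA into the weight buffer
    weights_ready.release()          # signal: weights loaded


def compute_engine():
    weights_ready.acquire()          # block until the data-move completes
    buffers["out"] = fused_matmul_relu([[2, -3]], buffers["w"], [0, 0])


t1 = threading.Thread(target=data_move_engine)
t2 = threading.Thread(target=compute_engine)
t2.start(); t1.start(); t1.join(); t2.join()
print(buffers["out"])   # -> [[2, 0]]
```

Decoupling the two engines this way is what lets data movement for the next tile overlap with compute on the current one; the semaphore only enforces ordering where a true dependency exists.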
-