-
Publication No.: US10817260B1
Publication Date: 2020-10-27
Application No.: US16007749
Application Date: 2018-06-13
Applicant: Amazon Technologies, Inc.
Inventor: Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni, Thomas A. Volpe
Abstract: Systems and methods are provided to skip multiplication operations with zeros in processing elements of the systolic array to reduce dynamic power consumption. A value of zero can be detected on an input data element entering each row of the array and respective zero indicators may be generated. These respective zero indicators may be passed to all the processing elements in the respective rows. The multiplication operation with the zero value can be skipped in each processing element based on the zero indicators, thus reducing dynamic power consumption.
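The zero-skipping idea in this abstract can be illustrated with a small behavioral model. This is a minimal sketch, not the patented circuit: the names `row_zero_indicator`, `ProcessingElement`, and `run_row` are hypothetical, and each processing element is simplified to accumulate its own sum rather than passing partial sums through a full array.

```python
# Behavioral sketch of zero-skipping in one row of processing elements.
# All names are hypothetical; the accumulate-in-place model is an assumption.

def row_zero_indicator(x):
    """Detect a zero once, as the input element enters the row."""
    return x == 0


class ProcessingElement:
    def __init__(self, weight):
        self.weight = weight
        self.multiplies = 0  # counts multiplications actually performed

    def step(self, x, is_zero, partial_sum):
        if is_zero:
            # Skip the multiplier entirely; the partial sum is unchanged.
            return partial_sum
        self.multiplies += 1
        return partial_sum + x * self.weight


def run_row(inputs, weights):
    """Feed each input to every element in the row, sharing one zero indicator."""
    pes = [ProcessingElement(w) for w in weights]
    sums = [0] * len(pes)
    for x in inputs:
        is_zero = row_zero_indicator(x)
        for i, pe in enumerate(pes):
            sums[i] = pe.step(x, is_zero, sums[i])
    return sums, sum(pe.multiplies for pe in pes)
```

In this model the zero check happens once per input rather than once per element, which mirrors the abstract's point that the indicator is generated at the row entry and passed along the row.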
-
Publication No.: US10803007B1
Publication Date: 2020-10-13
Application No.: US16146834
Application Date: 2018-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Thomas A. Volpe, Nafea Bshara, Raymond Scott Whiteside, Ron Diamant
Abstract: Provided are integrated circuit devices and methods for operating integrated circuit devices. In various examples, an integrated circuit device can include a memory for storing instructions, a configuration register, and an instruction execution circuit. An instruction read from the memory can be a reconfigurable instruction, which includes a set of fields corresponding to a plurality of operations. Values in the fields can determine whether the operations are enabled or disabled. For example, a first value in a first field can enable a first operation. Whether the first operation is performed can further be determined by comparing a second value in a second field to a third value read from the configuration register. The value set in the configuration register thus can control whether the operation is performed.
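The two-stage gating described in this abstract can be sketched as follows. The abstract does not say what kind of comparison is used, so equality is assumed here; `Instruction`, `enable`, `match_value`, and `should_perform` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    enable: bool      # first field: enables or disables the operation
    match_value: int  # second field: compared against the configuration register


def should_perform(instr, config_register):
    """Perform the operation only if it is enabled AND the second field
    matches the register value (equality is an assumption of this sketch)."""
    return instr.enable and instr.match_value == config_register
```

Changing only the register value flips the outcome for the same instruction, which is the control path the abstract's last sentence describes.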
-
Publication No.: US10740432B1
Publication Date: 2020-08-11
Application No.: US16219604
Application Date: 2018-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Randy Renfu Huang, Mohammad El-Shabani, Sundeep Amirineni, Kenneth Wayne Patton, Willis Wang
Abstract: Methods and systems for performing hardware computations of mathematical functions are provided. In one example, a system comprises a mapping table that maps each base value of a plurality of base values to parameters related to a mathematical function; a selection module configured to select, based on an input value, a first base value and first parameters mapped to the first base value in the mapping table; and arithmetic circuits configured to: receive, from the mapping table, the first base value and the first plurality of parameters; and compute, based on a relationship between the input value and the first base value, and based on the first parameters, an estimated output value of the mathematical function for the input value.
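One way to read the mapping-table scheme is as a table-driven approximation. The sketch below assumes the stored parameters are the function value and its first derivative at each base value, and that the "relationship" between input and base value is their difference; neither choice is stated in the abstract. The names `build_table`, `select_base`, and `estimate` are hypothetical.

```python
import math


def build_table(f, df, bases):
    """Map each base value to (f(base), f'(base)); this parameter choice is an assumption."""
    return {b: (f(b), df(b)) for b in bases}


def select_base(table, x):
    """Select the base value closest to the input."""
    return min(table, key=lambda b: abs(x - b))


def estimate(table, x):
    """First-order estimate: f(base) + f'(base) * (x - base)."""
    base = select_base(table, x)
    value, slope = table[base]
    return value + slope * (x - base)


# Example table for exp(x) on [0, 2] with base values spaced 0.125 apart.
exp_table = build_table(math.exp, math.exp, [i * 0.125 for i in range(17)])
```

With this spacing, `estimate(exp_table, 1.03)` uses the base value 1.0 and differs from `math.exp(1.03)` only by the second-order remainder; a denser table or extra stored parameters would reduce the error further.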
-
Publication No.: US10708241B1
Publication Date: 2020-07-07
Application No.: US16275824
Application Date: 2019-02-14
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Ron Diamant, Nafea Bshara, Leah Shalev, Erez Izenberg
Abstract: A hardware security accelerator includes a configurable parser that is configured to receive a packet and to extract from the packet headers associated with a set of protocols. The security accelerator also includes a packet type detection unit to determine a type of the packet in response to the set of protocols and to generate a packet type identifier indicative of the type of the packet. A configurable security unit includes a configuration unit and a configurable security engine. The configuration unit configures the configurable security engine according to the type of the packet and to content of at least one of the headers extracted from the packet. The configurable security engine performs security processing of the packet to provide at least one security result.
-
Publication No.: US10678479B1
Publication Date: 2020-06-09
Application No.: US16204943
Application Date: 2018-11-29
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Randy Renfu Huang, Sundeep Amirineni, Jeffrey T. Huynh
Abstract: Provided are integrated circuits and methods for operating integrated circuits. An integrated circuit can include a plurality of memory banks and an execution engine including a set of execution components. Each execution component can be associated with a respective memory bank, and can read from and write to only the respective memory bank. The integrated circuit can further include a set of registers each associated with a respective memory bank from the plurality of memory banks. The integrated circuit can further be operable to load to or store from the set of registers in parallel, and load to or store from the set of registers serially. A parallel operation followed by a serial operation enables data to be moved from many memory banks into one memory bank. A serial operation followed by a parallel operation enables data to be moved from one memory bank into many memory banks.
-
Publication No.: US10579591B1
Publication Date: 2020-03-03
Application No.: US15385740
Application Date: 2016-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Andrea Olgiati, Nathan Binkert
Abstract: Techniques for performing incremental block compression using a processor are described herein. The processor receives a request to compress input data, the request including compression parameters for the compression and a target block size. The processor divides the input data into portions. The processor iteratively compresses the input data to an output block, until compressing another portion of data would increase a file size of the output block over a threshold value that is based at least on the target block size.
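The iterative loop in this abstract can be sketched in software with Python's standard `zlib` module. This is an illustration under assumptions, not the patented method: the threshold is taken to be the target block size itself, DEFLATE stands in for whatever codec the patent covers, and `compress_block` is a hypothetical name.

```python
import zlib


def compress_block(data, portion_size, target_block_size):
    """Compress portions of `data` into one block until the next portion
    would push the finished block past `target_block_size`.

    Returns (block, consumed), where `consumed` is the number of input bytes
    that made it into the block.
    """
    comp = zlib.compressobj()
    out = b""
    consumed = 0
    while consumed < len(data):
        portion = data[consumed:consumed + portion_size]
        # Try the portion on a copy so a rejected portion leaves `comp` untouched.
        trial = comp.copy()
        chunk = trial.compress(portion) + trial.flush(zlib.Z_SYNC_FLUSH)
        # Bytes that would still be needed to terminate the stream.
        tail = trial.copy().flush()
        if len(out) + len(chunk) + len(tail) > target_block_size:
            break
        comp, out = trial, out + chunk
        consumed += len(portion)
    return out + comp.flush(), consumed
```

A caller would invoke the function again on `data[consumed:]` to produce the next block. If even the first portion does not fit, the sketch returns an empty stream with `consumed == 0`, so a real implementation would need a fallback for that case.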
-
Publication No.: US10559345B1
Publication Date: 2020-02-11
Application No.: US16191356
Application Date: 2018-11-14
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Jonathan Cohen, Elad Valfer
Abstract: A decoder is disclosed that is used to select an area of address space in an Integrated Circuit. The decoder uses a hardware shifting module that performs shift operations on constants. Such a structure reduces an overall area consumption of the shifting module. Additionally, the decoder can perform a multi-bit shift operation in a single clock cycle.
-
Publication No.: US10445638B1
Publication Date: 2019-10-15
Application No.: US15908236
Application Date: 2018-02-28
Applicant: Amazon Technologies, Inc.
Inventor: Sundeep Amirineni, Ron Diamant, Randy Huang, Thomas A. Volpe
Abstract: Disclosed herein are techniques for performing neural network computations. In one embodiment, an apparatus may include an array of processing elements, the array having a configurable first effective dimension and a configurable second effective dimension. The apparatus may also include a controller configured to determine at least one of: a first number of input data sets to be provided to the array at the first time or a second number of output data sets to be generated by the array at the second time, and to configure, based on at least one of the first number or the second number, at least one of the first effective dimension or the second effective dimension of the array.
-
Publication No.: US10387350B1
Publication Date: 2019-08-20
Application No.: US15851450
Application Date: 2017-12-21
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Ron Diamant, Ori Weber, Omer Shaked
Abstract: A configurable sponge function engine. The configurable engine includes a register having bitrate and capacity sections, each having a variable size, where a sum of the bitrate and capacity sizes is fixed. A controller generates a bitrate size indication. A configurable message processor receives an input message from an input bus, receives the size indication, fragments the input message into fragmented blocks of a size specified by the size indication, and converts the blocks to a bus width of the bitrate and capacity sizes. An iterative calculator receives the blocks, performs iterative processing operations on the blocks, and stores a result of each operation in the register overwriting a previous register value. An output adaptor receives a value stored in the register after the block corresponding to the end of the input message is processed and outputs the register value converted to have an output bus width.
-
Publication No.: US10366026B1
Publication Date: 2019-07-30
Application No.: US15390250
Application Date: 2016-12-23
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Ron Diamant, Andrea Olgiati, Nathan Binkert
Abstract: A system comprises a data storage, a decompression accelerator configured to decompress compressed data and thereby generate decompressed data, and a direct memory access (DMA) engine coupled to the data storage and the decompression accelerator. The DMA engine comprises a buffer for storage of a plurality of descriptors containing configuration parameters for a block of compressed data to be retrieved from the data storage and decompressed by the decompression accelerator, wherein at least one of the descriptors comprises a threshold value. The DMA engine, in accordance with one or more of the descriptors, is configured to read compressed data from data storage and transmit the threshold value and the compressed data to the decompression accelerator. The decompression accelerator is configured to decompress the compressed data until the threshold value is reached and then to abort further data decompression and to assert a stop transaction signal to the DMA engine.
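The threshold-and-abort behavior can be modeled in software with `zlib.decompressobj`. The sketch assumes the threshold is a cap on the number of decompressed bytes, which the abstract does not specify, and it returns a boolean in place of the stop transaction signal. `Descriptor` and `decompress_with_threshold` are hypothetical names.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Descriptor:
    compressed: bytes  # block of compressed data fetched by the DMA engine
    threshold: int     # assumed meaning: maximum decompressed bytes to produce


def decompress_with_threshold(desc, chunk_size=64):
    """Decompress until the threshold is reached, then abort.

    Returns (data, stop), where `stop` models the stop transaction signal.
    """
    if desc.threshold <= 0:
        raise ValueError("threshold must be positive")
    engine = zlib.decompressobj()
    out = bytearray()
    for off in range(0, len(desc.compressed), chunk_size):
        chunk = desc.compressed[off:off + chunk_size]
        # max_length caps the output so it never exceeds the threshold.
        out += engine.decompress(chunk, desc.threshold - len(out))
        if engine.eof:
            break
        if len(out) >= desc.threshold:
            return bytes(out), True  # abort and signal the DMA engine
    return bytes(out), False
```

For a stream that decompresses to 3000 bytes, a threshold of 100 yields the first 100 bytes with `stop` set, while a threshold of 10000 yields the full payload with `stop` clear.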