Patent search: ap:"Umer Iftikhar Cheema", page 1

1.

Invention publication
APPROXIMATING ACTIVATION FUNCTIONS WITH TAYLOR SERIES (Under examination, published)

Publication No.: US20230351181A1

Publication date: 2023-11-02

Application No.: US18346992

Filing date: 2023-07-05

Applicants: Umer Iftikhar Cheema, Deepak Abraham Mathaikutty, Arnab Raha, Dinakar Kondru, Raymond Jit-Hung Sung, Soumendu Kumar Ghosh

Inventors: Umer Iftikhar Cheema, Deepak Abraham Mathaikutty, Arnab Raha, Dinakar Kondru, Raymond Jit-Hung Sung, Soumendu Kumar Ghosh

IPC classification: G06N3/08, G06N3/04

CPC classification: G06N3/08, G06N3/04

Abstract: An activation function unit can compute activation functions approximated by Taylor series. The activation function unit may include a plurality of compute elements. Each compute element may include two multipliers and an accumulator. The first multiplier may compute intermediate products using an activation, such as an output activation of a DNN layer. The second multiplier may compute terms of a Taylor series approximating an activation function based on the intermediate products from the first multiplier and coefficients of the Taylor series. The accumulator may compute a partial sum of the terms as an output of the activation function. The number of terms may be determined based on a predetermined accuracy of the output of the activation function. The activation function unit may process multiple activations. Different activations may be input into different compute elements in different clock cycles. The activation function unit may compute activation functions with different accuracies.
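The two-multiplier-plus-accumulator structure described in this abstract can be sketched in software. This is only an illustration under stated assumptions: the abstract does not name a specific activation function, so the exponential function is used here as a stand-in, and the names `taylor_coefficients_exp`, `compute_element`, and `terms_for_accuracy` are hypothetical, not taken from the patent.

```python
import math


def taylor_coefficients_exp(num_terms):
    """Coefficients 1/k! of the Taylor series of exp(x) around 0 (assumed example function)."""
    return [1.0 / math.factorial(k) for k in range(num_terms)]


def compute_element(x, coefficients):
    """Model one compute element: two multipliers feeding an accumulator."""
    power = 1.0        # intermediate product x**k, starts at x**0
    partial_sum = 0.0  # accumulator state
    for c in coefficients:
        term = c * power      # second multiplier: coefficient times intermediate product
        partial_sum += term   # accumulator: running partial sum of the terms
        power = power * x     # first multiplier: next intermediate product
    return partial_sum


def terms_for_accuracy(x_max, tolerance, max_terms=64):
    """Pick the number of terms from a target accuracy on [-x_max, x_max].

    Uses the Lagrange remainder bound for exp: after n terms the error is
    at most exp(x_max) * x_max**n / n!.
    """
    for n in range(1, max_terms + 1):
        bound = math.exp(x_max) * x_max ** n / math.factorial(n)
        if bound <= tolerance:
            return n
    raise ValueError("tolerance not reachable within max_terms")
```

A looser tolerance yields fewer terms and therefore fewer multiply-accumulate steps per activation, which is one way to read the statement that the unit may compute activation functions with different accuracies.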

2.

Invention publication
DYNAMIC SPARSITY-BASED ACCELERATION OF NEURAL NETWORKS (Under examination, published)

Publication No.: US20240119269A1

Publication date: 2024-04-11

Application No.: US18543356

Filing date: 2023-12-18

Applicants: Arnab Raha, Dinakar Kondru, Deepak Abraham Mathaikutty, Umer Iftikhar Cheema

Inventors: Arnab Raha, Dinakar Kondru, Deepak Abraham Mathaikutty, Umer Iftikhar Cheema

IPC classification: G06N3/048

CPC classification: G06N3/048

Abstract: A deep neural network (DNN) accelerator may facilitate dynamic sparsity-based acceleration and operate in various sparsity modes including a combined sparsity mode, a weight sparsity mode, an activation sparsity mode, and a dense mode. The DNN accelerator may receive a configuration parameter indicating whether to accelerate a layer of a DNN based on sparsity in a weight tensor of the layer. The configuration parameter may be generated offline, e.g., before the execution of the DNN is started. The DNN accelerator computes one or more activations of the layer in a previous layer in the DNN. The one or more activations are one or more elements of an activation tensor of the layer. The DNN accelerator may determine a sparsity mode for the layer based on the configuration parameter and sparsity in the activation tensor. One or more sparse cells in the DNN accelerator may execute the layer in the sparsity mode.
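The mode decision described in this abstract combines an offline weight-sparsity flag with activation sparsity observed at run time. The sketch below is one possible reading, not the patented logic: the function name `select_sparsity_mode` and the `activation_sparsity_threshold` parameter are assumptions, since the abstract does not say how activation sparsity is evaluated.

```python
def select_sparsity_mode(use_weight_sparsity, activations,
                         activation_sparsity_threshold=0.5):
    """Choose one of the four modes named in the abstract.

    use_weight_sparsity: offline configuration parameter for the layer.
    activations: flat list of activation values produced by the previous layer.
    activation_sparsity_threshold: hypothetical cutoff on the zero fraction.
    """
    zero_count = sum(1 for a in activations if a == 0)
    zero_fraction = zero_count / len(activations) if activations else 0.0
    use_activation_sparsity = zero_fraction >= activation_sparsity_threshold

    if use_weight_sparsity and use_activation_sparsity:
        return "combined"
    if use_weight_sparsity:
        return "weight"
    if use_activation_sparsity:
        return "activation"
    return "dense"
```

For example, `select_sparsity_mode(True, [0, 0, 0, 4])` selects the combined mode under the assumed 0.5 threshold, because three of the four activations are zero.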

3.

Invention publication
SWITCHABLE ONE-SIDED SPARSITY ACCELERATION (Under examination, published)

Publication No.: US20240028895A1

Publication date: 2024-01-25

Application No.: US18476594

Filing date: 2023-09-28

Applicants: Arnab Raha, Deepak Abraham Mathaikutty, Dinakar Kondru, Umer Iftikhar Cheema, Martin Power, Niall Hanrahan

Inventors: Arnab Raha, Deepak Abraham Mathaikutty, Dinakar Kondru, Umer Iftikhar Cheema, Martin Power, Niall Hanrahan

IPC classification: G06N3/08, G06N3/0464

CPC classification: G06N3/08, G06N3/0464

Abstract: A load module in a deep neural network (DNN) accelerator may receive a configuration parameter indicating a selection between an activation sparsity mode and a weight sparsity mode. The load module may read a sparse activation tensor, an activation sparsity bitmap, a sparse weight tensor, and a weight sparsity bitmap from a memory. The load module may densify one of the compressed tensors based on the sparsity mode and leave the other compressed tensor as is. The load module may load the dense tensor and the sparse tensor to a sparse cell. The sparse cell includes a sparsity module that may select one or more elements of the dense tensor based on the sparsity bitmap of the sparse tensor. The sparse cell also includes multiply-accumulate (MAC) units that perform MAC operations on the selected elements and the sparse tensor. MAC operations on unselected elements of the dense tensor are skipped.
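The load-then-select flow in this abstract can be illustrated with a small dot-product sketch. The assumptions are labeled in the comments: tensors are flat lists, bitmaps hold 1 for a stored nonzero element, and in "weight" mode the weights stay compressed while the activations are densified (the abstract does not state which side is densified in which mode). All function names are hypothetical.

```python
def densify(compressed, bitmap):
    """Expand a compressed tensor by inserting zeros where the bitmap bit is 0."""
    values = iter(compressed)
    return [next(values) if bit else 0 for bit in bitmap]


def sparse_cell_mac(dense, sparse_compressed, sparse_bitmap):
    """Sparsity module plus MAC units of one sparse cell.

    Only dense elements whose position is set in the sparse tensor's bitmap
    are selected; MACs for all other positions are skipped.
    """
    selected = [d for d, bit in zip(dense, sparse_bitmap) if bit]
    return sum(d * s for d, s in zip(selected, sparse_compressed))


def one_sided_dot(mode, act_compressed, act_bitmap, wt_compressed, wt_bitmap):
    """Load module: densify one side according to the mode, keep the other compressed."""
    if mode == "weight":        # assumption: weights stay sparse in this mode
        dense = densify(act_compressed, act_bitmap)
        return sparse_cell_mac(dense, wt_compressed, wt_bitmap)
    if mode == "activation":    # assumption: activations stay sparse in this mode
        dense = densify(wt_compressed, wt_bitmap)
        return sparse_cell_mac(dense, act_compressed, act_bitmap)
    raise ValueError("mode must be 'weight' or 'activation'")
```

Both modes produce the same result; they differ only in how many MACs run, which equals the number of set bits in the bitmap of the tensor left compressed.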

4.

Invention application
DEEP NEURAL NETWORK (DNN) ACCELERATORS WITH HETEROGENEOUS TILING (Granted, in force)

Publication No.: US20230140173A1

Publication date: 2023-05-04

Application No.: US17820900

Filing date: 2022-08-19

Applicants: Arnab Raha, Umer Iftikhar Cheema, Praveen Kumar Gupta, Deepak Abraham Mathaikutty, Raymond Jit-Hung Sung

Inventors: Arnab Raha, Umer Iftikhar Cheema, Praveen Kumar Gupta, Deepak Abraham Mathaikutty, Raymond Jit-Hung Sung

IPC classification: G06N3/08

Abstract: A DNN accelerator includes one or more heterogeneous tile sets. A heterogeneous tile set includes tiles of different sizes, e.g., PE arrays including different numbers of columns or rows. The DNN accelerator may identify a tile set from the tile sets for running a DNN model based on dimensions of output tensors of convolutional layers in the DNN. Within the selected tile set, a tile may be selected for a convolutional layer in the DNN, e.g., based on dimensions of the output tensor of the convolutional layer and the size of the tile. After the tile is selected, the workload for running a convolutional operation of the layer may be partitioned and assigned to individual PEs in the tile by partitioning the output tensor into output tensor segments. The workload of computing an individual output tensor segment can be assigned to an individual PE in the tile.
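Tile selection and output-tensor partitioning as described in this abstract might look roughly like the following. The selection criterion used here (least padded work when the output tensor is split evenly across the PE array) is an assumption for illustration; the abstract only says the choice depends on the output tensor dimensions and the tile size. The function names and the two-dimensional (height, width) view of the output tensor are likewise hypothetical.

```python
import math


def pick_tile(tile_set, out_h, out_w):
    """Pick the (rows, cols) PE array that wastes the least padded work (assumed criterion)."""
    def wasted_elements(tile):
        rows, cols = tile
        seg_h = math.ceil(out_h / rows)
        seg_w = math.ceil(out_w / cols)
        return rows * seg_h * cols * seg_w - out_h * out_w
    return min(tile_set, key=wasted_elements)


def partition_output(tile, out_h, out_w):
    """Split an out_h x out_w output tensor into segments, one per PE.

    Returns {(pe_row, pe_col): (h_start, h_end, w_start, w_end)} with
    half-open ranges; PEs that would receive an empty segment are omitted.
    """
    rows, cols = tile
    seg_h = math.ceil(out_h / rows)
    seg_w = math.ceil(out_w / cols)
    assignment = {}
    for r in range(rows):
        for c in range(cols):
            h0, h1 = r * seg_h, min((r + 1) * seg_h, out_h)
            w0, w1 = c * seg_w, min((c + 1) * seg_w, out_w)
            if h0 < h1 and w0 < w1:
                assignment[(r, c)] = (h0, h1, w0, w1)
    return assignment
```

With a hypothetical tile set of 4x4, 2x8, and 8x2 PE arrays and a 6x16 output tensor, the 2x8 array divides the tensor into sixteen 3x2 segments with no padding, so it is the one selected under this criterion.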

5.

Invention publication
DYNAMIC UNCOMPRESSION FOR CHANNEL-SEPARABLE OPERATION IN NEURAL NETWORK (Under examination, published)

Publication No.: US20230221994A1

Publication date: 2023-07-13

Application No.: US18184921

Filing date: 2023-03-16

Applicants: Arnab Raha, Deepak Abraham Mathaikutty, Raymond Jit-Hung Sung, Umer Iftikhar Cheema, Dinakar Kondru, Soumendu Kumar Ghosh

Inventors: Arnab Raha, Deepak Abraham Mathaikutty, Raymond Jit-Hung Sung, Umer Iftikhar Cheema, Dinakar Kondru, Soumendu Kumar Ghosh

IPC classification: G06F9/50, G06F9/54

CPC classification: G06F9/5027, G06F9/54

Abstract: A compute block can dynamically uncompress compressed data for executing a channel-separable operation. The compressed data includes one or more nonzero-valued data elements. The compressed data may be stored in a datastore along with a sparsity bitmap of an input operand including the compressed data. An uncompressing module may determine whether the input operand includes any zero-valued data element, e.g., by determining whether the sparsity bitmap includes a zero-valued bit. After determining that the sparsity bitmap includes a zero-valued bit, the uncompressing module inserts a zero-valued data element into the compressed data based on a position of the bit in the sparsity bitmap, generates uncompressed data, and updates the sparsity bitmap so that all the bits become ones. The uncompressed dense data is transmitted to one or more processing elements (PEs) in the compute block for computing an output operand based on the uncompressed dense data.
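The uncompression step in this abstract can be sketched as a small function. It assumes a flat operand and a bitmap in which 1 marks a stored nonzero element and 0 marks an omitted zero; the name `uncompress_operand` is hypothetical.

```python
def uncompress_operand(compressed, bitmap):
    """Return (dense_data, updated_bitmap) for one input operand.

    If the bitmap has no zero bit, the operand is already dense and is
    passed through unchanged. Otherwise a zero is inserted at every
    position whose bit is 0, and the bitmap is rewritten to all ones.
    """
    if all(bitmap):
        return list(compressed), list(bitmap)
    values = iter(compressed)
    dense = [next(values) if bit else 0 for bit in bitmap]
    return dense, [1] * len(bitmap)
```

For example, `uncompress_operand([5, 7], [1, 0, 0, 1])` yields the dense data `[5, 0, 0, 7]` together with the all-ones bitmap `[1, 1, 1, 1]`, which is the form that would then be sent to the processing elements.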
