-
Publication No.: US20220230058A1
Publication Date: 2022-07-21
Application No.: US17713176
Application Date: 2022-04-04
Applicant: Qualcomm Incorporated
Inventor: Jinxia BAI , Rosario CAMMAROTA , Michael GOLDFARB
Abstract: A neural processing unit (NPU) is described. The NPU includes an NPU direct memory access (NDMA) core. The NDMA core includes a read engine having a read buffer. The NDMA core also includes a write engine having a write buffer. The NPU also includes a controller. The controller is configured to direct the NDMA core to perform hardware memory bandwidth optimization for reading/writing NDMA data in the read buffer and/or NDMA data in the write buffer. The NDMA core is also configured to transparently combine NDMA transaction requests for a data stripe to increase local access to available tensors in artificial neural networks.
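The transaction-combining idea in the abstract can be pictured in software terms: small NDMA requests that touch adjacent regions of a data stripe are merged into fewer, larger bursts. The following is a minimal illustrative sketch, not the patented hardware logic; the `(start_address, length)` request model is an assumption made for the example.

```python
# Hypothetical sketch: coalescing DMA transaction requests for a data stripe.
# Each request is a (start_address, length) pair; adjacent or overlapping
# requests are merged so the stripe is fetched in fewer, larger bursts.

def combine_requests(requests):
    """Merge adjacent/overlapping (start, length) requests into bursts."""
    if not requests:
        return []
    # Sort by start address so mergeable requests become neighbors.
    ordered = sorted(requests)
    merged = [ordered[0]]
    for start, length in ordered[1:]:
        last_start, last_len = merged[-1]
        if start <= last_start + last_len:  # contiguous or overlapping
            new_end = max(last_start + last_len, start + length)
            merged[-1] = (last_start, new_end - last_start)
        else:
            merged.append((start, length))
    return merged

# Four small requests collapse into two bursts.
print(combine_requests([(0, 64), (64, 64), (256, 32), (288, 32)]))
# → [(0, 128), (256, 64)]
```

Merging requests this way increases the effective burst size, which is the usual route to better memory bandwidth utilization.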
-
Publication No.: US20230185532A1
Publication Date: 2023-06-15
Application No.: US18105159
Application Date: 2023-02-02
Applicant: QUALCOMM Incorporated
Inventor: Rexford Alan HILL , Aaron Douglass LAMB , Michael GOLDFARB , Amin ANSARI , Christopher LOTT
CPC classification number: G06F7/5443 , G06F5/06 , G06N3/063
Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
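The compression step the abstract describes can be sketched in plain Python: drop the zero activations, remember each survivor's original column index, and run the multiply-accumulate only over the survivors. This is an illustrative stand-in for the hardware datapath, assuming a simple (index, value) packed format; the function names are hypothetical.

```python
# Hypothetical sketch: exploiting activation sparsity by compressing out
# zero activations before the multiply-accumulate step.

def compress(activations):
    """Keep only non-zero activations, tagged with their column indices."""
    return [[(j, a) for j, a in enumerate(row) if a != 0] for row in activations]

def sparse_matmul(compressed, weights):
    """output[i][k] = sum over non-zero a[i][j] of a[i][j] * w[j][k]."""
    cols = len(weights[0])
    out = [[0] * cols for _ in compressed]
    for i, row in enumerate(compressed):
        for j, a in row:
            for k in range(cols):
                out[i][k] += a * weights[j][k]
    return out

acts = [[0, 3, 0, 1],   # sparse activation tensor: most entries are zero
        [2, 0, 0, 0]]
w = [[1, 0], [0, 1], [1, 1], [2, 0]]
packed = compress(acts)  # fewer stored entries than the dense tensor
print(sparse_matmul(packed, w))
# → [[2, 3], [2, 0]]
```

The work done is proportional to the number of non-zero activations rather than the dense tensor size, which is the payoff of the compressed representation.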
-
Publication No.: US20190325289A1
Publication Date: 2019-10-24
Application No.: US15956674
Application Date: 2018-04-18
Applicant: QUALCOMM Incorporated
Inventor: Rosario CAMMAROTA , Michael GOLDFARB , Manu RASTOGI , Sarang OZARDE
Abstract: An apparatus for optimizing a computational network is configured to receive an input at a first processing component. The first processing component may include at least a first programmable processing component and a second programmable processing component. The first programmable processing component is configured to compute a first nonlinear function and the second programmable processing component is configured to compute a second nonlinear function which is different from the first nonlinear function. The computational network, which may be a recurrent neural network such as a long short-term memory (LSTM), may be operated to generate an inference based at least in part on outputs of the first programmable processing component and the second programmable processing component.
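An LSTM is a natural fit for two distinct nonlinear units, since its gates use a sigmoid while its candidate and output paths use tanh. The sketch below is illustrative only: it configures the two programmable components as sigmoid and tanh and wires them into one scalar LSTM step; the weights and wiring are simplified assumptions, not the patented design.

```python
import math

# Hypothetical sketch: two programmable nonlinear units, configured here as
# sigmoid and tanh, combined the way an LSTM cell uses them.

def unit_a(x):           # first programmable component: sigmoid
    return 1.0 / (1.0 + math.exp(-x))

def unit_b(x):           # second programmable component: tanh
    return math.tanh(x)

def lstm_step(x, h, c, w):
    """One scalar LSTM step built from the two nonlinear units."""
    i = unit_a(w["i"] * x + h)      # input gate (sigmoid)
    f = unit_a(w["f"] * x + h)      # forget gate (sigmoid)
    o = unit_a(w["o"] * x + h)      # output gate (sigmoid)
    g = unit_b(w["g"] * x + h)      # candidate cell state (tanh)
    c_new = f * c + i * g
    h_new = o * unit_b(c_new)
    return h_new, c_new

h, c = lstm_step(x=1.0, h=0.0, c=0.0,
                 w={"i": 1.0, "f": 1.0, "o": 1.0, "g": 1.0})
print(round(h, 4), round(c, 4))
```

The inference output is driven by both units' results, mirroring the abstract's point that the network's inference depends on the outputs of both programmable components.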
-