Patent search ap:("QUALCOMM Incorporated") AND inv:"Karamvir CHATHA" Page 1

1.

Invention application
ZERO OVERHEAD LOOP EXECUTION IN DEEP LEARNING ACCELERATORS [Under examination - Published]

Publication number: US20190303156A1

Publication date: 2019-10-03

Application number: US15942344

Filing date: 2018-03-30

Applicant: QUALCOMM Incorporated

Inventor: Amrit PANDA , Francisco PEREZ , Karamvir CHATHA

IPC: G06F9/32 , G06F9/30 , G06F9/38 , G06N3/063

Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.
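
The decoupling described in this abstract can be loosely illustrated in software. The following is only a sketch under stated assumptions, not the patented hardware: a worker thread stands in for the second processor and handles the loop overhead (index and address generation for a nested loop), while the main thread stands in for the first processor and performs only the compute operations. The names `overhead_worker` and `run_nested_loop`, the flat row-major addressing, and the sum-of-squares workload are all hypothetical.

```python
import queue
import threading


def overhead_worker(rows, cols, addresses):
    """Loop overhead only: counters, bounds checks, address generation."""
    for r in range(rows):
        for c in range(cols):
            addresses.put(r * cols + c)  # flat row-major address (assumed layout)
    addresses.put(None)  # sentinel: the loop structure is finished


def run_nested_loop(data, rows, cols):
    """Compute only: consume ready-made addresses and accumulate squares."""
    addresses = queue.Queue(maxsize=4)  # small buffer between the two sides
    worker = threading.Thread(target=overhead_worker, args=(rows, cols, addresses))
    worker.start()
    total = 0
    while (addr := addresses.get()) is not None:
        total += data[addr] * data[addr]
    worker.join()
    return total
```

The compute side never touches a loop counter; it only consumes addresses, which is the separation the abstract attributes to the second processor.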

2.

Invention application
LOW-POWER ARCHITECTURE FOR SPARSE NEURAL NETWORK [Under examination - Published]

Publication number: US20180164866A1

Publication date: 2018-06-14

Application number: US15377858

Filing date: 2016-12-13

Applicant: QUALCOMM Incorporated

Inventor: Yatish Girish TURAKHIA , Javid JAFFARI , Amrit PANDA , Karamvir CHATHA

IPC: G06F1/32 , G06N3/02

CPC classification number: G06F1/3206 , G06N3/02 , G06N3/0454 , G06N3/063

Abstract: A method, a computer-readable medium, and an apparatus for reducing power consumption of a neural network are provided. The apparatus may retrieve, from a tag storage, at least one tag value of a first tag value for a weight in the neural network or a second tag value for an activation in the neural network. The first tag value may indicate whether the weight is zero and the second tag value may indicate whether the activation is zero. The weight and the activation are to be loaded to a multiplier of a multiplier-accumulator unit as a pair of operands. The apparatus may determine whether the at least one tag value indicates a zero value. The apparatus may disable loading the weight and the activation to the multiplier when the at least one tag value indicates a zero value. The apparatus may disable updating of zero-value activations.
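
A rough software analogue of the tag-based gating in this abstract is sketched below. It assumes one boolean tag per weight and per activation, where `True` marks a zero value; the functions `build_tags` and `gated_mac` and the returned load count are hypothetical and serve only to show which operand loads would be suppressed.

```python
def build_tags(values):
    """Stand-in for the tag storage: True marks a zero value."""
    return [v == 0 for v in values]


def gated_mac(weights, activations):
    """Multiply-accumulate that skips operand pairs tagged as zero.

    Returns (accumulated sum, number of multiplier loads performed).
    """
    weight_tags = build_tags(weights)
    activation_tags = build_tags(activations)
    acc = 0
    loads = 0
    for w, a, w_zero, a_zero in zip(weights, activations, weight_tags, activation_tags):
        if w_zero or a_zero:
            continue  # loading disabled: the product would be zero anyway
        acc += w * a
        loads += 1
    return acc, loads
```

For example, `gated_mac([1, 0, 3, 4], [5, 6, 0, 2])` performs two multiplier loads instead of four, and the accumulated result is unchanged by the skipped pairs.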

3.

Invention application
ADAPTIVE QUANTIZATION FOR EXECUTION OF MACHINE LEARNING MODELS [Granted]

Publication number: US20210279635A1

Publication date: 2021-09-09

Application number: US16810123

Filing date: 2020-03-05

Applicant: QUALCOMM Incorporated

Inventor: Serag GADELRAB , Karamvir CHATHA , Ofer ROSENBERG

IPC: G06N20/00 , G06N5/04 , G06F11/34

Abstract: Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared. When it is determined that results of the second inferences are within a threshold performance level of results of the first inferences, one or more subsequent inferences are performed, based on that determination, using the machine learning model and the quantized weight information.
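
The compare-then-switch flow in this abstract can be sketched as follows. This is a minimal illustration under several assumptions that are not in the source: the "model" is a single dot product, quantization is symmetric and uniform, and the "threshold performance level" is approximated by the worst-case output deviation over a set of sample inputs. The names `quantize`, `infer`, `choose_weights`, and the `tolerance` parameter are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to `bits` bits, returned as floats."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def infer(weights, x):
    """Toy model: a single dot product."""
    return sum(w * xi for w, xi in zip(weights, x))


def choose_weights(weights, samples, bits=8, tolerance=0.05):
    """Run both inferences on the samples and pick the weights to keep using.

    Returns (weights to use for subsequent inferences, whether quantized).
    """
    quantized = quantize(weights, bits)
    worst = max(abs(infer(weights, x) - infer(quantized, x)) for x in samples)
    if worst <= tolerance:
        return quantized, True
    return weights, False
```

With weights `[0.5, -0.25, 1.0]` and a tolerance of 0.05, 8-bit quantization passes the comparison, whereas 2-bit quantization rounds the two smaller weights to zero and is rejected, so the original weights stay in use.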

4.

Invention application
INSTRUCTION SET FOR MINIMIZING CONTROL VARIANCE OVERHEAD IN DATAFLOW ARCHITECTURES [Under examination - Published]

Publication number: US20200089497A1

Publication date: 2020-03-19

Application number: US16134945

Filing date: 2018-09-18

Applicant: QUALCOMM Incorporated

Inventor: Rakesh KOMURAVELLI , Amin ANSARI , Ramesh Chandra CHAUHAN , Karamvir CHATHA

IPC: G06F9/30 , G06N5/02

Abstract: Systems and methods for minimizing control variance overhead in a dataflow processor include receiving a generating instruction specifying at least an acknowledge predicate based on a first number, a second number, and a first value, wherein a true branch comprises the first number of consumer instructions of the generating instruction based on the first value, used as a first predicate, being true; and a false branch comprises a second number of consumer instructions of the generating instruction based on the first value, used as the first predicate, being false. The acknowledge predicate is evaluated to be a selected number, which is the first number if the first value is true, or the second number if the first value is false. The generating instruction is fired upon the selected number of acknowledge arcs being received from the true branch or the false branch.
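
The selection rule for the acknowledge predicate can be written out in a few lines. This is only a sketch of the counting logic as the abstract states it, with hypothetical function and parameter names; it does not model the instruction encoding or the dataflow fabric itself.

```python
def acks_required(first_value, true_consumers, false_consumers):
    """Evaluate the acknowledge predicate to the selected number."""
    return true_consumers if first_value else false_consumers


def can_fire(first_value, true_consumers, false_consumers, acks_received):
    """The generating instruction may fire once the selected number of
    acknowledge arcs has arrived from whichever branch was taken."""
    return acks_received >= acks_required(first_value, true_consumers, false_consumers)
```

With three consumers on the true branch and one on the false branch, two received acknowledges are not enough when the first value is true, but a single acknowledge suffices when it is false.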

5.

Invention application
ARCHITECTURE FOR SPARSE NEURAL NETWORK ACCELERATION [Under examination - Published]

Publication number: US20180189056A1

Publication date: 2018-07-05

Application number: US15393670

Filing date: 2016-12-29

Applicant: QUALCOMM Incorporated

Inventor: Yatish Girish TURAKHIA , Javid JAFFARI , Amrit PANDA , Karamvir CHATHA

IPC: G06F9/30

CPC classification number: G06F9/3001 , G06N3/0454 , G06N3/063

Abstract: A method, a computer-readable medium, and an apparatus for a sparse neural network are provided. The apparatus may include a hardware accelerator. The apparatus may determine, for each pair of operands to be processed by a MAR unit, whether both operands of the pair are non-zero. The apparatus may prevent a pair of operands to be processed by the MAR unit from being loaded to a multiplier of the MAR unit when an operand of the pair of operands is zero. The apparatus may place the pair of operands into one of a plurality of queues when both operands of the pair of operands are non-zero.
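
The filtering and queueing of operand pairs can be sketched in software as below. The abstract does not say how a queue is chosen for a non-zero pair, so the round-robin placement here is an assumption, as are the names `dispatch_pairs` and `drain` and the default of two queues.

```python
from collections import deque


def dispatch_pairs(pairs, num_queues=2):
    """Drop pairs with a zero operand; queue the rest for the multiplier."""
    queues = [deque() for _ in range(num_queues)]
    kept = 0
    for a, b in pairs:
        if a == 0 or b == 0:
            continue  # never loaded to the multiplier
        queues[kept % num_queues].append((a, b))  # round-robin (assumed policy)
        kept += 1
    return queues


def drain(queues):
    """Multiply-accumulate everything that reached the queues."""
    return sum(a * b for q in queues for a, b in q)
```

Only pairs in which both operands are non-zero ever reach a queue, so the multiplier work scales with the number of useful products rather than with the total number of pairs.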

6.

Invention application
APPROXIMATION OF NON-LINEAR FUNCTIONS IN FIXED POINT USING LOOK-UP TABLES [Under examination - Published]

Publication number: US20180060278A1

Publication date: 2018-03-01

Application number: US15255015

Filing date: 2016-09-01

Applicant: QUALCOMM Incorporated

Inventor: Dexu LIN , Edward LIAO , Somdeb MAJUMDAR , Aaron LAMB , Karamvir CHATHA

IPC: G06F17/17

CPC classification number: G06F17/17 , G06F7/544 , G06F2207/5354

Abstract: Computing a non-linear function ƒ(x) in hardware or embedded systems can be complex and resource intensive. In one or more aspects of the disclosure, a method, a computer-readable medium, and an apparatus are provided for computing a non-linear function ƒ(x) accurately and efficiently in hardware using look-up tables (LUTs) and interpolation or extrapolation. The apparatus may be a processor. The processor computes a non-linear function ƒ(x) for an input variable x, where ƒ(x)=g(y(x),z(x)). The processor determines an integer n by determining a position of a most significant bit (MSB) of an input variable x. In addition, the processor determines a value for y(x) based on a first look-up table and the determined integer n. Also, the processor determines a value for z(x) based on n and the input variable x, and based on a second look-up table. Further, the processor computes ƒ(x) based on the determined values for y(x) and z(x).
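
One way to picture the two-table decomposition ƒ(x)=g(y(x),z(x)) is a square-root approximation, sketched below. The choice of the square root, the product form of g, the table sizes, and the use of floating-point table entries (rather than the fixed-point values the title refers to) are all assumptions made for readability; the abstract does not specify them. Here n is the MSB position of a positive integer x, the first table supplies y = sqrt(2^n), and the second table supplies z = sqrt(x / 2^n) by linear interpolation.

```python
import math

FRAC_BITS = 4   # second table has 2**FRAC_BITS segments (assumed size)
MAX_BITS = 32   # largest supported input width (assumed)

# First look-up table: y = sqrt(2**n), indexed by the MSB position n.
Y_TABLE = [math.sqrt(2.0 ** n) for n in range(MAX_BITS)]
# Second look-up table: sqrt(1 + k / 2**FRAC_BITS) at the segment boundaries.
Z_TABLE = [math.sqrt(1.0 + k / 2 ** FRAC_BITS) for k in range(2 ** FRAC_BITS + 1)]


def approx_sqrt(x):
    """Approximate sqrt(x) for an integer 0 < x < 2**MAX_BITS."""
    if not 0 < x < 2 ** MAX_BITS:
        raise ValueError("x out of supported range")
    n = x.bit_length() - 1               # position of the most significant bit
    y = Y_TABLE[n]                       # from the first table and n
    frac = (x - (1 << n)) / (1 << n)     # remaining bits, scaled into [0, 1)
    pos = frac * 2 ** FRAC_BITS
    k = int(pos)                         # segment index in the second table
    t = pos - k                          # position inside the segment
    z = Z_TABLE[k] + t * (Z_TABLE[k + 1] - Z_TABLE[k])  # linear interpolation
    return y * z                         # g(y, z) = y * z in this sketch
```

With 16 interpolation segments the relative error of this sketch stays well below 0.1% for the inputs checked; a finer second table would tighten it further at the cost of storage.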
