Patent search ap:("QUALCOMM Incorporated") AND inv:"Balaji Calidas" Page 1

1.

Granted invention patent
Methods and apparatus for tensor object support in machine learning workloads (status: in force)

Publication No.: US11481865B2

Publication date: 2022-10-25

Application No.: US17173643

Filing date: 2021-02-11

Applicant: QUALCOMM Incorporated

Inventor: Elina Kamenetskaya, Liang Li, Andrew Evan Gruber, Jeffrey Leger, Balaji Calidas, Ruihao Zhang

IPC: G06T1/60

Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may modify at least one texture memory object to support a data structure for one or more tensor objects. The apparatus may also determine one or more supported memory layouts for the one or more tensor objects based on the modified at least one texture memory object. Additionally, the apparatus may access data associated with the one or more tensor objects based on the one or more supported memory layouts, the data for each of the one or more tensor objects corresponding to at least one data instruction. The apparatus may also execute the at least one data instruction based on the accessed data associated with the one or more tensor objects.
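For orientation only, here is a minimal sketch of what one "supported memory layout" for a tensor stored in a texture object could look like. It is not taken from the patent: the function name `nhwc_to_texel`, the NHWC ordering, and the choice to pack four channels per RGBA texel are all assumptions made for illustration.

```python
def nhwc_to_texel(n, h, w, c, H, W, C):
    """Map tensor element (n, h, w, c) to (x, y, lane) in a 2D RGBA texture.

    Hypothetical layout (an assumption, not the patented one):
    channels are packed four per texel, channel groups for one pixel sit
    side by side along x, and batch images are stacked along y.
    """
    groups = (C + 3) // 4      # number of 4-channel groups per pixel
    x = w * groups + c // 4    # texel column
    y = n * H + h              # texel row
    lane = c % 4               # component within the texel (R, G, B or A)
    return x, y, lane


# Example: a 2x2x3x8 tensor (N=2, H=2, W=3, C=8)
print(nhwc_to_texel(1, 1, 2, 5, H=2, W=3, C=8))
```

A layout like this lets a shader fetch four adjacent channels with a single texel read, which is one reason a texture-backed tensor might be attractive; the patent's actual layouts may differ.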

2.

Granted invention patent
Methods and apparatus to facilitate improving processing of machine learning primitives (status: in force)

Publication No.: US11263064B2

Publication date: 2022-03-01

Application No.: US16730243

Filing date: 2019-12-30

Applicant: QUALCOMM Incorporated

Inventor: Hitendra Gangani, Balaji Calidas, Jeremy Williams

IPC: G06F9/44, G06F9/54, G06F9/48, G06N20/00, G06K9/62, G06T15/00, G06F9/38

Abstract: The present disclosure relates to methods and apparatus for machine learning processing. For example, disclosed techniques facilitate improving execution of machine learning primitives. Aspects of the present disclosure may store a command stream generated by an application in a buffer, the command stream including a plurality of machine learning primitives for execution by a graphics processor. Further, aspects of the present disclosure identify, after receiving a request from the application to finalize the buffer, two or more machine learning primitives of the buffer that may be replaced with a fused shader kernel. Additionally, aspects of the present disclosure may store the fused shader kernel in the buffer to generate a fused command buffer.
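The idea of replacing adjacent primitives with a fused kernel when the buffer is finalized can be sketched as follows. This is an illustrative simplification, not the patented method: the primitive names, the `FUSABLE` table, and the `finalize` function are hypothetical, and only adjacent pairs are considered.

```python
# Hypothetical table of adjacent primitive pairs that have a fused kernel.
FUSABLE = {
    ("conv2d", "relu"): "conv2d_relu",
    ("matmul", "bias_add"): "matmul_bias",
}


def finalize(commands):
    """Return a fused command list from a recorded list of primitive names.

    Scans left to right; whenever two neighbouring primitives appear in
    FUSABLE, both are replaced by the single fused kernel.
    """
    fused = []
    i = 0
    while i < len(commands):
        pair = tuple(commands[i:i + 2])
        if pair in FUSABLE:
            fused.append(FUSABLE[pair])
            i += 2          # both primitives consumed by the fused kernel
        else:
            fused.append(commands[i])
            i += 1
    return fused


print(finalize(["conv2d", "relu", "pool", "matmul", "bias_add"]))
# -> ['conv2d_relu', 'pool', 'matmul_bias']
```

Deferring the scan until finalization means the whole command stream is visible at once, so fusion candidates do not have to be detected while the application is still recording.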

3.

Granted invention patent
Adaptive dispatch for acceleration of deep neural networks on graphic processing units (status: in force)

Publication No.: US11145024B2

Publication date: 2021-10-12

Application No.: US16728591

Filing date: 2019-12-27

Applicant: QUALCOMM Incorporated

Inventor: Balaji Calidas, Joshua Walter Kelly, Avinash Seetharamaiah, Jonnala Gadda Nagendra Kumar, Hitendra Mohan Gangani

IPC: G06T1/20, G06T15/00, G06N20/00, G06F9/38, G06N3/08, G06F9/48

Abstract: Methods, systems, and devices for processing are described. A device may parse a set of layers of a deep neural network. The set of layers may be associated with a set of machine learning operations of the deep neural network. The device may determine one or more layer parameters based on the determined set of layers. In some aspects, the device may determine an execution time associated with executing a shader dispatch based on the one or more layer parameters. The device may batch the shader dispatch to a command buffer based on the execution time and process the command buffer based on the batching. The device may determine a target execution time based on an assembly time associated with the command buffer, a processing time associated with the command buffer, a frequency level associated with processing the command buffer, the one or more layer parameters, or some combination thereof.
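As a rough illustration of batching shader dispatches against a target execution time, the sketch below greedily groups dispatches so that each command buffer's estimated time stays at or under a target. The function name `batch_dispatches`, the per-dispatch estimates, and the greedy policy are assumptions for illustration; the patent derives its target from assembly time, processing time, frequency level, and layer parameters, none of which are modelled here.

```python
def batch_dispatches(estimated_ms, target_ms):
    """Group dispatch indices into command buffers.

    estimated_ms: estimated execution time of each dispatch (hypothetical
                  values, e.g. derived from layer parameters).
    target_ms:    target execution time per command buffer.
    A new buffer is started when adding the next dispatch would exceed
    the target; a single oversized dispatch still gets its own buffer.
    """
    batches, current, total = [], [], 0.0
    for idx, t in enumerate(estimated_ms):
        if current and total + t > target_ms:
            batches.append(current)
            current, total = [], 0.0
        current.append(idx)
        total += t
    if current:
        batches.append(current)
    return batches


print(batch_dispatches([1, 1, 3, 2, 2], target_ms=4))
# -> [[0, 1], [2], [3, 4]]
```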
