Patent search ap:("QUALCOMM INCORPORATED") AND inv:"Andrew Evan Gruber" Page 5

41.

Invention application
UTILIZING PIPELINE REGISTERS AS INTERMEDIATE STORAGE (In force)

Publication number: US20150324196A1

Publication date: 2015-11-12

Application number: US14275047

Application date: 2014-05-12

Applicant: QUALCOMM Incorporated

Inventor: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber

IPC: G06F9/30, G06F9/38

CPC classification number: G06F9/3012, G06F9/30032, G06F9/3017, G06F9/3869, G06F9/3875

Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
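
The four-cycle sequence in the abstract can be sketched as a small simulation. This is only an illustrative model, assuming a two-stage pipeline in which a value written to the initial pipeline register reaches the final pipeline register one cycle later. The function name `run_moves`, the tuple layout, and the register indices are hypothetical and do not come from the patent.

```python
def run_moves(gprs, moves):
    """Simulate GPR-to-GPR moves through a two-stage pipeline.

    gprs  -- list of GPR values, modified in place
    moves -- list of (source_index, destination_index) pairs
    Returns a log of (cycle, event) tuples.
    """
    initial = None  # initial pipeline register: (value, destination) or None
    final = None    # final pipeline register: (value, destination) or None
    pending = list(moves)
    log = []
    cycle = 0
    while pending or initial is not None or final is not None:
        cycle += 1
        # Final logic unit: drain the final pipeline register into a GPR.
        if final is not None:
            value, dst = final
            gprs[dst] = value
            log.append((cycle, f"final -> r{dst}"))
        # Clock edge: the initial register's contents advance to the final one.
        final = initial
        # Initial logic unit: latch the next source value, if any remain.
        if pending:
            src, dst = pending.pop(0)
            initial = (gprs[src], dst)
            log.append((cycle, f"r{src} -> initial"))
        else:
            initial = None
    return log


# Move r0 -> r2 and r1 -> r3, as in the abstract's first/second/third/fourth GPRs.
gprs = [10, 20, 0, 0]
log = run_moves(gprs, [(0, 2), (1, 3)])
for cycle, event in log:
    print(cycle, event)
print(gprs)
```

In this model the two values sit only in pipeline registers between the read and the write, so no additional GPR is needed as temporary storage.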

42.

Invention application
TECHNIQUES FOR SERIALIZED EXECUTION IN A SIMD PROCESSING SYSTEM (Under examination, published)

Publication number: US20150317157A1

Publication date: 2015-11-05

Application number: US14268215

Application date: 2014-05-02

Applicant: QUALCOMM Incorporated

Inventor: Andrew Evan Gruber, Lin Chen, Yun Du, Alexei Vladimirovich Bourd

IPC: G06F9/30

CPC classification number: G06F9/3851, G06F9/3887

Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.
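
The select-one-active-thread step described in the abstract can be illustrated with a short sketch. It assumes the active threads are given as a boolean mask, that the lowest-numbered active lane is selected first, and that the selection repeats until every active lane has been served. None of these details are stated in the abstract, and the names `serialize` and `operation` are hypothetical.

```python
def serialize(active_mask, operation):
    """Run `operation` once per active lane, one lane at a time.

    active_mask -- list of booleans, one per SIMD lane (thread)
    operation   -- callable taking a lane index; stands in for the serial operation
    Returns a dict mapping each active lane index to its result.
    """
    results = {}
    remaining = list(active_mask)  # copy so the caller's mask is untouched
    while any(remaining):
        # Select one active thread (lowest index first; an assumption).
        lane = remaining.index(True)
        # Perform the operation for that single thread only.
        results[lane] = operation(lane)
        # Mark the thread as served so the next iteration picks another one.
        remaining[lane] = False
    return results


# Lanes 1 and 2 are active; each is handled in its own serial step.
print(serialize([False, True, True, False], lambda lane: lane * 10))
```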

43.

Granted invention patent
Methods and apparatus to facilitate a dedicated bindless state processor (In force)

Publication number: US12056790B2

Publication date: 2024-08-06

Application number: US17758219

Application date: 2020-01-31

Applicant: QUALCOMM Incorporated

Inventor: Yun Du, Andrew Evan Gruber, Chun Yu, Chihong Zhang, Thomas Edwin Frisinger, Richard Hammerstone, Zilin Ying, Heng Qi, Quanquan Xu, Sheng Gu

IPC: G06T1/60

CPC classification number: G06T1/60

Abstract: The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.

44.

Granted invention patent
Dynamic wave pairing (In force)

Publication number: US11954758B2

Publication date: 2024-04-09

Application number: US17652478

Application date: 2022-02-24

Applicant: QUALCOMM Incorporated

Inventor: Yun Du, Andrew Evan Gruber, Zilin Ying, Chunling Hu, Baoguang Yang, Yang Xia, Gang Zhong, Chun Yu, Eric Demers

IPC: G06T1/20, G06F9/50

CPC classification number: G06T1/20, G06F9/505

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

45.

Granted invention patent
GPU wave-to-wave optimization (In force)

Publication number: US11928754B2

Publication date: 2024-03-12

Application number: US17658433

Application date: 2022-04-07

Applicant: QUALCOMM Incorporated

Inventor: Andrew Evan Gruber

IPC: G06T1/20, G06T15/00, G06T15/80

CPC classification number: G06T1/20, G06T15/005, G06T15/80

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for GPU wave-to-wave optimization. A graphics processor may execute a shader program for a first wave associated with a draw call or a compute kernel. The graphics processor may identify at least one first indication for the first wave associated with the draw call or the compute kernel. The graphics processor may store the at least one first indication for the first wave to a memory location. The graphics processor may execute the shader program for at least one second wave associated with the draw call or the compute kernel. The execution of the shader program for the at least one second wave may be based on the shader program for the at least one second wave reading the memory location to retrieve the at least one first indication.

46.

Granted invention patent
Methods and apparatus to perform matrix multiplication in a streaming processor (In force)

Publication number: US11829439B2

Publication date: 2023-11-28

Application number: US17137226

Application date: 2020-12-29

Applicant: QUALCOMM Incorporated

Inventor: Yun Du, Gang Zhong, Fei Wei, Yibin Zhang, Jing Han, Hongjiang Shang, Elina Kamenetskaya, Minjie Huang, Alexei Vladimirovich Bourd, Chun Yu, Andrew Evan Gruber, Eric Demers

IPC: G06F17/16, G06F7/57

CPC classification number: G06F17/16, G06F7/57

Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.

47.

Granted invention patent
Configurable apron support for expanded-binning (In force)

Publication number: US11682109B2

Publication date: 2023-06-20

Application number: US17073218

Application date: 2020-10-16

Applicant: QUALCOMM Incorporated

Inventor: Kalyan Kumar Bhiravabhatla, Krishnaiah Gummidipudi, Ankit Kumar Singh, Andrew Evan Gruber, Pavan Kumar Akkaraju, Srihari Babu Alla, Jonnala Gadda Nagendra Kumar, Vishwanath Shashikant Nikam

IPC: G06T5/40, G06T7/13, G06T1/20, G06T5/20, G06T15/00, G06T15/40

CPC classification number: G06T5/40, G06T1/20, G06T5/20, G06T7/13, G06T15/005, G06T15/40

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for configurable aprons for expanded binning. Aspects of the present disclosure include identifying one or more pixel tiles in at least one bin and determining edge information for each pixel tile of the one or more pixel tiles. The edge information may be associated with one or more pixels adjacent to each pixel tile. The present disclosure further describes determining whether at least one adjacent bin is visible based on the edge information for each pixel tile, where the at least one adjacent bin may be adjacent to the at least one bin.

48.

Granted invention patent
Bin filtering (In force)

Publication number: US11600002B2

Publication date: 2023-03-07

Application number: US16892096

Application date: 2020-06-03

Applicant: QUALCOMM Incorporated

Inventor: Jian Liang, Andrew Evan Gruber, Tao Wang, Srihari Babu Alla, Kalyan Kumar Bhiravabhatla, Jonnala Gadda Nagendra Kumar, William Licea-Kane, Fredrick Alan Hickman

IPC: G06T7/11, G06T1/60, G06T1/20, G06T7/136, G06T5/40

Abstract: Methods, systems, and devices for graphics processing are described. A device may receive an image including a set of pixels. The device may render a first subset of pixels in each bin of a set of bins during a first rendering pass, and defer rendering a second subset of pixels and a third subset of pixels in each bin of the set of bins during the first rendering pass. The second subset of pixels may include edge pixels and the third subset of pixels may be between the first subset of pixels and the second subset of pixels. The device may render the second subset of pixels and the third subset of pixels in each bin of the set of bins during a second rendering pass based on rendering the first subset of pixels. The device may then output the image based on the first and second rendering pass.

49.

Granted invention patent
Deferred GPR allocation for texture/load instruction block (In force)

Publication number: US11204765B1

Publication date: 2021-12-21

Application number: US17003600

Application date: 2020-08-26

Applicant: QUALCOMM Incorporated

Inventor: Yun Du, Fei Wei, Gang Zhong, Minjie Huang, Jian Jiang, Zilin Ying, Baoguang Yang, Yang Xia, Jing Han, Liangxiao Hu, Chihong Zhang, Chun Yu, Andrew Evan Gruber, Eric Demers

IPC: G06F9/30, G06F9/38, G06T1/20, G06F9/50

Abstract: A graphics processing unit (GPU) utilizes block general purpose registers (bGPRs) to load multiple waves of samples for an instruction group into a processing pipeline and receive processed samples from the pipeline. The GPU acquires a credit for the bGPR for execution of the instruction group for a first wave using a persistent GPR and the bGPR. The GPU refunds the credit upon loading the first wave into the pipeline. The GPU executes a subsequent wave for the instruction group to load samples to the pipeline when at least one credit is available and the pipeline is processing the first wave. The GPU stores an indication of each wave that has been loaded into the pipeline in a queue. The GPU returns samples for a next wave in the queue from the pipeline to the bGPR for further processing when the physical slot of the bGPR is available.

50.

Granted invention patent
Graphics instruction operands alias (In force)

Publication number: US11132760B2

Publication date: 2021-09-28

Application number: US16714052

Application date: 2019-12-13

Applicant: QUALCOMM Incorporated

Inventor: Yun Du, Andrew Evan Gruber, Chihong Zhang, Gang Zhong, Jian Jiang, Fei Wei, Minjie Huang, Zilin Ying, Yang Xia, Jing Han, Chun Yu, Eric Demers

IPC: G06T1/20, G06F9/30, G06F9/50, G06F9/38, G06F1/03

Abstract: Methods, systems, and devices for graphic processing are described. The methods, systems, and devices may include or be associated with identifying a graphics instruction, determining that the graphics instruction is alias enabled for the device, partitioning an alias lookup table into one or more slots, allocating a slot of the alias lookup table based on the partitioning and determining that the graphics instruction is alias enabled, generating an alias instruction based on allocating the slot of the alias lookup table and determining that the graphics instruction is alias enabled, and processing the alias instruction.
