Patent search ap:("QUALCOMM Incorporated") AND inv:"Gang Zhong" Page 1

1.

发明授权
Removal of degenerated sub-primitives in tessellation 有权

公开(公告)号：US10580209B2

公开(公告)日：2020-03-03

申请号：US15913480

申请日：2018-03-06

Applicant: QUALCOMM Incorporated

Inventor： Li Shen , Gang Zhong , Yan Li

IPC: G06T17/20 , G06T17/10

Abstract: This disclosure describes a method and apparatus for performing tessellation in a graphics process. A graphics processing unit may be configured to determine tessellation factors for a first patch of the graphics data, determine, based on the tessellation factors, that a first edge of an outermost ring of the first patch will produce only degenerated sub-primitives, and skip performing tessellation for the first edge. A graphics processing unit may determine that a second edge of the outermost ring of the first patch will produce at least some normal sub-primitives, and perform tessellation for the second edge to produce output primitives.

2.

发明授权
Stereoscopic view processing 有权

公开(公告)号：US10186008B2

公开(公告)日：2019-01-22

申请号：US14855962

申请日：2015-09-16

Applicant: QUALCOMM Incorporated

Inventor： Gang Zhong , Vineet Goel , Young In Yeo , Juraj Obert

IPC: G06T15/00 , G06T1/20 , G06T15/80 , H04N13/106 , H04N13/139 , H04N13/261 , H04N13/275 , H04N13/30

Abstract: Techniques are described for stereoscopic view generation. A graphics processing unit (GPU) may combine attribute information for two or more corresponding vertices of corresponding primitives in different views. The GPU may process the combined attributed information to generate graphics data for the stereoscopic view.

3.

发明授权
Runtime mechanism to optimize shader execution flow 有权

公开(公告)号：US12229864B2

公开(公告)日：2025-02-18

申请号：US17817815

申请日：2022-08-05

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Eric Demers , Andrew Evan Gruber , Chun Yu , Baoguang Yang , Chihong Zhang , Yuehai Du , Avinash Seetharamaiah , Jonnala Gadda Nagendra Kumar , Gang Zhong , Zilin Ying , Fei Wei

IPC: G06T15/00 , G06T15/80

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.

4.

发明授权
Run-time mechanism for optimal shader 有权

公开(公告)号：US12067666B2

公开(公告)日：2024-08-20

申请号：US17664033

申请日：2022-05-18

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Eric Demers , Andrew Evan Gruber , Chun Yu , Chihong Zhang , Baoguang Yang , Yuehai Du , Gang Zhong , Avinash Seetharamaiah , Jonnala Gadda Nagendra Kumar

IPC: G06T15/00 , G06T1/60

CPC classification number: G06T15/005 , G06T1/60

Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.

5.

发明授权
Fast incremental shared constants 有权

公开(公告)号：US11694384B2

公开(公告)日：2023-07-04

申请号：US17085272

申请日：2020-10-30

Applicant: QUALCOMM Incorporated

Inventor： Thomas Edwin Frisinger , Richard Hammerstone , Andrew Evan Gruber , Gang Zhong , Yun Du , Jonnala Gadda Nagendra Kumar

IPC: G06T15/00 , G06F9/30 , G06T1/20 , G06T1/60 , G06T15/80

CPC classification number: G06T15/005 , G06F9/30101 , G06F9/30123 , G06T1/20 , G06T1/60 , G06T15/80

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for fast incremental shared constants. In aspects, a CPU may determine/update shared constant data for a first draw call of a plurality of draw calls. The shared constant data, which may correspond to at least one shader, may be updated based on a draw call update for the first draw call. The CPU may communicate the updated shared constant data for the first draw call to a GPU. The GPU may receive, in at least one register, the updated shared constant data from the CPU and configure the at least one register based on the updated shared constant data corresponding to the draw call update of the first draw call of the plurality of draw calls.

6.

发明申请
STEREOSCOPIC VIEW PROCESSING 审中-公开
Title translation: 立体视野加工

公开(公告)号：US20160350892A1

公开(公告)日：2016-12-01

申请号：US14855962

申请日：2015-09-16

Applicant: QUALCOMM Incorporated

Inventor： Gang Zhong , Vineet Goel , Young In Yeo , Juraj Obert

IPC: G06T1/20 , G06T15/80 , H04N13/04 , G06T15/00 , H04N13/02 , H04N13/00

CPC classification number: G06T1/20 , G06T15/005 , G06T15/80 , H04N13/106 , H04N13/139 , H04N13/261 , H04N13/275 , H04N13/30

Abstract: Techniques are described for stereoscopic view generation. A graphics processing unit (GPU) may combine attribute information for two or more corresponding vertices of corresponding primitives in different views. The GPU may process the combined attributed information to generate graphics data for the stereoscopic view.

Abstract translation: 描述了用于立体视图生成的技术。图形处理单元（GPU）可以组合不同视图中对应图元的两个或多个对应顶点的属性信息。 GPU可以处理组合的归属信息以产生用于立体视图的图形数据。

7.

发明申请
SHADER PROGRAM EXECUTION TECHNIQUES FOR USE IN GRAPHICS PROCESSING 有权
Title translation: 用于图形处理的较差程序执行技术

公开(公告)号：US20160055667A1

公开(公告)日：2016-02-25

申请号：US14466554

申请日：2014-08-22

Applicant: QUALCOMM Incorporated

Inventor： Vineet Goel , Donghyun Kim , Gang Zhong

IPC: G06T15/83 , G06T15/00

CPC classification number: G06T15/83 , G06T15/005

Abstract: This disclosure describes techniques for executing shader programs in a graphics processing unit (GPU). In some examples, the techniques for executing shader programs may include executing, with a shader unit of a graphics processor, a shader program that performs vertex shader processing and that generates multiple output vertices for each input vertex that is received by the shader program. In further examples, the techniques for executing shader programs may include executing a merged vertex/geometry shader program using a non-replicated mode of execution. The non-replicated mode of execution may involve assigning each of a plurality of primitives to one merged vertex/geometry shader program instance per primitive and causing each of the instances to output a plurality of vertices. In additional examples, the techniques for executing shader programs may include techniques for selecting one of a non-replicated mode and a replicated mode for executing a merged vertex/geometry shader program.

Abstract translation: 本公开描述了用于在图形处理单元（GPU）中执行着色器程序的技术。在一些示例中，用于执行着色器程序的技术可以包括使用图形处理器的着色器单元执行着色器程序，该着色器程序执行顶点着色器处理，并且为着色器程序接收的每个输入顶点生成多个输出顶点。在另外的示例中，用于执行着色器程序的技术可以包括使用非复制的执行模式来执行合并的顶点/几何着色器程序。非复制的执行模式可以包括将多个基元中的每一个分配给每个基元的一个合并的顶点/几何着色器程序实例，并使每个实例输出多个顶点。在附加示例中，用于执行着色器程序的技术可以包括用于选择非复制模式和用于执行合并顶点/几何着色器程序的复制模式之一的技术。

8.

发明授权
Dynamic wave pairing 有权

公开(公告)号：US11954758B2

公开(公告)日：2024-04-09

申请号：US17652478

申请日：2022-02-24

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Andrew Evan Gruber , Zilin Ying , Chunling Hu , Baoguang Yang , Yang Xia , Gang Zhong , Chun Yu , Eric Demers

IPC: G06T1/20 , G06F9/50

CPC classification number: G06T1/20 , G06F9/505

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.

9.

发明授权
Methods and apparatus to perform matrix multiplication in a streaming processor 有权

公开(公告)号：US11829439B2

公开(公告)日：2023-11-28

申请号：US17137226

申请日：2020-12-29

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Gang Zhong , Fei Wei , Yibin Zhang , Jing Han , Hongjiang Shang , Elina Kamenetskaya , Minjie Huang , Alexei Vladimirovich Bourd , Chun Yu , Andrew Evan Gruber , Eric Demers

IPC: G06F17/16 , G06F7/57

CPC classification number: G06F17/16 , G06F7/57

Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.

10.

发明授权
Deferred GPR allocation for texture/load instruction block 有权

公开(公告)号：US11204765B1

公开(公告)日：2021-12-21

申请号：US17003600

申请日：2020-08-26

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Fei Wei , Gang Zhong , Minjie Huang , Jian Jiang , Zilin Ying , Baoguang Yang , Yang Xia , Jing Han , Liangxiao Hu , Chihong Zhang , Chun Yu , Andrew Evan Gruber , Eric Demers

IPC: G06F9/30 , G06F9/38 , G06T1/20 , G06F9/50

Abstract: A graphics processing unit (GPU) utilizes block general purpose registers (bGPRs) to load multiple waves of samples for an instruction group into a processing pipeline and receive processed samples from the pipeline. The GPU acquires a credit for the bGPR for execution of the instruction group for a first wave using a persistent GPR and the bGPR. The GPU refunds the credit upon loading the first wave into the pipeline. The GPU executes a subsequent wave for the instruction group to load samples to the pipeline when at least one credit is available and the pipeline is processing the first wave. The GPU stores an indication of each wave that has been loaded into the pipeline in a queue. The GPU returns samples for a next wave in the queue from the pipeline to the bGPR for further processing when the physical slot of the bGPR is available.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification