-
Publication No.: US20200311859A1
Publication Date: 2020-10-01
Application No.: US16368782
Filing Date: 2019-03-28
Applicant: QUALCOMM Incorporated
Inventor: Yun DU, Nigel POOLE, Zilin YING, Ling Feng HUANG, Donghyun KIM, Chun YU, Tzun-Wei LEE, Xuefeng TANG, Shambhoo KHANDELWAL, Hongjiang SHANG, Elina KAMENETSKAYA, Zhu LIANG, Cary ROBINS
Abstract: The present disclosure relates to methods and apparatus for graphics processing. In some aspects, multiple processing units can be in a graphics processing pipeline of a GPU. The apparatus can also group the multiple processing units into one or more processing unit clusters. In some aspects, each of the one or more processing unit clusters can correspond to one or more context registers. Additionally, the apparatus can determine one or more context states of the one or more context registers in each of the one or more processing unit clusters. Also, the apparatus can implement one or more execution counters corresponding to at least one of the one or more processing unit clusters in the graphics processing pipeline, where each of the one or more execution counters includes an execution value.
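The grouping described in this abstract can be pictured with a minimal sketch. Everything named here is hypothetical and chosen for illustration only: the `Cluster` class, the `group_units` and `context_states` helpers, the fixed cluster size, and the initial execution value of 0. The abstract does not say how units are assigned to clusters or how an execution value is updated, so the sketch only models the data layout.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical processing unit cluster (names are illustrative)."""
    units: list                                             # processing units in this cluster
    context_registers: dict = field(default_factory=dict)   # register name -> context state
    execution_value: int = 0                                # value held by the execution counter


def group_units(units, cluster_size):
    """Group pipeline units into clusters of at most cluster_size units.

    Fixed-size chunking is an assumption; the abstract does not state
    how the grouping is chosen.
    """
    return [Cluster(units=units[i:i + cluster_size])
            for i in range(0, len(units), cluster_size)]


def context_states(cluster):
    """Return a snapshot of the context states of a cluster's context registers."""
    return dict(cluster.context_registers)
```

For example, five units grouped with `cluster_size=2` yield three clusters, each with its own context registers and its own execution value.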
-
Publication No.: US20240037183A1
Publication Date: 2024-02-01
Application No.: US18487918
Filing Date: 2023-10-16
Applicant: QUALCOMM Incorporated
Inventor: Yun DU, Gang ZHONG, Fei WEI, Yibin ZHANG, Jing HAN, Hongjiang SHANG, Elina KAMENETSKAYA, Minjie HUANG, Alexei Vladimirovich BOURD, Chun YU, Andrew Evan GRUBER, Eric DEMERS
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
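The sequence in this abstract (two loads, a multiply, a store) can be sketched in software as follows. This is only an illustration under stated assumptions, not the patented hardware design: the dictionaries standing in for the first memory, the second memory, and the general purpose register, and the function names `load`, `matmul`, and `run`, are all hypothetical.

```python
def load(first_memory, key):
    """Hypothetical load instruction: copy one data set out of the first memory."""
    return [row[:] for row in first_memory[key]]


def matmul(inputs, weights):
    """Plain matrix multiplication, standing in for the ALU component."""
    rows, inner, cols = len(inputs), len(weights), len(weights[0])
    return [[sum(inputs[i][k] * weights[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


def run(first_memory):
    """Load input and weight data, multiply them, and keep the result in a register."""
    second_memory = {}
    # First load instruction: input data, first memory -> second memory.
    second_memory["input"] = load(first_memory, "input")
    # Second load instruction: weight data, first memory -> second memory.
    second_memory["weight"] = load(first_memory, "weight")
    # The output matrix is stored in a stand-in for a general purpose register.
    gpr = {"output": matmul(second_memory["input"], second_memory["weight"])}
    return gpr["output"]
```

Calling `run({"input": [[1, 2], [3, 4]], "weight": [[5, 6], [7, 8]]})` returns `[[19, 22], [43, 50]]`.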
-
Publication No.: US20210200836A1
Publication Date: 2021-07-01
Application No.: US17137226
Filing Date: 2020-12-29
Applicant: QUALCOMM Incorporated
Inventor: Yun DU, Gang ZHONG, Fei WEI, Yibin ZHANG, Jing HAN, Hongjiang SHANG, Elina KAMENETSKAYA, Minjie HUANG, Alexei Vladimirovich BOURD, Chun YU, Andrew Evan GRUBER, Eric DEMERS
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
-
Publication No.: US20200312006A1
Publication Date: 2020-10-01
Application No.: US16364829
Filing Date: 2019-03-26
Applicant: QUALCOMM Incorporated
Inventor: Yun DU, Andrew Evan GRUBER, Chun YU, Chihong ZHANG, Hongjiang SHANG, Zilin YING, Fei WEI
Abstract: Example techniques are described for generating graphics content. The techniques include: obtaining texture operation instructions corresponding to a texture operation; in response to determining at least one of insufficient general purpose register space being available for the texture operation or insufficient wave slots being available for the texture operation, generating an indication that the texture operation corresponds to a deferred wave; executing the texture operation; sending, to a texture processor, initial texture sample instructions corresponding to the texture operation that was executed; and receiving texture mapped data corresponding to the initial texture sample instructions.
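The deferred-wave decision in this abstract can be sketched as a simple resource check. This is a rough illustration only: the function name `run_texture_operation`, its parameters, the "zero free wave slots" test, and the callable standing in for the texture processor are all assumptions. The abstract does not describe how a deferred wave is handled later, so the sketch only records the indication.

```python
def run_texture_operation(instructions, gpr_needed, gpr_free,
                          wave_slots_free, texture_processor):
    """Run one texture operation and report whether it was marked as a deferred wave.

    texture_processor is a hypothetical callable that takes a list of
    sample instructions and returns texture mapped data.
    """
    # Mark as deferred when GPR space or wave slots are insufficient.
    deferred = gpr_needed > gpr_free or wave_slots_free == 0

    # Stand-in for executing the texture operation: it yields the
    # initial texture sample instructions to send onward.
    sample_instructions = list(instructions)

    # Send the sample instructions and receive texture mapped data.
    texture_data = texture_processor(sample_instructions)
    return deferred, texture_data
```

With `gpr_needed=8`, `gpr_free=4`, and one free wave slot, the operation is flagged as deferred because the register space is insufficient, while the texture processor still receives the sample instructions.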