Patent search ap:("Huawei Technologies Co., Ltd.") AND inv:"Yufei Zhang"

1.

Granted invention patent
Inter-warp sharing of general purpose register data in GPU (in force)

Publication (announcement) number: US11908061B2

Publication (announcement) date: 2024-02-20

Application number: US17463835

Application date: 2021-09-01

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhou Hong, Yufei Zhang

IPC: G06T15/00, G06F9/30, G06T1/60

CPC classification number: G06T15/005, G06F9/30101, G06T1/60

Abstract: Methodologies and architectures are provided for inter-thread sharing of data in a general purpose register (GPR) of a multiprocessor apparatus. The data sharing is performed by a graphics processing unit (GPU) having at least one processing cluster including a plurality of processing cores (PCs) configured for parallel operation. Each PC of a cluster is configured to utilize a dedicated portion of the GPR. The GPU further includes a shared memory for the cluster, and a memory read/write hub coupled to the GPR and shared memory, the hub including a crossbar switch. A PC executes a move data instruction, including operands referencing a destination portion of the GPR and a source portion assigned to the PC, to retrieve data from the source portion. The memory read/write hub writes the data, via the crossbar switch, to the destination portion of the GPR without first writing the data to the shared memory.
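The data path in this abstract can be pictured with a toy software model. The sketch below is only an illustration under stated assumptions: the class name `PartitionedGPR`, its methods, and the equal-sized per-core portions are all hypothetical and are not taken from the patent. It shows a move that copies a value from a portion owned by one core directly into a portion owned by another, with no intermediate shared-memory buffer.

```python
class PartitionedGPR:
    """Toy model (hypothetical): one register file split into equal,
    dedicated portions, one portion per processing core."""

    def __init__(self, num_cores: int, regs_per_core: int) -> None:
        self.num_cores = num_cores
        self.regs_per_core = regs_per_core
        self.regs = [0] * (num_cores * regs_per_core)

    def _index(self, core: int, offset: int) -> int:
        # Translate (core, offset) into a flat index, rejecting out-of-range operands.
        if not 0 <= core < self.num_cores:
            raise IndexError("core out of range")
        if not 0 <= offset < self.regs_per_core:
            raise IndexError("register offset out of range")
        return core * self.regs_per_core + offset

    def write(self, core: int, offset: int, value: int) -> None:
        self.regs[self._index(core, offset)] = value

    def read(self, core: int, offset: int) -> int:
        return self.regs[self._index(core, offset)]

    def move(self, src_core: int, src_offset: int,
             dst_core: int, dst_offset: int) -> None:
        # Stand-in for the move-data instruction: read the source portion and
        # write the destination portion directly, without staging the value
        # in any shared-memory structure.
        value = self.regs[self._index(src_core, src_offset)]
        self.regs[self._index(dst_core, dst_offset)] = value


gpr = PartitionedGPR(num_cores=4, regs_per_core=8)
gpr.write(0, 3, 42)    # core 0 fills a register in its own portion
gpr.move(0, 3, 2, 5)   # the value lands in core 2's portion
```

The model deliberately has no shared-memory object at all, which mirrors the abstract's point that the write reaches the destination portion without first passing through shared memory. The crossbar switch and any hardware timing are not modelled.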

2.

Invention application
Filter Independent L1 Mapping Of Convolution Data Into General Purpose Register (in force)

Publication (announcement) number: US20210272232A1

Publication (announcement) date: 2021-09-02

Application number: US17326913

Application date: 2021-05-21

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhou Hong, Yufei Zhang

IPC: G06T1/60, G06T3/00, G06T1/20

Abstract: The disclosed technology relates to graphics processing units (GPU). In one aspect, a GPU includes a general purpose register (GPR) including registers, an arithmetic logic unit (ALU) reading pixels of an image independently of a shared memory, and a level 1 (L1) cache storing pixels to implement a pixel mapping that maps the pixels read from the L1 cache into the registers of the GPR. The pixel mapping includes separating pixels of an image into three regions, with each region including a set of pixels. A first and second set of the pixels are loaded into registers corresponding to two of the three regions horizontally, and a third set of the pixels are loaded into registers corresponding to the third of the three regions vertically. Each of the registers in the first, second, and third registers are loaded as a contiguous ordered number of registers in the GPR.

3.

Granted invention patent
Filter independent L1 mapping of convolution data into general purpose register (in force)

Publication (announcement) number: US12026801B2

Publication (announcement) date: 2024-07-02

Application number: US17326913

Application date: 2021-05-21

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhou Hong, Yufei Zhang

IPC: G06T1/60, G06F17/15, G06T1/20, G06T3/18, G06T15/00

CPC classification number: G06T1/60, G06F17/153, G06T1/20, G06T3/18, G06T15/005

Abstract: The disclosed technology relates to graphics processing units (GPU). In one aspect, a GPU includes a general purpose register (GPR) including registers, an arithmetic logic unit (ALU) reading pixels of an image independently of a shared memory, and a level 1 (L1) cache storing pixels to implement a pixel mapping that maps the pixels read from the L1 cache into the registers of the GPR. The pixel mapping includes separating pixels of an image into three regions, with each region including a set of pixels. A first and second set of the pixels are loaded into registers corresponding to two of the three regions horizontally, and a third set of the pixels are loaded into registers corresponding to the third of the three regions vertically. Each of the registers in the first, second, and third registers are loaded as a contiguous ordered number of registers in the GPR.

4.

Granted invention patent
Storing complex data in warp GPRs (in force)

Publication (announcement) number: US12190109B2

Publication (announcement) date: 2025-01-07

Application number: US17486434

Application date: 2021-09-27

Applicant: Huawei Technologies Co., Ltd.

Inventor: Lin Chen, Zhou Hong, Yufei Zhang

IPC: G06F9/38, G06F9/30

Abstract: A method of storing data in general purpose registers (GPRs) includes packing a tile of data items into GPRs, where the tile includes multiple channels. The tile of data items is read from memory. At least two channels of the data are stored in a first GPR, and at least two additional channels are stored in a second GPR. Auxiliary data is loaded into a third GPR. The auxiliary data and the tile data can be used together for performing convolution operations.
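The packing step in this abstract can be sketched for a single data item as seen by one thread. Everything concrete here is an assumption made for illustration and is not stated in the abstract: four channels per item, unsigned 16-bit channel values, 32-bit registers, and the hypothetical helper names `pack_pair`, `unpack_pair`, and `pack_item`.

```python
def pack_pair(low: int, high: int) -> int:
    """Pack two unsigned 16-bit values into one 32-bit register word
    (assumed layout: `low` in bits 0-15, `high` in bits 16-31)."""
    if not (0 <= low <= 0xFFFF and 0 <= high <= 0xFFFF):
        raise ValueError("channel values must fit in 16 bits")
    return (high << 16) | low


def unpack_pair(word: int) -> tuple[int, int]:
    """Inverse of pack_pair: return (low, high)."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


def pack_item(channels: tuple[int, int, int, int], aux: int) -> list[int]:
    """Return three register words for one data item:
    word 0 holds channels 0 and 1, word 1 holds channels 2 and 3,
    and word 2 holds the auxiliary value (masked to 32 bits)."""
    c0, c1, c2, c3 = channels
    return [pack_pair(c0, c1), pack_pair(c2, c3), aux & 0xFFFFFFFF]


registers = pack_item((1, 2, 3, 4), aux=7)
```

Under these assumptions, two channels share the first register, two more share the second, and the auxiliary value sits in the third, so a convolution step can read tile data and auxiliary data from adjacent registers. How the patent actually lays out bits and which data counts as auxiliary are not specified in the abstract.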

5.

Granted invention patent
Loading apparatus and method for convolution with stride or dilation of 2 (in force)

Publication (announcement) number: US11915338B2

Publication (announcement) date: 2024-02-27

Application number: US17319301

Application date: 2021-05-13

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhou Hong, Yufei Zhang

IPC: G06T1/20, G06F9/46, G06T1/60, G06T15/00

CPC classification number: G06T1/20, G06F9/462, G06T1/60

Abstract: The disclosed technology generally relates to a graphics processing unit (GPU). In one aspect, a GPU includes a general purpose register (GPR) having registers, an arithmetic logic unit (ALU) configured to read pixels of an image independently of a shared memory, and a level 1 (L1) cache storing the pixels read by the ALU. The ALU can implement pixel mapping by fetching a quad of pixels, which includes pixels of first, second, third, and fourth pixel types, from the L1 cache, grouping the pixels of the different pixel types of the quad into four groups based on pixel type, and, for each group, separating the pixels included in the group into three regions that each have a set of pixels. The pixels for each group can then be loaded into the registers corresponding to the three regions.
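The grouping step in this abstract can be sketched as follows. The abstract does not define the four pixel types; this sketch assumes that a pixel's type is its position inside a 2x2 quad, that is, the parity of its row and column, which is a natural fit for a stride or dilation of 2. The function name `group_by_quad_position` is hypothetical, and the later split of each group into three regions is not modelled.

```python
def group_by_quad_position(image: list[list[int]]) -> dict[tuple[int, int], list[int]]:
    """Split an image into four groups keyed by (row parity, column parity).

    Assumption: the "pixel type" is the pixel's position within its 2x2 quad.
    Pixels are appended in row-major order.
    """
    groups: dict[tuple[int, int], list[int]] = {
        (0, 0): [], (0, 1): [], (1, 0): [], (1, 1): [],
    }
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            groups[(y % 2, x % 2)].append(pixel)
    return groups


# A 4x4 image whose pixel values are 0..15 in row-major order.
image = [[4 * y + x for x in range(4)] for y in range(4)]
groups = group_by_quad_position(image)
# groups[(0, 0)] == [0, 2, 8, 10]
```

With this grouping, every pixel a stride-2 filter tap touches sits in the same group, so each group can be treated like a dense image of half the width and half the height.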

6.

Invention application
Storing Complex Data in Warp GPRs (in force)

Publication (announcement) number: US20220012053A1

Publication (announcement) date: 2022-01-13

Application number: US17486434

Application date: 2021-09-27

Applicant: Huawei Technologies Co., Ltd.

Inventor: Lin Chen, Zhou Hong, Yufei Zhang

IPC: G06F9/30, G06F9/38

Abstract: A method of storing data in general purpose registers (GPRs) includes packing a tile of data items into GPRs, where the tile includes multiple channels. The tile of data items is read from memory. At least two channels of the data are stored in a first GPR, and at least two additional channels are stored in a second GPR. Auxiliary data is loaded into a third GPR. The auxiliary data and the tile data can be used together for performing convolution operations.

7.

Invention application
LOADING APPARATUS AND METHOD FOR CONVOLUTION WITH STRIDE OR DILATION OF 2 (in force)

Publication (announcement) number: US20210264560A1

Publication (announcement) date: 2021-08-26

Application number: US17319301

Application date: 2021-05-13

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhou Hong, Yufei Zhang

IPC: G06T1/20, G06T1/60, G06F9/46

Abstract: The disclosed technology generally relates to a graphics processing unit (GPU). In one aspect, a GPU includes a general purpose register (GPR) having registers, an arithmetic logic unit (ALU) configured to read pixels of an image independently of a shared memory, and a level 1 (L1) cache storing the pixels read by the ALU. The ALU can implement pixel mapping by fetching a quad of pixels, which includes pixels of first, second, third, and fourth pixel types, from the L1 cache, grouping the pixels of the different pixel types of the quad into four groups based on pixel type, and, for each group, separating the pixels included in the group into three regions that each have a set of pixels. The pixels for each group can then be loaded into the registers corresponding to the three regions.
