-
Publication No.: US11960421B2
Publication date: 2024-04-16
Application No.: US17216476
Application date: 2021-03-29
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Baoqing Liu, Hu Liu, Qinglong Chen
Abstract: The present disclosure discloses example operation accelerators and compression methods. One example operation accelerator performs operations, including storing, in a first buffer, first input data. In a second buffer, weight data can be stored. A computation result is obtained by performing matrix multiplication on the first input data and the weight data by an operation circuit connected to the input buffer and the weight buffer. The computation result is compressed by a compression module to obtain compressed data. The compressed data can be stored into a memory outside the operation accelerator by a direct memory access controller (DMAC) connected to the compression module.
-
Publication No.: US11934481B2
Publication date: 2024-03-19
Application No.: US17725492
Application date: 2022-04-20
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Hu Liu, Heng Liao, Jiajin Tu, Honghui Yuan, Hou Fun Lam, Fan Zhu
IPC: G06F17/16
CPC classification number: G06F17/16
Abstract: Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a bus; and the controller is configured to control, according to a preset program or instruction, a first matrix and a second matrix to be divided into blocks, and control the operation circuit to perform a multiplication operation on corresponding blocks in the first memory and the second memory based on block division results of the controller. The matrix multiplier may be configured to perform a multiplication operation on two matrices.
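The block division described in this abstract can be illustrated in software. The following is a minimal sketch only, assuming square tiles of a hypothetical size `block`; the function name and the loop order are illustrative and are not taken from the patent, which describes a hardware controller and operation circuit rather than a Python routine.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (M x K) by b (K x N) one tile at a time.

    `block` is a hypothetical tile edge length; edge tiles may be smaller.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # Outer three loops walk over tiles; inner three multiply one tile pair.
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            for k0 in range(0, k, block):
                for i in range(i0, min(i0 + block, m)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = 0
                        for kk in range(k0, min(k0 + block, k)):
                            acc += a[i][kk] * b[kk][j]
                        # Partial products from each tile pair accumulate in c.
                        c[i][j] += acc
    return c
```

Each tile pair contributes a partial product that is accumulated into the output tile, which is what allows the two operands to be staged block by block in separate memories.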
-
Publication No.: US11334648B2
Publication date: 2022-05-17
Application No.: US16915915
Application date: 2020-06-29
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Hu Liu, Heng Liao, Jiajin Tu, Honghui Yuan, Hou Fun Lam, Fan Zhu
IPC: G06F17/16
Abstract: Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a bus; and the controller is configured to control, according to a preset program or instruction, a first matrix and a second matrix to be divided into blocks, and control the operation circuit to perform a multiplication operation on corresponding blocks in the first memory and the second memory based on block division results of the controller. The matrix multiplier may be configured to perform a multiplication operation on two matrices.
-
Publication No.: US20220114235A1
Publication date: 2022-04-14
Application No.: US17560472
Application date: 2021-12-23
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Zhenjiang Dong, CHIO IN IEONG, Hu Liu, Hai Chen
Abstract: A matrix processing method performed by a graphics processing unit (GPU) includes: determining a plurality of non-zero elements in a to-be-processed matrix at a processor in the GPU; generating a distribution matrix of the to-be-processed matrix at the processor, where the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-processed matrix; obtaining a target matrix from another matrix by using the distribution matrix at a logic circuit in the processor, where the target matrix comprises a plurality of target elements from the another matrix; and performing matrix processing on the plurality of non-zero elements and the target matrix to obtain an operation result at the processor.
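One possible reading of this abstract is a sparse-times-dense multiplication in which the distribution matrix acts as a mask. The sketch below is an illustration under that assumption: the 0/1 mask encoding, the choice of multiplication as the "matrix processing", and both function names are hypothetical and not confirmed by the patent.

```python
def distribution_matrix(a):
    """Return a 0/1 mask marking the positions of non-zero elements in a."""
    return [[1 if v != 0 else 0 for v in row] for row in a]


def sparse_matmul(a, b):
    """Multiply a by b, touching only the rows of b selected by the mask."""
    mask = distribution_matrix(a)
    n = len(b[0])
    out = []
    for row, flags in zip(a, mask):
        # Non-zero elements of this row of the to-be-processed matrix.
        nonzeros = [v for v, f in zip(row, flags) if f]
        # Target elements: rows of the other matrix picked out by the mask.
        targets = [b[k] for k, f in enumerate(flags) if f]
        acc = [0] * n
        for v, brow in zip(nonzeros, targets):
            for j in range(n):
                acc[j] += v * brow[j]
        out.append(acc)
    return out
```

Under this reading, zero elements never enter the multiply-accumulate loop, so the work scales with the number of non-zero elements rather than with the full matrix size.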
-
Publication No.: US10452604B2
Publication date: 2019-10-22
Application No.: US15454014
Application date: 2017-03-09
Applicant: Huawei Technologies Co., Ltd.
Inventor: Jun Liang, Hu Liu, Zhiqiang Zhang
IPC: G06F15/167, G06F13/16
Abstract: Embodiments of the present disclosure provide a method and bus for accessing a dynamic random access memory (DRAM). The embodiments include receiving an access instruction, where the access instruction includes an access address, the access address includes a physical address, and a first field and a second field that are additionally set, the first field is used to indicate an interleaving mode, the interleaving mode indicates a manner of selecting an access channel, the second field is used to indicate an interleaving granularity, and the interleaving granularity indicates a capacity of an address space corresponding to the access channel; determining, according to the first field and the second field, the access channel and an address corresponding to the access channel; and accessing the DRAM according to the access channel and the address corresponding to the access channel.
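The channel selection described in this abstract can be sketched as an address decode. This is a rough illustration only: the field encodings in `MODE_CHANNELS` and `GRANULARITY_BYTES`, the round-robin mapping, and the treatment of the granularity as the size of each contiguous chunk assigned to a channel are all assumptions made for the example, not details from the patent.

```python
# Hypothetical encodings for the two extra fields of the access address.
MODE_CHANNELS = {0: 1, 1: 2, 2: 4}             # first field -> channel count
GRANULARITY_BYTES = {0: 64, 1: 256, 2: 4096}   # second field -> chunk size


def decode_access(phys_addr, mode, gran_code):
    """Return (channel, address within that channel) for one access."""
    channels = MODE_CHANNELS[mode]
    gran = GRANULARITY_BYTES[gran_code]
    chunk = phys_addr // gran           # index of the granularity-sized chunk
    channel = chunk % channels          # chunks rotate across the channels
    # Chunks owned by one channel are packed contiguously in its local space.
    local = (chunk // channels) * gran + phys_addr % gran
    return channel, local
```

For example, with two channels and a 64-byte granularity, physical address 320 falls in chunk 5, so it maps to channel 1 at local address 128. With mode 0 (a single channel) the address passes through unchanged.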
-
Publication No.: US20250086020A1
Publication date: 2025-03-13
Application No.: US18961393
Application date: 2024-11-26
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Bo Fang, Hu Liu, Hou Fun Lam, Zipei Su
IPC: G06F9/50
Abstract: A multi-core processor and a related inter-core communication method are provided. The multi-core processor includes an inter-core communication module and a plurality of processor cores. The plurality of processor cores include N first processor cores. Each of the N first processor cores is configured to: execute a first task to generate operation information, where the operation information includes a completion identifier of the first task, and one or more of a processor core identifier of the first processor core, an inter-core synchronization mode, or association information of the first task; and send the operation information to the inter-core communication module. The inter-core communication module is configured to: determine M second processor cores from the plurality of processor cores based on N pieces of operation information, and separately send the completion identifier to the M second processor cores. Inter-core communication can be performed more efficiently and cost-effectively.
-
Publication No.: US11823303B2
Publication date: 2023-11-21
Application No.: US16932768
Application date: 2020-07-19
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Luping Cui, Jiajin Tu, Hu Liu, Honghui Yuan, Heng Liao, Hou Fun Lam, Bing Li
CPC classification number: G06T1/20, G06F17/16, G06N3/02, G06V10/454, G06V10/82
Abstract: A data processing method and apparatus are disclosed. In various embodiments, R groups of proposal region sequences are obtained. Each group of proposal region sequence includes a plurality of proposal regions. In those embodiments, a VRPAC instruction is invoked to calculate an area of each proposal region in each group of proposal region sequence. For a jth group of proposal region sequence in the R groups of proposal region sequences, a VIOU instruction and a VAADD instruction are invoked to determine j suppression matrices of the jth group of proposal region sequence and determine a suppression vector of the jth group of proposal region sequence based on the j suppression matrices. In those embodiments, an unsuppressed proposal region is determined based on a suppression vector of each group of proposal region sequence.
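The area, overlap, and suppression steps in this abstract resemble non-maximum suppression. The sketch below shows a conventional scalar version for a single group of proposal regions, assuming boxes given as `(x1, y1, x2, y2)` tuples already sorted by descending score and a hypothetical overlap threshold of 0.5. It does not model the VRPAC, VIOU, or VAADD vector instructions, and it collapses the suppression matrices into a single pass.

```python
def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(a, b):
    """Intersection over union of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def suppression_vector(boxes, threshold=0.5):
    """Return one flag per box; True means the box is suppressed.

    Assumes boxes are sorted by descending score, so an earlier
    unsuppressed box suppresses later boxes that overlap it too much.
    """
    suppressed = [False] * len(boxes)
    for i in range(len(boxes)):
        if suppressed[i]:
            continue
        for j in range(i + 1, len(boxes)):
            if not suppressed[j] and iou(boxes[i], boxes[j]) > threshold:
                suppressed[j] = True
    return suppressed
```

The unsuppressed proposal regions are then the boxes whose flag is `False`.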
-
Publication No.: US11734386B2
Publication date: 2023-08-22
Application No.: US17560472
Application date: 2021-12-23
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Zhenjiang Dong, Chio In Ieong, Hu Liu, Hai Chen
Abstract: A matrix processing method performed by a graphics processing unit (GPU) includes: determining a plurality of non-zero elements in a to-be-processed matrix at a processor in the GPU; generating a distribution matrix of the to-be-processed matrix at the processor, where the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-processed matrix; obtaining a target matrix from another matrix by using the distribution matrix at a logic circuit in the processor, where the target matrix comprises a plurality of target elements from the another matrix; and performing matrix processing on the plurality of non-zero elements and the target matrix to obtain an operation result at the processor.
-
Publication No.: US11321423B2
Publication date: 2022-05-03
Application No.: US16736427
Application date: 2020-01-07
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Abstract: The present application is in the field of data calculation technologies, and discloses an operation accelerator, to reduce time for performing a multiplication operation on two N*N matrices. The operation accelerator includes: a first memory, a second memory, an operation circuit, and a controller. The operation circuit performs data communication with the first memory and the second memory by using a bus. The operation circuit is configured to: extract matrix data from the first memory and the second memory, and perform a multiplication operation. The controller is configured to control, according to a preset program or instruction, the operation circuit to complete the multiplication operation. The operation accelerator is configured to perform a multiplication operation on two matrices.
-
Publication No.: US20210224125A1
Publication date: 2021-07-22
Application No.: US17224643
Application date: 2021-04-07
Applicant: Huawei Technologies Co., Ltd.
Abstract: An operation accelerator, a processing method, and a related device, the operation accelerator including a first memory configured to store an input dataset, a matrix converter configured to perform reading M row vectors from the input dataset, generating a first instruction, and sending the M row vectors and the first instruction to a second memory configured to perform, according to the first instruction, preprocessing on the M row vectors to obtain n row vectors, and storing the n row vectors, where the n row vectors include the M row vectors and (n-M) padding row vectors, the n row vectors are N row vectors in a target matrix, and a storage sequence of the n row vectors in the second memory is consistent with a sequence of the N row vectors in the target matrix.
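The padding step in this abstract can be illustrated with a small sketch. It assumes, purely for the example, that the first instruction reduces to two hypothetical counts, `pad_before` and `pad_after`, and that padding rows are filled with zeros; the patent does not specify the instruction format here.

```python
def pad_rows(rows, pad_before, pad_after, fill=0):
    """Expand M row vectors to n = M + pad_before + pad_after row vectors.

    The original rows keep their relative order, so the stored sequence
    matches the order the rows would have in the target matrix.
    """
    width = len(rows[0])
    top = [[fill] * width for _ in range(pad_before)]
    bottom = [[fill] * width for _ in range(pad_after)]
    return top + [list(r) for r in rows] + bottom
```

With M = 2 input rows, one padding row before and two after, the result has n = 5 rows, of which (n - M) = 3 are padding.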