-
Publication number: US20230085718A1
Publication date: 2023-03-23
Application number: US18070054
Application date: 2022-11-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Honghui YUAN , Shucheng LI , Lejin XIONG , Chernega NIKITA
IPC: G06N3/063
Abstract: A neural network scheduling method and apparatus are provided. One example method includes: determining a first batch size corresponding to each layer of one or more layers in a neural network; forming, through grouping based on the first batch size, the neural network into a neural network including at least one first layer group; forming, through grouping based on a grouping result of the first layer group, the neural network into a neural network including at least one second layer group; and scheduling the neural network based on a grouping result of the second layer group.
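To make the two-stage grouping concrete, here is a minimal Python sketch of how layers might be grouped first by a shared first batch size and then merged into second layer groups. The function names, the consecutive-equal-batch-size criterion, and the layer-count budget are illustrative assumptions, not the patented scheduling algorithm.

```python
# Hypothetical sketch of the two-stage layer grouping described in the
# abstract; names and grouping criteria are illustrative assumptions,
# not the patented algorithm.

def group_layers(batch_sizes):
    """Stage 1: group consecutive layers that share a first batch size."""
    first_groups = []
    current = [0]
    for i in range(1, len(batch_sizes)):
        if batch_sizes[i] == batch_sizes[current[0]]:
            current.append(i)
        else:
            first_groups.append(current)
            current = [i]
    first_groups.append(current)
    return first_groups

def merge_groups(first_groups, max_layers=4):
    """Stage 2: merge adjacent first layer groups into second layer
    groups, here capped by a simple layer-count budget."""
    second_groups = []
    current = []
    for g in first_groups:
        if len(current) + len(g) <= max_layers:
            current.extend(g)
        else:
            second_groups.append(current)
            current = list(g)
    if current:
        second_groups.append(current)
    return second_groups

# Example: per-layer first batch sizes for a 6-layer network.
sizes = [4, 4, 2, 2, 2, 1]
stage1 = group_layers(sizes)   # [[0, 1], [2, 3, 4], [5]]
stage2 = merge_groups(stage1)  # [[0, 1], [2, 3, 4, 5]]
print(stage1, stage2)
```

The scheduler described in the abstract would then operate on the second layer groups rather than on individual layers.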
-
Publication number: US20230236891A1
Publication date: 2023-07-27
Application number: US18191134
Application date: 2023-03-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Chen XIN , Honghui YUAN , Chun Hang LEE
CPC classification number: G06F9/5027 , G06F17/144 , G06F7/523 , G06F7/78
Abstract: A neural network accelerator is provided, including: a preprocessing module (301), configured to perform a first forward Winograd transform on a target matrix corresponding to an input feature map, to obtain a transformed target matrix, where the preprocessing module (301) is further configured to perform a second forward Winograd transform on a convolution kernel, to obtain a transformed convolution kernel; a matrix operation module (302), configured to perform a matrix multiplication operation on a first matrix and a second matrix, to obtain a multiplication result, where the first matrix is constructed based on the transformed target matrix, and the second matrix is constructed based on the transformed convolution kernel; and a vector operation module (303), configured to perform an inverse Winograd transform on the multiplication result, to obtain an output feature map.
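The pipeline in the abstract (forward-transform the input tile, forward-transform the kernel, multiply, inverse-transform) is the classic Winograd convolution. Below is a minimal 1-D F(2,3) sketch in NumPy using the standard F(2,3) transform matrices; the function name and tile shapes are illustrative, and the elementwise product stands in for the accelerator's batched matrix multiplication.

```python
# Minimal 1-D Winograd F(2,3) sketch mirroring the abstract's pipeline:
# forward-transform the input tile and the kernel, multiply, then
# inverse-transform. The matrix constants are the standard F(2,3)
# transforms; the module structure is an illustrative assumption.
import numpy as np

B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Compute 2 outputs of a 3-tap convolution over a 4-element tile."""
    U = G @ g        # forward Winograd transform of the kernel
    V = B_T @ d      # forward Winograd transform of the input tile
    M = U * V        # elementwise product (the accelerator batches this
                     # across tiles/channels as a matrix multiplication)
    return A_T @ M   # inverse Winograd transform -> output feature tile

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 1.0, 1.0])
print(winograd_f23(d, g))                     # [6. 9.]
print(np.convolve(d, g[::-1], mode="valid"))  # same result: [6. 9.]
```

The last two lines check the transform pipeline against a direct sliding-window computation of the same two outputs.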
-
Publication number: US20190317732A1
Publication date: 2019-10-17
Application number: US16456119
Application date: 2019-06-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Bin XU , Honghui YUAN , Leijun HE
Abstract: A convolution operation chip (300) and a communications device are provided. The convolution operation chip (300) includes: an M×N multiplication accumulator array (320), including a first multiplication accumulation window, where a processing element PE_{X,Y} of the first multiplication accumulation window is configured to: perform a multiplication operation on convolutional data of the PE_{X,Y} and a convolutional parameter of the PE_{X,Y}, transmit the convolutional parameter of the PE_{X,Y} to a PE_{X,Y+1}, transmit the convolutional data of the PE_{X,Y} to a PE_{X-1,Y+1}, and respectively use the convolutional parameter of the PE_{X,Y} and the convolutional data of the PE_{X,Y} as multipliers of multiplication operations performed by the PE_{X,Y+1} and the PE_{X-1,Y+1}; a data cache module (310), configured to transmit convolutional data and a convolutional parameter to the first multiplication accumulation window; and an output control module (330), configured to output a convolutional result.
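A toy cycle-by-cycle simulation may help visualize the dataflow: each processing element multiplies its local operands, then forwards the parameter one column right (to PE_{X,Y+1}) and the data one column right and one row up (to PE_{X-1,Y+1}). The grid size, timing, and accumulation policy below are illustrative assumptions about how such a systolic array could be modeled, not the chip's actual schedule.

```python
# Toy simulation of the dataflow the abstract describes: each PE
# multiplies its local data by its local parameter, forwards the
# parameter to PE[x][y+1], and forwards the data to PE[x-1][y+1].
# Grid size and timing are illustrative assumptions.

class PE:
    def __init__(self):
        self.data = None   # convolutional data held this cycle
        self.param = None  # convolutional parameter held this cycle
        self.acc = 0.0     # running accumulation of products

def step(grid):
    """Advance the M x N array by one cycle."""
    M, N = len(grid), len(grid[0])
    # 1) every PE holding both operands multiplies and accumulates
    for row in grid:
        for pe in row:
            if pe.data is not None and pe.param is not None:
                pe.acc += pe.data * pe.param
    # 2) operands move: param -> PE[x][y+1], data -> PE[x-1][y+1]
    nxt = [[PE() for _ in range(N)] for _ in range(M)]
    for x in range(M):
        for y in range(N):
            nxt[x][y].acc = grid[x][y].acc
            if y + 1 < N:
                nxt[x][y + 1].param = grid[x][y].param
                if x - 1 >= 0:
                    nxt[x - 1][y + 1].data = grid[x][y].data
    return nxt

# Feed one data/param pair into PE[1][0] and run one cycle.
grid = [[PE() for _ in range(3)] for _ in range(2)]
grid[1][0].data, grid[1][0].param = 3.0, 2.0
grid = step(grid)      # param moved to PE[1][1], data to PE[0][1]
print(grid[1][0].acc)  # 6.0
```

In the real chip, the data cache module (310) would stream fresh operands into the window's edge every cycle so that all processing elements stay busy.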
-
Publication number: US20180083864A1
Publication date: 2018-03-22
Application number: US15824032
Application date: 2017-11-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xianbo CHEN , Honghui YUAN , Binbin YAO
IPC: H04L12/725 , H04L12/715 , H04L12/741 , G06F9/48
CPC classification number: H04L45/306 , G06F9/4881 , G06F9/546 , H04L45/54 , H04L45/64
Abstract: A data processing method is disclosed. The method includes: receiving a request message that is sent from a host service layer and transparently transmitted through a host driver layer, where the request message includes at least one acceleration type identifier and service data to be processed with acceleration, and each acceleration type identifier corresponds to one type of accelerated processing; and performing, on the service data, at least one type of accelerated processing in a one-to-one correspondence with the at least one acceleration type identifier. In this method, interaction between the host service layer and the hardware processing unit does not require coordination by a specialized driver, so the service layer's dependence on a specific underlying driver can be shielded.
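As a rough illustration of the message-driven dispatch described above, the sketch below models a request message carrying acceleration type identifiers and service data, with each identifier mapped to one type of accelerated processing. The identifiers, registry, and handler functions are invented for illustration only.

```python
# Hedged sketch of the dispatch the abstract describes: a request
# message carries acceleration type identifiers plus service data,
# and each identifier selects one kind of accelerated processing.
# The identifiers, registry, and handlers are illustrative only.
from dataclasses import dataclass

# Hypothetical registry: acceleration type identifier -> handler.
ACCELERATORS = {
    "compress": lambda data: f"compressed({data})",
    "encrypt":  lambda data: f"encrypted({data})",
}

@dataclass
class RequestMessage:
    accel_type_ids: list  # at least one acceleration type identifier
    service_data: str     # the service data to be accelerated

def process(msg: RequestMessage) -> str:
    """Apply the accelerated processing matching each identifier, in order."""
    data = msg.service_data
    for type_id in msg.accel_type_ids:
        data = ACCELERATORS[type_id](data)
    return data

print(process(RequestMessage(["compress", "encrypt"], "payload")))
# encrypted(compressed(payload))
```

Because the message itself names the processing types, new handlers can be registered on the hardware side without changing the host driver layer, which is the decoupling the abstract emphasizes.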