NEURAL NETWORK SCHEDULING METHOD AND APPARATUS

    Publication No.: US20230085718A1

    Publication Date: 2023-03-23

    Application No.: US18070054

    Application Date: 2022-11-28

    Abstract: A neural network scheduling method and apparatus are provided. One example method includes: determining a first batch size corresponding to each layer of one or more layers in a neural network; forming, through grouping based on the first batch size, the neural network into a neural network including at least one first layer group; forming, through grouping based on a grouping result of the first layer group, the neural network into a neural network including at least one second layer group; and scheduling the neural network based on a grouping result of the second layer group.
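The first grouping stage described above can be sketched in a few lines. This is a hypothetical illustration only, assuming (the abstract does not say) that a first layer group is formed by merging consecutive layers that share the same first batch size; the function name and list-of-indices representation are invented for the sketch.

```python
# Hypothetical sketch of the first grouping stage: merge consecutive
# layers whose first batch sizes are equal into one first layer group.
# Each group is represented as a list of layer indices.
def group_by_batch(batch_sizes):
    groups, current = [], [0]
    for i in range(1, len(batch_sizes)):
        if batch_sizes[i] == batch_sizes[i - 1]:
            current.append(i)          # same batch size: extend the group
        else:
            groups.append(current)     # batch size changed: close the group
            current = [i]
    groups.append(current)
    return groups

# Six layers with per-layer batch sizes 4, 4, 2, 2, 2, 8
# form three first layer groups.
assert group_by_batch([4, 4, 2, 2, 2, 8]) == [[0, 1], [2, 3, 4], [5]]
```

The second stage would then group these first layer groups again (for example, by a memory or scheduling constraint) before the network is scheduled group by group.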

    NEURAL NETWORK ACCELERATOR, ACCELERATION METHOD, AND APPARATUS

    Publication No.: US20230236891A1

    Publication Date: 2023-07-27

    Application No.: US18191134

    Application Date: 2023-03-28

    CPC classification number: G06F9/5027 G06F17/144 G06F7/523 G06F7/78

    Abstract: A neural network accelerator is provided, including: a preprocessing module (301), configured to perform a first forward Winograd transform on a target matrix corresponding to an input feature map, to obtain a transformed target matrix, where the preprocessing module (301) is further configured to perform a second forward Winograd transform on a convolution kernel, to obtain a transformed convolution kernel; a matrix operation module (302), configured to perform a matrix multiplication operation on a first matrix and a second matrix, to obtain a multiplication result, where the first matrix is constructed based on the transformed target matrix, and the second matrix is constructed based on the transformed convolution kernel; and a vector operation module (303), configured to perform an inverse Winograd transform on the multiplication result, to obtain an output feature map.
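The forward-transform / element-wise-multiply / inverse-transform pipeline in this abstract is the standard Winograd convolution structure. As a minimal, hedged sketch (not the accelerator's actual datapath), the 1-D F(2,3) case below uses the well-known transform matrices to compute two convolution outputs with four multiplications instead of six:

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices: BT transforms the input
# tile, G transforms the kernel, AT is the inverse (output) transform.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """d: input tile of 4 samples, g: 3-tap kernel -> 2 outputs."""
    U = G @ g      # forward transform of the kernel
    V = BT @ d     # forward transform of the input tile
    M = U * V      # element-wise multiplication (4 products)
    return AT @ M  # inverse transform yields 2 outputs

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -1.0])
# Reference: direct sliding-window correlation (CNN-style convolution).
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

In the 2-D accelerator case, the element-wise step across many tiles and channels is what the matrix operation module (302) batches into matrix multiplications.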

    CONVOLUTION OPERATION CHIP AND COMMUNICATIONS DEVICE

    Publication No.: US20190317732A1

    Publication Date: 2019-10-17

    Application No.: US16456119

    Application Date: 2019-06-28

    Abstract: A convolution operation chip (300) and a communications device are provided. The convolution operation chip (300) includes: an M×N multiplication accumulator array (320), including a first multiplication accumulation window, where a processing element PE(X,Y) of the first multiplication accumulation window is configured to: perform a multiplication operation on convolutional data of the PE(X,Y) and a convolutional parameter of the PE(X,Y), transmit the convolutional parameter of the PE(X,Y) to a PE(X,Y+1), transmit the convolutional data of the PE(X,Y) to a PE(X−1,Y+1), and respectively use the convolutional parameter of the PE(X,Y) and the convolutional data of the PE(X,Y) as multipliers of multiplication operations performed by the PE(X,Y+1) and the PE(X−1,Y+1); a data cache module (310), configured to transmit convolutional data and a convolutional parameter to the first multiplication accumulation window; and an output control module (330), configured to output a convolution result.
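Whatever the dataflow between processing elements, the arithmetic the multiplication accumulation window produces is an ordinary "valid" 2-D correlation of the feature map with the kernel, with each PE contributing one product per output element. A hedged software reference model of that computation (not a model of the chip's PE wiring):

```python
import numpy as np

# Reference model only: the result a multiply-accumulate array must
# produce is a "valid" 2-D correlation of data with kernel.
def conv2d_valid(data, kernel):
    kh, kw = kernel.shape
    oh = data.shape[0] - kh + 1
    ow = data.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # One output element = sum of kh*kw products, i.e. the
            # accumulated work of one multiplication accumulation window.
            out[i, j] = np.sum(data[i:i+kh, j:j+kw] * kernel)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
assert conv2d_valid(x, k).tolist() == [[45.0, 54.0], [81.0, 90.0]]
```

The chip's contribution is how it reuses operands: each parameter is forwarded rightward (to PE(X,Y+1)) and each datum diagonally (to PE(X−1,Y+1)), so each value is loaded once from the data cache module (310) and reused across multiple products.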

    DATA PROCESSING METHOD AND APPARATUS
    Invention Application

    Publication No.: US20180083864A1

    Publication Date: 2018-03-22

    Application No.: US15824032

    Application Date: 2017-11-28

    CPC classification number: H04L45/306 G06F9/4881 G06F9/546 H04L45/54 H04L45/64

    Abstract: A data processing method is disclosed. The method includes: receiving a request message that is sent from a host service layer and transparently transmitted through a host driver layer, where the request message includes at least one acceleration type identifier and service data to be processed in an accelerated manner, and each acceleration type identifier corresponds to one type of accelerated processing; and performing, on the service data, at least one type of accelerated processing in a one-to-one correspondence with the at least one acceleration type identifier. In the method, interaction between the host service layer and the hardware processing unit does not require coordination by a specialized driver, so that the service layer's dependence on a specific underlying driver can be shielded.
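The identifier-to-processing correspondence described above is essentially a dispatch table: the request carries only opaque type identifiers, so the service layer never names a specific driver. A hypothetical sketch (the identifiers, table, and placeholder routines below are invented for illustration, not the patent's actual accelerations):

```python
# Hypothetical dispatch table: each acceleration type identifier selects
# one processing routine. The routines are placeholders standing in for
# real accelerated operations.
ACCELERATIONS = {
    1: lambda data: data.upper(),   # stand-in for one accelerated op
    2: lambda data: data[::-1],     # stand-in for another accelerated op
}

def process_request(type_ids, service_data):
    """Apply, in order, the processing selected by each type identifier."""
    for type_id in type_ids:
        service_data = ACCELERATIONS[type_id](service_data)
    return service_data

# A request carrying identifiers [1, 2] gets both processings applied.
assert process_request([1, 2], "abc") == "CBA"
```

Because the table is keyed by identifier, new acceleration types can be added on the processing side without the service layer knowing which underlying driver implements them.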
