Weight data storage method and neural network processor based on the method

    公开(公告)号:US11531889B2

    公开(公告)日:2022-12-20

    申请号:US16762810

    申请日:2018-02-28

    Abstract: Disclosed are a weight data storage method and a convolution computation method that may be implemented in a neural network. The weight data storage method comprises searching for effective weights in a weight convolution kernel matrix and acquiring an index of effective weights. The effective weights are non-zero weights, and the index of effective weights is used to mark the position of the effective weights in the weight convolution kernel matrix. The weight data storage method further comprises storing the effective weights and the index of effective weights. According to the weight data storage method and the convolution computation method of the present disclosure, storage space can be saved, and computation efficiency can be improved.

    FRACTAL TREE STRUCTURE-BASED DATA TRANSMIT DEVICE AND METHOD, CONTROL DEVICE, AND INTELLIGENT CHIP

    公开(公告)号:US20210075639A1

    公开(公告)日:2021-03-11

    申请号:US17100570

    申请日:2020-11-20

    Abstract: The present invention provides a fractal tree structure-based data transmit device and method, a control device, and an intelligent chip. The device comprises: a central node that is as a communication data center of a network-on-chip and used for broadcasting or multicasting communication data to a plurality of leaf nodes; the plurality of leaf nodes that are as communication data nodes of the network-on-chip and for transmitting the communication data to a central leaf node; and forwarder modules for connecting the central node with the plurality of leaf nodes and forwarding the communication data; the central node, the forwarder modules and the plurality of leaf nodes are connected in the fractal tree network structure, and the central node is directly connected to M the forwarder modules and/or leaf nodes, any the forwarder module is directly connected to M the next level forwarder modules and/or leaf nodes.

    Fractal tree structure-based data transmit device and method, control device, and intelligent chip

    公开(公告)号:US10904034B2

    公开(公告)日:2021-01-26

    申请号:US15781608

    申请日:2016-06-17

    Abstract: One example of a device comprises: a central node that is as a communication data center of a network-on-chip; a plurality of leaf nodes that are as communication data nodes of the network-on-chip and for transmitting the communication data to a central leaf node; forwarder modules for connecting the central node with the plurality of leaf nodes and forwarding the communication data, wherein the plurality of leaf nodes are divided into N groups, each group having the same number of leaf nodes, the central node is individually in communication connection with each group of leaf nodes by means of the forwarder module, a communication structure constituted by each group of leaf nodes has self-similarity, and the plurality of leaf nodes are in communication connection with the central node in a complete multi-way tree approach by means of the forwarder modules of multiple levels.

    FRACTAL-TREE COMMUNICATION STRUCTURE AND METHOD, CONTROL APPARATUS AND INTELLIGENT CHIP

    公开(公告)号:US20180375789A1

    公开(公告)日:2018-12-27

    申请号:US15781686

    申请日:2016-06-17

    Abstract: A communication structure comprises: a central node that is a communication data center of a network-on-chip and used for broadcasting or multicasting communication data to a plurality of leaf nodes; a plurality of leaf nodes that are communication data nodes of the network-on-chip and used for transmitting the communication data to the central node; and forwarder modules for connecting the central node with the plurality of leaf nodes and forwarding the communication data, wherein the plurality of leaf nodes are divided into N groups, each group having the same number of leaf nodes, the central node is individually in communication connection with each group of leaf nodes by means of the forwarder modules, the communication structure is a fractal-tree structure, the communication structure constituted by each group of leaf nodes has self-similarity, and the forwarder modules comprises a central forwarder module, leaf forwarder modules, and intermediate forwarder modules.

    SPLIT ACCUMULATOR FOR CONVOLUTIONAL NEURAL NETWORK ACCELERATOR

    公开(公告)号:US20210357735A1

    公开(公告)日:2021-11-18

    申请号:US17250890

    申请日:2019-05-21

    Abstract: Disclosed embodiments relate to a split accumulator for a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning by bit to obtain a weight matrix, removing slack bits in the weight matrix, allowing essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix, removing null rows in the intermediate matrix, obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; divides the kneading weight by bit into multiple weight segments, processing summation of the weight segments and the corresponding activations according to the positional information, and sending a processing result to an adder tree to obtain an output feature map by means of executing shift-and-add on the processing result.

    CONVOLUTIONAL NEURAL NETWORK ACCELERATOR

    公开(公告)号:US20210350204A1

    公开(公告)日:2021-11-11

    申请号:US17250889

    申请日:2019-05-21

    Abstract: Disclosed embodiments relate to a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning by bit to obtain a weight matrix, removing slack bits in the weight matrix, allowing essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix, removing null rows in the intermediate matrix, obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; divides the kneading weight by bit into multiple weight segments, processing summation of the weight segments and the corresponding activations according to the positional information, and sending a processing result to an adder tree to obtain an output feature map by means of executing shift-and-add on the processing result.

    NEURAL NETWORK ACCELERATOR AND OPERATION METHOD THEREOF

    公开(公告)号:US20190026626A1

    公开(公告)日:2019-01-24

    申请号:US16071801

    申请日:2016-08-09

    Abstract: A neural network accelerator and an operation method thereof applicable in the field of neural network algorithms are disclosed. The neural network accelerator comprises an on-chip storage medium for storing data externally transmitted or for storing data generated during computing; an on-chip address index module for mapping to a correct storage address on the basis of an input index when an operation is performed; a core computing module for performing a neural network operation; and a multi-ALU device for obtaining input data from the core computing module or the on-chip storage medium to perform a nonlinear operation which cannot be completed by the core computing module. By introducing a multi -ALU design into the neural network accelerator, an operation speed of the nonlinear operation is increased, such that the neural network accelerator is more efficient.

    METHOD AND DEVICE FOR ON-CHIP REPETITIVE ADDRESSING

    公开(公告)号:US20190018766A1

    公开(公告)日:2019-01-17

    申请号:US16070735

    申请日:2016-08-09

    Abstract: The present disclosure may include a method that comprises: partitioning data on an on-chip and/or an off-chip storage medium into different data blocks according to a pre-determined data partitioning principle, wherein data with a reuse distance less than a pre-determined distance threshold value is partitioned into the same data block; and a data indexing step for successively loading different data blocks to at least one on-chip processing unit according a pre-determined ordinal relation of a replacement policy, wherein the repeated data in a loaded data block being subjected to on-chip repetitive addressing. Data with a reuse distance less than a pre-determined distance threshold value is partitioned into the same data block, and the data partitioned into the same data block can be loaded on a chip once for storage, and is then used as many times as possible, so that the access is more efficient.

Patent Agency Ranking