NEAR MEMORY SPARSE MATRIX COMPUTATION IN DEEP NEURAL NETWORK

    Publication number: US20220101091A1

    Publication date: 2022-03-31

    Application number: US17550405

    Filing date: 2021-12-14

    Abstract: A DNN accelerator includes a multiplication controller controlling whether to perform matrix computation based on weight values. The multiplication controller reads a weight matrix from a WRAM in the DNN accelerator and determines a row value for a row in the weight matrix. In an embodiment where the row value is one, a first switch sends a read request to the WRAM to read weights in the row and a second switch forms a data transmission path from an IRAM in the DNN accelerator to a PE in the DNN accelerator. The PE receives the weights and input data stored in the IRAM and performs MAC operations. In an embodiment where the row value is zero, the first and second switches are not triggered. No read request is sent to the WRAM and the data transmission path is not formed. The PE will not perform any MAC operations.
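The row-gating behavior described in this abstract can be illustrated in software. The following is a minimal sketch, not the patented hardware design: the function name `gated_matvec` and the MAC counter are hypothetical, and the rule used to derive the row value (one if any weight in the row is nonzero, zero otherwise) is an assumption, since the abstract does not state how the row value is determined.

```python
def gated_matvec(weights, inputs):
    """Multiply a weight matrix by an input vector, skipping gated-off rows.

    Returns (outputs, mac_count), where mac_count is the number of
    multiply-accumulate operations actually performed.
    """
    outputs = []
    mac_count = 0
    for row in weights:
        # Assumption: the row value is 1 when the row has any nonzero weight.
        row_value = 1 if any(w != 0 for w in row) else 0
        if row_value == 0:
            # Models the untriggered switches: no weight read, no input
            # data path, and no MAC operations for this row.
            outputs.append(0)
            continue
        acc = 0
        for w, x in zip(row, inputs):
            acc += w * x  # one MAC operation
            mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


weights = [[1, 2], [0, 0], [3, 0]]
inputs = [4, 5]
outputs, mac_count = gated_matvec(weights, inputs)
```

In this example the all-zero second row is skipped entirely, so only the first and third rows contribute MAC operations.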

    EFFICIENT SOFTMAX COMPUTATION WITH NO LOSS IN ACCURACY

    Publication number: US20240320490A1

    Publication date: 2024-09-26

    Application number: US18734487

    Filing date: 2024-06-05

    CPC classification number: G06N3/08 G06N3/048

    Abstract: A modified 2-pass version of the SoftMax operation can be implemented to reduce computational cost without loss of accuracy, in particular for deep learning neural networks such as transformer-based neural networks and large language models (LLMs). The first pass is modified to include two scalar operations at the end. At the end of the first pass, a first scalar operation is performed to calculate a logarithm of the denominator, and a second scalar operation is performed to calculate an operand value based on a sum of the logarithm of the denominator and the maximum value. The second pass is modified to perform addition and exponentiation. In the second pass, the operand value is subtracted from an element of an input tensor to obtain an exponent, and a base is raised to the exponent. The second pass avoids divisions.
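The two passes described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `softmax_two_pass` is hypothetical, the base is assumed to be e, the input is assumed to be a non-empty sequence of finite numbers, and the first pass is assumed to track the maximum and the denominator together using a running rescale, which the abstract does not specify. The identity it relies on is exp(x_i - m) / d = exp(x_i - (m + log d)), where m is the maximum and d is the denominator.

```python
import math


def softmax_two_pass(x):
    """SoftMax with a division-free second pass (base e assumed)."""
    # Pass 1: running maximum m and denominator d = sum(exp(v - m)).
    m = float("-inf")
    d = 0.0
    for v in x:
        m_new = max(m, v)
        # Rescale the partial sum to the new maximum, then add this term.
        d = d * math.exp(m - m_new) + math.exp(v - m_new)
        m = m_new

    # Two scalar operations at the end of pass 1.
    log_denominator = math.log(d)
    operand = m + log_denominator

    # Pass 2: one subtraction and one exponentiation per element.
    return [math.exp(v - operand) for v in x]


probs = softmax_two_pass([1.0, 2.0, 3.0])
```

Because the operand already folds in the logarithm of the denominator, each output costs one subtraction and one exponentiation, and subtracting the maximum keeps the exponent non-positive so large inputs do not overflow.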
