Activation function computation for neural networks

    公开(公告)号:US11562235B2

    公开(公告)日:2023-01-24

    申请号:US16797587

    申请日:2020-02-21

    Abstract: A computer-implemented method for improving the efficiency of computing an activation function in a neural network system includes initializing, by a controller, weights in a weight vector associated with the neural network system. Further, the method includes receiving, by the controller, an input vector of input values for computing a dot product with the weight vector for the activation function, which determines an output value of a node in the neural network system. The method further includes predicting, by a rectifier linear unit (ReLU), which computes the activation function, that the output value of the node will be negative based on computing an intermediate value for computing the dot product, and based on a magnitude of the intermediate value exceeding a precomputed threshold value. Further, the method includes, in response to the prediction, terminating, by the ReLU, the computation of the dot product, and outputting a 0 as the output value.

    Loop management in multi-processor dataflow architecture

    公开(公告)号:US11138010B1

    公开(公告)日:2021-10-05

    申请号:US17060788

    申请日:2020-10-01

    Abstract: Embodiments of the present invention include a computer system that manages execution of one or more programs with one or more loops where each loop having a loop level. Embodiments that manage loops that can skip execution and the number of loops changing during execution are also disclosed. A loop level register (LLEV) stores the loop level for a currently executing loop. A Loop-Back Program Counter Register (LBPR) has a table of one or more Loop-Back Registers. Each Loop-Back Register stores the loop level for a LBPR respective loop and a loop back PC location for the LBPR respective loop. A Program Counter points back to the PC location for each iteration of the loop. A Loop Current Count Register table (LCCR) tracks a number of iterations remaining to executed for of the loop. A loop management process causes one of the CPUs to execute all the one or more instructions of an iteration of the currently executing program loop. When all iterations of the executing loop are complete, the LLEV is decremented to a next loop level that contained the executed loop.

    MATRIX MULTIPLICATION ON A SYSTOLIC ARRAY
    86.
    发明申请

    公开(公告)号:US20200012706A1

    公开(公告)日:2020-01-09

    申请号:US16576144

    申请日:2019-09-19

    Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.

    Computational Efficiency in Symbolic Sequence Analytics Using Random Sequence Embeddings

    公开(公告)号:US20190340542A1

    公开(公告)日:2019-11-07

    申请号:US15972108

    申请日:2018-05-04

    Abstract: A method and system of analyzing a symbolic sequence is provided. Metadata of a symbolic sequence is received from a computing device of an owner. A set of R random sequences are generated based on the received metadata and sent to the computing device of the owner of the symbolic sequence for computation of a feature matrix based on the set of R random sequences and the symbolic sequence. The feature matrix is received from the computing device of the owner. Upon determining that an inner product of the feature matrix is below a threshold accuracy, the iterative process returns to generating R random sequences. Upon determining that the inner product of the feature matrix is at or above the threshold accuracy, the feature matrix is categorized based on machine learning. The categorized global feature matrix is sent to be displayed on a user interface of the computing device of the owner.

    MATRIX MULTIPLICATION ON A SYSTOLIC ARRAY
    88.
    发明申请

    公开(公告)号:US20190236113A1

    公开(公告)日:2019-08-01

    申请号:US16381530

    申请日:2019-04-11

    CPC classification number: G06F17/16

    Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.

    Matrix multiplication on a systolic array

    公开(公告)号:US10241972B2

    公开(公告)日:2019-03-26

    申请号:US15460755

    申请日:2017-03-16

    Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.

Patent Agency Ranking