Forward tensor and activation scaling for lower precision neural networks

    Publication Number: US12153930B2

    Publication Date: 2024-11-26

    Application Number: US17565391

    Application Date: 2021-12-29

    Inventor: Hai Xiao

    Abstract: A processing device is provided which comprises memory configured to store data and a processor configured to execute a forward activation of a neural network using a low precision floating point (FP) format, scale up values of numbers represented in the low precision FP format, and process the scaled-up values as non-zero values for those numbers. The processor is configured to scale up the values of one or more numbers, via scaling parameters, to a scaled-up value equal to or greater than a floor of the dynamic range of the low precision FP format. The scaling parameters are, for example, static parameters or, alternatively, parameters determined during execution of the neural network.
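
    The scaling idea in this abstract can be sketched in a few lines: values too small for the low precision format would otherwise be flushed to zero, so they are scaled up past the format's dynamic range floor before the low precision step and scaled back afterward. The NumPy sketch below is illustrative only; the FP_FLOOR constant (the smallest positive normal of an FP8 E4M3-style format), the flush_to_zero stand-in for the low precision cast, and the power-of-two scale rule are assumptions, not the patent's implementation.

        import numpy as np

        # Smallest positive normal of an FP8 E4M3-style format (an assumed
        # example; the abstract does not name a specific low precision format).
        FP_FLOOR = 2.0 ** -6

        def flush_to_zero(x):
            # Stand-in for the low precision cast: magnitudes below the
            # format's dynamic range floor are lost to zero.
            return np.where(np.abs(x) < FP_FLOOR, 0.0, x)

        def scaled_forward_activation(x, scale=None):
            # Scale values up so small magnitudes survive the low precision
            # step as non-zero, then undo the scaling. `scale` may be a
            # static parameter or derived from the tensor at run time,
            # the two alternatives the abstract mentions.
            if scale is None:
                smallest = np.abs(x[x != 0]).min()
                # Smallest power of two lifting `smallest` to at least FP_FLOOR.
                scale = 2.0 ** max(0.0, np.ceil(np.log2(FP_FLOOR / smallest)))
            y = flush_to_zero(x * scale)  # scaled values sit at or above the floor
            return y / scale              # rescale back in higher precision

        x = np.array([3e-4, 0.02, -5e-3, 0.8])
        print(flush_to_zero(x))              # small entries are lost to zero
        print(scaled_forward_activation(x))  # small entries are preserved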

    FORWARD TENSOR AND ACTIVATION SCALING FOR LOWER PRECISION NEURAL NETWORKS

    Publication Number: US20230205544A1

    Publication Date: 2023-06-29

    Application Number: US17565391

    Application Date: 2021-12-29

    Inventor: Hai Xiao

    CPC classification number: G06F9/3887 G06F9/3555 G06K9/6256 G06N3/04

    Abstract: A processing device is provided which comprises memory configured to store data and a processor configured to execute a forward activation of a neural network using a low precision floating point (FP) format, scale up values of numbers represented in the low precision FP format, and process the scaled-up values as non-zero values for those numbers. The processor is configured to scale up the values of one or more numbers, via scaling parameters, to a scaled-up value equal to or greater than a floor of the dynamic range of the low precision FP format. The scaling parameters are, for example, static parameters or, alternatively, parameters determined during execution of the neural network.

    Neural Network Activation Scaled Clipping Layer

    Publication Number: US20230409868A1

    Publication Date: 2023-12-21

    Application Number: US17844204

    Application Date: 2022-06-20

    CPC classification number: G06N3/04 G06N3/08

    Abstract: Activation scaled clipping layers for neural networks are described. An activation scaled clipping layer processes the output of a neuron in a neural network using a scaling parameter and a clipping parameter. The scaling parameter defines how numerical values are amplified relative to zero. The clipping parameter specifies a numerical threshold; if the neuron output satisfies the threshold, the output is expressed as the value defined by that threshold. In some implementations, the scaling parameter is linear and treats numbers within a numerical range as equivalent, such that any number in the range is scaled by a defined magnitude, regardless of value. Alternatively, the scaling parameter is nonlinear, which causes the activation scaled clipping layer to amplify numbers within a range by different magnitudes. Each scaling and clipping parameter is learnable during training of a machine learning model implementing the neural network.
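
    A compact sketch of such a layer follows. The class name, the parameter defaults, and the square-root form used for the nonlinear mode are illustrative assumptions; the abstract only specifies that a learnable scaling parameter amplifies values relative to zero and a learnable clipping parameter caps outputs at a threshold.

        import numpy as np

        class ActivationScaledClippingLayer:
            # Illustrative forward pass only; in a real model both `scale`
            # and `clip` would be trainable parameters updated by gradients.
            def __init__(self, scale=2.0, clip=6.0, nonlinear=False):
                self.scale = scale          # learnable scaling parameter
                self.clip = clip            # learnable clipping threshold
                self.nonlinear = nonlinear  # select one of the two scaling modes

            def __call__(self, x):
                if self.nonlinear:
                    # Nonlinear mode: numbers within a range are amplified by
                    # different magnitudes (square-root compression is an
                    # assumed example of such a curve).
                    y = np.sign(x) * self.scale * np.sqrt(np.abs(x))
                else:
                    # Linear mode: every number in the range is amplified by
                    # the same defined magnitude relative to zero.
                    y = self.scale * x
                # Outputs that satisfy the threshold are expressed as the
                # threshold value itself.
                return np.clip(y, -self.clip, self.clip)

        layer = ActivationScaledClippingLayer(scale=2.0, clip=6.0)
        print(layer(np.array([-5.0, -1.0, 0.5, 4.0])))  # [-6. -2.  1.  6.]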
