METHOD AND SYSTEM FOR EMULATING A FLOATING-POINT UNIT

    Publication No.: WO2023035053A1

    Publication date: 2023-03-16

    Application No.: PCT/CA2021/051241

    Filing date: 2021-09-08

    Abstract: Systems and methods for emulating a floating-point unit are disclosed. The method receives one or more floating-point operands having a first floating-point format. Each of the one or more floating-point operands having the first floating-point format is converted into a first set of integers having the first floating-point format. Further, each of the first set of integers is converted into a second set of integers having a second floating-point format that is different from the first floating-point format. The first set of integers and the second set of integers each has a defined bit length depending on the respective floating-point format. Lastly, the method performs computations for a task using each of the second set of integers to emulate computations performed by the floating-point unit using the one or more floating-point operands having the second floating-point format.
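    The conversion chain in this abstract can be illustrated with a minimal sketch. The abstract does not name the two formats, so the choice of IEEE 754 binary32 as the first format and bfloat16 as the second is an assumption, as are all function names below. The multiply handles normal numbers only (no zeros, subnormals, infinities or NaNs) and truncates instead of rounding, so it is not a complete emulation.

```python
import struct

def f32_to_bits(x: float) -> int:
    """First set of integers: the 32-bit pattern of x stored as binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(b: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 value (for inspection only)."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def f32_bits_to_bf16_bits(b: int) -> int:
    """Second set of integers: a 16-bit bfloat16 pattern.

    Uses round-to-nearest-even on the 16 discarded bits.
    NaN inputs are not treated specially in this sketch.
    """
    rounding_bias = 0x7FFF + ((b >> 16) & 1)
    return ((b + rounding_bias) >> 16) & 0xFFFF

def bf16_bits_to_float(b: int) -> float:
    """Widen a bfloat16 pattern back to a Python float (for inspection only)."""
    return bits_to_f32((b & 0xFFFF) << 16)

def bf16_mul(a: int, b: int) -> int:
    """Multiply two bfloat16 patterns using integer operations only.

    Assumes both inputs and the result are normal numbers; truncates the product.
    """
    sign = ((a ^ b) >> 15) & 1
    exp = ((a >> 7) & 0xFF) + ((b >> 7) & 0xFF) - 127   # remove one bias
    # Restore the hidden leading 1 and multiply the 8-bit significands.
    man = ((a & 0x7F) | 0x80) * ((b & 0x7F) | 0x80)
    if man & 0x8000:        # product is in [2, 4): renormalise
        man >>= 1
        exp += 1
    man = (man >> 7) & 0x7F  # keep 7 fraction bits, drop the hidden bit
    return (sign << 15) | ((exp & 0xFF) << 7) | man

x = f32_bits_to_bf16_bits(f32_to_bits(1.5))
y = f32_bits_to_bf16_bits(f32_to_bits(2.0))
product = bf16_mul(x, y)
print(hex(product), bf16_bits_to_float(product))
```

    Every step after the initial reinterpretation works on plain integers of a fixed bit length (32 bits, then 16 bits), which is the property the abstract relies on.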

    TRAINING METHOD FOR QUANTIZING THE WEIGHTS AND INPUTS OF A NEURAL NETWORK

    Publication No.: WO2021056112A1

    Publication date: 2021-04-01

    Application No.: PCT/CA2020/051281

    Filing date: 2020-09-24

    Abstract: A method is disclosed for training a neural network to selectively quantize weights of a filter of the neural network as either binary weights or ternary weights. A plurality of training iterations are performed that each comprise: quantizing a set of real-valued weights of a filter to generate a corresponding set of quantized weights; generating an output feature tensor based on matrix multiplication of an input feature tensor and the set of quantized weights; computing, based on the output feature tensor, a loss based on a regularization function that is configured to move the loss towards a minimum value when either: (i) the quantized weights move towards binary weights, or (ii) the quantized weights move towards ternary weights; computing a gradient with an objective of minimizing the loss; and updating the real-valued weights based on the computed gradient. When the training iterations are complete, a set of weights quantized from the updated real-valued weights is stored as either a set of binary weights or a set of ternary weights.
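    The training iteration listed in this abstract can be sketched for a single one-dimensional filter. The abstract does not give the regularization function, the quantizer, or the gradient rule, so everything below is an assumption made for illustration: a threshold quantizer onto {-1, 0, +1}, a regularizer w^2 (|w| - 1)^2 that is zero exactly at those three levels, a squared-error task loss, and a straight-through estimator that passes the gradient of the quantized weights to the real-valued ones. The rule that labels the result binary when no weight quantizes to zero is likewise hypothetical.

```python
def quantize(weights, threshold=0.5):
    """Map real-valued weights onto {-1, 0, +1} (assumed quantizer)."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]

def regularizer(weights):
    """Assumed penalty: zero only when every weight is -1, 0 or +1."""
    return sum((w * (abs(w) - 1.0)) ** 2 for w in weights)

def regularizer_grad(weights):
    """Derivative of w^2 (|w| - 1)^2 with respect to each weight."""
    grads = []
    for w in weights:
        a = abs(w)
        s = 1.0 if w >= 0 else -1.0
        grads.append(s * 2.0 * a * (a - 1.0) * (2.0 * a - 1.0))
    return grads

def train(samples, targets, weights, steps=200, lr=0.05, lam=0.1):
    """Run the iteration: quantize, forward pass, loss, gradient, update."""
    weights = list(weights)
    n = len(samples)
    loss = None
    for _ in range(steps):
        q = quantize(weights)                        # quantized weights
        grad = [0.0] * len(weights)
        task_loss = 0.0
        for x, t in zip(samples, targets):
            y = sum(xi * qi for xi, qi in zip(x, q))  # output from quantized weights
            err = y - t
            task_loss += err * err / n
            for i, xi in enumerate(x):
                # Straight-through estimator: treat d(q)/d(w) as 1.
                grad[i] += 2.0 * err * xi / n
        loss = task_loss + lam * regularizer(weights)
        reg_grad = regularizer_grad(weights)
        weights = [w - lr * (g + lam * r)
                   for w, g, r in zip(weights, grad, reg_grad)]
    q = quantize(weights)
    kind = "ternary" if 0 in q else "binary"         # hypothetical storage rule
    return q, kind, loss

# Toy data generated from the filter [1, 0, -1] (illustrative only).
samples = [[1.0, 0.5, -0.5], [0.2, -1.0, 0.3], [-0.7, 0.1, 0.9], [0.4, 0.8, 0.6]]
targets = [x[0] - x[2] for x in samples]
stored, kind, final_loss = train(samples, targets, [0.3, 0.6, -0.2])
print(stored, kind, final_loss)
```

    Only the real-valued weights are updated during training; the quantized copy is recomputed from them at the start of every iteration and once more at the end, when it is stored.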
