-
Publication No.: WO2023035053A1
Publication Date: 2023-03-16
Application No.: PCT/CA2021/051241
Application Date: 2021-09-08
Applicant: GHAFFARI, Seyed Alireza; WU, Wei Hsiang; PARTOVI NIA, Vahid; HUAWEI TECHNOLOGIES CO., LTD.
Inventor: GHAFFARI, Seyed Alireza; WU, Wei Hsiang; PARTOVI NIA, Vahid
Abstract: Systems and methods for emulating a floating-point unit are disclosed. The method receives one or more floating-point operands having a first floating-point format. Each of the one or more floating-point operands having the first floating-point format is converted into a first set of integers having the first floating-point format. Further, each of the first set of integers is converted into a second set of integers having a second floating-point format that is different from the first floating-point format. The first set of integers and the second set of integers each has a defined bit length depending on the respective floating-point format. Lastly, the method performs computations for a task using each of the second set of integers to emulate computations performed by the floating-point unit using the one or more floating-point operands having the second floating-point format.
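Illustrative sketch (not taken from the application): the abstract describes turning floating-point operands into sets of integers and then computing on those integers in place of a floating-point unit. The Python below is one possible reading under stated assumptions. It assumes the first format is IEEE 754 binary32 and the second is a bfloat16-style layout (1 sign bit, 8 exponent bits, 7 mantissa bits); the abstract names neither format. All function names are hypothetical, rounding is by truncation, and only normal, nonzero numbers without exponent overflow are handled.

```python
import struct

def float32_fields(x):
    """Reinterpret x as IEEE 754 binary32 and split it into integer fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF          # 23-bit fraction
    return sign, exponent, mantissa

def to_bfloat16_fields(sign, exponent, mantissa):
    """Assumed second format: keep the top 7 of the 23 mantissa bits (truncation)."""
    return sign, exponent, mantissa >> 16

def emulated_multiply(a, b):
    """Multiply two (sign, exponent, mantissa) triples using integer ops only.

    Assumption: both operands are normal and the result exponent stays in range.
    """
    sa, ea, ma = a
    sb, eb, mb = b
    sign = sa ^ sb
    pa = (1 << 7) | ma                  # restore the implicit leading 1
    pb = (1 << 7) | mb
    product = pa * pb                   # lies in [2**14, 2**16)
    exponent = ea + eb - 127            # remove one copy of the bias
    if product & (1 << 15):             # significand product >= 2: renormalize
        product >>= 1
        exponent += 1
    mantissa = (product >> 7) & 0x7F    # drop the leading 1, keep 7 fraction bits
    return sign, exponent, mantissa

def bfloat16_fields_to_float(sign, exponent, mantissa):
    """Rebuild a Python float from the fields, for inspection only."""
    bits = (sign << 31) | (exponent << 23) | (mantissa << 16)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

a = to_bfloat16_fields(*float32_fields(1.5))
b = to_bfloat16_fields(*float32_fields(2.0))
print(bfloat16_fields_to_float(*emulated_multiply(a, b)))  # 3.0
```

The multiplication itself touches only integers; floating-point conversion appears only at the boundaries, which is the property the abstract emphasizes.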
-
Publication No.: WO2021056112A1
Publication Date: 2021-04-01
Application No.: PCT/CA2020/051281
Application Date: 2020-09-24
Applicant: PARTOVI NIA, Vahid; RAZANI, Ryan; HUAWEI TECHNOLOGIES CO., LTD.
Inventor: PARTOVI NIA, Vahid; RAZANI, Ryan
IPC: G06N3/08, G06F17/16, G06N3/0454, G06N3/0481, G06N3/063, G06N3/084, G06N5/046
Abstract: Training a neural network to selectively quantize weights of a filter of the neural network as either binary weights or ternary weights. A plurality of training iterations are performed that each comprise: quantizing a set of real-valued weights of a filter to generate a corresponding set of quantized weights; generating an output feature tensor based on matrix multiplication of an input feature tensor and the set of quantized weights; computing, based on the output feature tensor, a loss based on a regularization function that is configured to move the loss towards a minimum value when either: (i) the quantized weights move towards binary weights, or (ii) the quantized weights move towards ternary weights; computing a gradient with an objective of minimizing the loss; and updating the real-valued weights based on the computed gradient. When the training iterations are complete, a set of weights quantized from the updated real-valued weights is stored as either a set of binary weights or a set of ternary weights.
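Illustrative sketch (not taken from the application): the training iteration described in the abstract can be mimicked for a single filter in plain Python. Everything specific here is an assumption: the regularizer R(w) = w^2 (|w| - 1)^2, which is zero exactly at -1, 0 and +1; the 0.5 quantization threshold; the squared-error task loss; the straight-through gradient estimate; and every function name. The abstract does not state the actual regularization function or quantizer.

```python
def quantize_ternary(weights, threshold=0.5):
    """Map each real-valued weight to -1, 0 or +1 (threshold is an assumption)."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]

def regularizer(weights):
    """Assumed penalty: zero when every weight is exactly -1, 0 or +1."""
    return sum((w * (abs(w) - 1.0)) ** 2 for w in weights)

def regularizer_grad(w):
    """Derivative of w^2 (|w| - 1)^2 with respect to a single weight w."""
    a = abs(w)
    return 2.0 * w * (a - 1.0) * (2.0 * a - 1.0)

def train(weights, inputs, targets, lr=0.05, lam=0.1, steps=200):
    """Repeat: quantize, forward pass, loss gradient, update real-valued weights."""
    w = list(weights)
    for _ in range(steps):
        q = quantize_ternary(w)
        grad = [0.0] * len(w)
        for x, y in zip(inputs, targets):
            out = sum(xi * qi for xi, qi in zip(x, q))   # forward pass uses quantized weights
            err = out - y
            for i, xi in enumerate(x):
                # Straight-through estimate: treat d(q)/d(w) as 1 (an assumption).
                grad[i] += 2.0 * err * xi / len(inputs)
        w = [wi - lr * (g + lam * regularizer_grad(wi)) for wi, g in zip(w, grad)]
    return w

def storage_kind(quantized):
    """Store as binary when no weight quantized to zero, otherwise as ternary."""
    return "binary" if 0 not in quantized else "ternary"

inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
targets = [1.0, 0.0, -1.0]
final = quantize_ternary(train([0.8, 0.1, -0.9], inputs, targets))
print(final, storage_kind(final))  # [1, 0, -1] ternary
```

In this toy run the task error is already zero, so the regularizer alone pulls each real-valued weight toward its nearest value in {-1, 0, +1}; a filter whose weights all settle at -1 or +1 would instead be stored as binary.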
-