Forward tensor and activation scaling for lower precision neural networks
Abstract:
A processing device is provided that comprises memory configured to store data and a processor configured to execute a forward activation of a neural network using a low-precision floating point (FP) format, scale up values of numbers represented by the low-precision FP format, and process the scaled-up values as non-zero values for those numbers. The processor is configured to scale up the values of one or more numbers, via scaling parameters, to a scaled-up value equal to or greater than a floor of a dynamic range of the low-precision FP format. The scaling parameters are, for example, static parameters or, alternatively, parameters determined during execution of the neural network.
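The scaling described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes NumPy's float16 as the low precision FP format, takes the smallest positive normal float16 value as the floor of the dynamic range, and determines a power-of-two scaling parameter at run time from the tensor itself. The names FP16_FLOOR, choose_scale and to_low_precision are hypothetical and do not appear in the source.

```python
import numpy as np

# Assumption: float16 stands in for the "low precision FP format", and the
# "floor of the dynamic range" is taken to be its smallest positive normal
# value (2**-14). The abstract does not fix either choice.
FP16_FLOOR = float(np.finfo(np.float16).tiny)


def choose_scale(values, floor=FP16_FLOOR):
    """Return a power-of-two scale that lifts the smallest non-zero
    magnitude in `values` to at least `floor` (hypothetical helper)."""
    mags = np.abs(values[values != 0])
    if mags.size == 0:
        return 1.0  # nothing to scale
    smallest = float(mags.min())
    if smallest >= floor:
        return 1.0  # already inside the representable range
    exponent = int(np.ceil(np.log2(floor / smallest)))
    return float(2.0 ** exponent)


def to_low_precision(values, scale):
    """Scale up in float32, then cast to the low precision format."""
    return (values * np.float32(scale)).astype(np.float16)


# Activations small enough that some of them vanish in float16.
activations = np.array([1e-8, 2e-8, 0.0, 3e-6], dtype=np.float32)

unscaled = activations.astype(np.float16)       # small entries become 0
scale = choose_scale(activations)               # determined during execution
scaled = to_low_precision(activations, scale)   # small entries stay non-zero
```

In this sketch the first two activations are rounded to zero by a direct cast but remain non-zero after scaling. A static scaling parameter would replace the call to choose_scale with a fixed constant. A practical scheme would also need to keep the largest scaled value below the float16 maximum and to account for the scale factor in later computation, neither of which is shown here.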