Block floating point for neural network implementations
摘要:
Apparatus and methods are disclosed for performing block floating-point (BFP) operations, including in implementations of neural networks. All or a portion of one or more matrices or vectors can share one or more common exponents. Techniques are disclosed for selecting the shared common exponents. In some examples of the disclosed technology, a method includes producing BFP representations of matrices or vectors, at least two elements of the respective matrices or vectors sharing a common exponent, performing a mathematical operation on two or more of the plurality of matrices or vectors, and producing an output matrix or vector. Based on the output matrix or vector, one or more updated common exponents are selected, and an updated matrix or vector is produced having some elements that share the updated common exponents.
公开/授权文献
信息查询
0/0