FOUR-BIT TRAINING FOR MACHINE LEARNING

    公开(公告)号:US20220180171A1

    公开(公告)日:2022-06-09

    申请号:US17112528

    申请日:2020-12-04

    摘要: An apparatus includes a floating-point gradient register; an integer register; a memory bank; and an array of processing units. Each of the units includes a plurality of binary shifters having an integer input configured to obtain corresponding bits of a 4-bit integer multiplicand, and a shift-specifying input configured to obtain corresponding bits in an exponent field of a 4-bit floating point multiplier. The multiplier is specified in a mantissaless four-bit floating point format including a sign bit, three exponent bits, and no mantissa bits. An adder tree has a plurality of inputs coupled to outputs of the plurality of shifters, and a rounder has an input coupled to an output of the adder tree. The integer inputs are connected to the integer register; the shift-specifying inputs are connected to the floating-point gradient register; and outputs of the rounders are coupled to the memory bank.