PROCESSING OF ASYMMETRICALLY QUANTIZED INPUT AND KERNEL COEFFICIENTS IN NEURAL NETWORK PROCESSOR

    Publication number: US20240329929A1

    Publication date: 2024-10-03

    Application number: US18127528

    Application date: 2023-03-28

    Applicant: Apple Inc.

    CPC classification number: G06F7/523 G06F7/50

    Abstract: Embodiments relate to performing multiply-accumulate operations on asymmetrically quantized input data and kernel data in a neural processor. Instead of adjusting the input data at a multiply-accumulator to account for the asymmetric quantization of the input data, an adjusted bias for the multiply-accumulate operation is computed beforehand and stored in the multiply-accumulator. Kernel coefficients derived from the kernel data, on the other hand, are adjusted at the multiply-accumulator to account for the asymmetric quantization. In this way, computational complexity associated with asymmetric quantization may be reduced while increasing the efficiency of the convolution operations at the neural processor.
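
The bias adjustment described in this abstract can be illustrated with a minimal sketch. It assumes the usual zero-point formulation of asymmetric quantization, which the abstract does not spell out; the function names and the values of `zx`, `zw`, and `bias` are hypothetical.

```python
def mac_naive(x, w, bias, zx, zw):
    """Reference: subtract both zero points inside the accumulation loop."""
    return bias + sum((xi - zx) * (wi - zw) for xi, wi in zip(x, w))


def adjusted_bias(w, bias, zx, zw):
    """Fold the input zero point zx into the bias ahead of time.

    sum((x - zx) * (w - zw)) == sum(x * (w - zw)) - zx * sum(w - zw),
    and the second term depends only on the kernel, so it can be precomputed.
    """
    return bias - zx * sum(wi - zw for wi in w)


def mac_adjusted(x, w, bias_adj, zw):
    """Only the kernel coefficients are adjusted at the multiply-accumulator."""
    return bias_adj + sum(xi * (wi - zw) for xi, wi in zip(x, w))


# Hypothetical uint8 data and zero points, chosen only for illustration.
x = [12, 200, 37, 90]
w = [130, 5, 250, 77]
zx, zw, bias = 128, 120, 1000

bias_adj = adjusted_bias(w, bias, zx, zw)
assert mac_adjusted(x, w, bias_adj, zw) == mac_naive(x, w, bias, zx, zw)
```

The raw input values enter the multiplier unchanged, so the per-element subtraction of `zx` disappears from the inner loop.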

    MULTIPLY-ACCUMULATE CIRCUIT AND METHOD FOR PERFORMING MULTIPLY-ACCUMULATE OPERATIONS

    Publication number: US20240296012A1

    Publication date: 2024-09-05

    Application number: US18588205

    Application date: 2024-02-27

    CPC classification number: G06F7/5443 G06F7/50 G06F7/5306

    Abstract: A multiply-accumulate circuit for processing numerical values that are present as input words, each of which is formed from at least two partial words. The circuit is configured, corresponding to a permutation selected from a plurality of permutation possibilities implemented by the multiply-accumulate circuit, to form product partial words as products of in each case one partial word of the first input word with one partial word of the second input word, wherein in the products, the partial words of the first input word are permutated relative to their original order corresponding to the selected permutation; and to add the product partial words with an accumulation word, which is formed from one or more partial words, to determine an updated accumulation word in which product partial words are in each case added to one of the one or more partial words of the accumulation word.
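
One possible reading of this abstract, as a behavioral sketch rather than a description of the claimed circuit: each input word is modeled as a list of partial words, and the partial words of the first input are reordered by a selected permutation before the lane-wise multiply. The function name `permuted_mac` and the assumption that the accumulation word has one partial word per product are hypothetical.

```python
def permuted_mac(a, b, acc, perm):
    """Permuted partial-word multiply-accumulate.

    a, b : input words, given as equal-length lists of partial words
    acc  : accumulation word, one partial word per product (an assumption)
    perm : tuple of indices; a[perm[i]] is multiplied with b[i]
    """
    if sorted(perm) != list(range(len(a))):
        raise ValueError("perm must be a permutation of the partial-word indices")
    products = [a[p] * b_i for p, b_i in zip(perm, b)]
    return [acc_i + prod for acc_i, prod in zip(acc, products)]


a, b = [3, 4], [5, 6]
straight = permuted_mac(a, b, [0, 0], (0, 1))  # a0*b0, a1*b1
crossed = permuted_mac(a, b, [0, 0], (1, 0))   # a1*b0, a0*b1
```

With two partial words per input, the straight and crossed permutations together yield all four partial products, which is the set needed for a complex multiplication, for example.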

    VECTOR OPERATION ACCELERATION WITH CONVOLUTION COMPUTATION UNIT

    Publication number: US20240264802A1

    Publication date: 2024-08-08

    Application number: US18638441

    Application date: 2024-04-17

    CPC classification number: G06F7/5443 G06F7/50

    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more input activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
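
The control-signal behavior of a single MAC lane can be modeled roughly as follows. This is a behavioral sketch only: the mode names (`"conv"`, `"vec_add"`, `"vec_sub"`) and the choice to deliver the second vector operand through the weight port are assumptions for illustration, not details taken from the application.

```python
def mac_lane(control, weights, activations):
    """Behavioral model of one MAC lane selected by a control signal.

    "conv"    : multiply pairwise, then reduce the products to one sum
    "vec_add" : element-wise activations + weights (adder-subtractors adding)
    "vec_sub" : element-wise activations - weights (adder-subtractors subtracting)
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    if control == "conv":
        return [sum(w * a for w, a in zip(weights, activations))]
    if control == "vec_add":
        return [a + w for w, a in zip(weights, activations)]
    if control == "vec_sub":
        return [a - w for w, a in zip(weights, activations)]
    raise ValueError(f"unknown control signal: {control!r}")


output_buffer = []
output_buffer.extend(mac_lane("conv", [1, 2, 3], [4, 5, 6]))
output_buffer.extend(mac_lane("vec_add", [1, 2, 3], [4, 5, 6]))
```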

    DATA ENCODING METHOD, DATA DECODING METHOD, AND DATA PROCESSING APPARATUS

    Publication number: US20240235577A1

    Publication date: 2024-07-11

    Application number: US18618306

    Application date: 2024-03-27

    CPC classification number: H03M7/6011 G06F7/50 G06F7/523 G06F7/72 H03M7/6005

    Abstract: This application relates to the field of artificial intelligence, and discloses a data encoding method, a data decoding method, and data processing apparatuses. Both the data encoding method and the data decoding method relate to an invertible flow-based model. The invertible flow-based model includes a target invertible flow layer, a model parameter of the target invertible flow layer is used to constrain an auxiliary variable generated in an inverse transform processing process, an operation corresponding to the target invertible flow layer includes a multiplication operation and a division operation that are determined based on the model parameter, and the auxiliary variable is an increment of a product of the multiplication operation or a remainder generated through the division operation.
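
A multiplication followed by a division can be made exactly invertible on integers by carrying an auxiliary variable, which appears to be the idea this abstract describes. The sketch below is one way to do that, not the application's actual layer: the integers `r` and `s` stand in for values derived from the model parameter, and the function names are hypothetical.

```python
def forward(x, aux, r, s):
    """Map (x, aux) with 0 <= aux < r to (y, aux_out) with 0 <= aux_out < s.

    y approximates x * r / s; the auxiliary values make the map a bijection.
    """
    if not 0 <= aux < r:
        raise ValueError("aux must lie in [0, r)")
    v = x * r + aux      # multiplication; aux is an increment of the product
    return divmod(v, s)  # division; the remainder is the new auxiliary value


def inverse(y, aux_out, r, s):
    """Exactly undo forward(): recover (x, aux) from (y, aux_out)."""
    if not 0 <= aux_out < s:
        raise ValueError("aux_out must lie in [0, s)")
    v = y * s + aux_out
    return divmod(v, r)


y, aux_out = forward(7, 2, 3, 5)        # v = 23, so (y, aux_out) = (4, 3)
assert inverse(y, aux_out, 3, 5) == (7, 2)
```

Because `r` and `s` bound the auxiliary values, the parameters constrain how much side information must be carried between the forward and inverse passes.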

    Vector operation acceleration with convolution computation unit

    Publication number: US12020001B2

    Publication date: 2024-06-25

    Application number: US18130311

    Application date: 2023-04-03

    CPC classification number: G06F7/5443 G06F7/50

    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more input activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.

    CURRENT MODE HARDWARE CORES FOR MACHINE LEARNING (ML) APPLICATIONS

    Publication number: US20240184524A1

    Publication date: 2024-06-06

    Application number: US18075366

    Application date: 2022-12-05

    CPC classification number: G06F7/523 G06F7/50

    Abstract: An apparatus includes a current-mode multiply-accumulate (MAC) core with a plurality of parallel current carrying paths. Each path is configured to carry a unit current based on a state of an input variable, a weight, and a configuration vector. The plurality of current carrying paths are arranged in groups, and each group has a summation line. Also included are a plurality of current mode interfaces. Each current mode interface of the plurality of current mode interfaces is coupled to a corresponding summation line of the plurality of summation lines. A plurality of current mode comparators are coupled to the plurality of current mode interfaces and configured to compare current on the corresponding one of the plurality of summation lines to a plurality of corresponding reference currents.

    PROCESSING CIRCUIT

    Status: Invention publication, under examination (published)

    Publication number: US20240168713A1

    Publication date: 2024-05-23

    Application number: US18485550

    Application date: 2023-10-12

    Inventor: Erich Wenger

    CPC classification number: G06F7/523 G06F7/50 G06F7/5443

    Abstract: A processing circuit including a first multiplier to multiply least significant portions of a first and a second operand, a second multiplier to multiply a sum of a most and the least significant portion of the first operand with the sum of a most and the least significant portion of the second operand and the least significant portion of the second operand, a third multiplier to multiply the most significant portions of the first and the second operand and an output circuit to determine an output sum including the result of the first multiplier, the result of the third multiplier times two to the power of two times the bit number of the least significant portions, and, if enabled, the result of the second multiplier minus the results of the first and the third multiplier, times two to the power of the bit number of the least significant portions.
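
The three-multiplier structure in this abstract resembles Karatsuba multiplication, and the output sum can be sketched on that assumption. The function name `split_multiply`, the parameter `n` (bit width of the least significant portion), and the `enable_cross` flag are hypothetical names; operands are assumed to be non-negative integers.

```python
def split_multiply(a, b, n, enable_cross=True):
    """Three-multiplier product of a and b, split at bit n (Karatsuba form)."""
    mask = (1 << n) - 1
    a_lo, a_hi = a & mask, a >> n
    b_lo, b_hi = b & mask, b >> n

    p1 = a_lo * b_lo                    # first multiplier: low portions
    p2 = (a_hi + a_lo) * (b_hi + b_lo)  # second multiplier: sums of portions
    p3 = a_hi * b_hi                    # third multiplier: high portions

    out = p1 + (p3 << (2 * n))
    if enable_cross:
        # Cross term a_hi*b_lo + a_lo*b_hi, recovered without a fourth multiply.
        out += (p2 - p1 - p3) << n
    return out


assert split_multiply(0xABCD, 0x1234, 8) == 0xABCD * 0x1234
```

With `enable_cross=False` the output holds the low-portion and high-portion products side by side, separated by `2 * n` bits, instead of the full product.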
