-
Publication No.: US20240329929A1
Publication date: 2024-10-03
Application No.: US18127528
Filing date: 2023-03-28
Applicant: Apple Inc.
Inventor: Lei Wang, Kenneth W. Waters, Michael L. Liu, Ji Liang Song, Youchang Kim
Abstract: Embodiments relate to performing a multiply-accumulator operation on asymmetrically quantized input data and kernel data in a neural processor. Instead of adjusting the input data at a multiply-accumulator to account for the asymmetric quantization of the input data, an adjusted bias for the multiply-accumulator operation is computed beforehand and stored in the multiply-accumulator. On the other hand, kernel coefficients derived from the kernel data are adjusted at the multiply-accumulator to account for the asymmetric quantization. In this way, computational complexity associated with asymmetric quantization may be reduced while increasing the efficiency of the convolution operations at the neural processor.
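The bias adjustment described in this abstract can be illustrated with a small arithmetic sketch. It is not taken from the patent: the function names, the zero-point symbols `zx` and `zw`, and the integer data are assumptions chosen for illustration. It relies only on the identity sum((x - zx)(w - zw)) + b = sum(x(w - zw)) + (b - zx * sum(w - zw)), so the input zero point can be folded into a bias that is computed once per kernel.

```python
def mac_naive(x_q, w_q, zx, zw, bias):
    """Reference: subtract both zero points inside the accumulation loop."""
    acc = bias
    for x, w in zip(x_q, w_q):
        acc += (x - zx) * (w - zw)
    return acc


def adjusted_bias(w_q, zx, zw, bias):
    """Fold the input zero point zx into the bias ahead of time (hypothetical helper)."""
    return bias - zx * sum(w - zw for w in w_q)


def mac_folded(x_q, w_q, zw, adj_bias):
    """Only the kernel coefficients are adjusted in the loop; inputs are used raw."""
    acc = adj_bias
    for x, w in zip(x_q, w_q):
        acc += x * (w - zw)
    return acc


# Illustrative quantized values (assumed, not from the source).
x_q = [12, 200, 37, 90]
w_q = [5, 130, 250, 64]
zx, zw, bias = 128, 120, 1000

b_adj = adjusted_bias(w_q, zx, zw, bias)
assert mac_folded(x_q, w_q, zw, b_adj) == mac_naive(x_q, w_q, zx, zw, bias)
```

The folded form removes one subtraction per input element from the inner loop, at the cost of one extra pass over the kernel when the bias is prepared.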
-
Publication No.: US20240319998A1
Publication date: 2024-09-26
Application No.: US18187465
Filing date: 2023-03-21
Applicant: International Business Machines Corporation
Inventor: Satish Kumar SADASIVAM, Biplob MISHRA, Puneeth A.H. BHAT
CPC classification number: G06F9/5027, G06F7/50, G06F7/523, G06F7/5443
Abstract: Systems and methods are disclosed for implementing an enhanced Matrix Math Assist (MMA) accelerator that accelerates additional Matrix Math operations of Matrix-Vector multiply and other Multiply-Add Compute operations. The Matrix Math Assist (MMA) accelerator can accelerate operations for mixed Matrix-Matrix, Matrix-Matrix and Matrix-Vector compute patterns. The MMA accelerator is an on-chip MMA accelerator built into a processor core with a set of defined registers and predefined instructions.
-
Publication No.: US20240296012A1
Publication date: 2024-09-05
Application No.: US18588205
Filing date: 2024-02-27
Applicant: Robert Bosch GmbH
Inventor: Andre Guntoro, Michael Beyer
CPC classification number: G06F7/5443, G06F7/50, G06F7/5306
Abstract: A multiply-accumulate circuit for processing numerical values that are present as input words, each of which is formed from at least two partial words. The circuit is configured, corresponding to a permutation selected from a plurality of permutation possibilities implemented by the multiply-accumulate circuit, to form product partial words as products of in each case one partial word of the first input word with one partial word of the second input word, wherein in the products, the partial words of the first input word are permutated relative to their original order corresponding to the selected permutation; and to add the product partial words with an accumulation word, which is formed from one or more partial words, to determine an updated accumulation word in which product partial words are in each case added to one of the one or more partial words of the accumulation word.
-
Publication No.: US20240264802A1
Publication date: 2024-08-08
Application No.: US18638441
Filing date: 2024-04-17
Applicant: MOFFETT INTERNATIONAL CO., LIMITED
Inventor: Xiaoqian ZHANG, Zhibin XIAO, Changxu ZHANG, Renjie CHEN
CPC classification number: G06F7/5443, G06F7/50
Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more input activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
-
Publication No.: US20240235577A1
Publication date: 2024-07-11
Application No.: US18618306
Filing date: 2024-03-27
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Shifeng ZHANG, Ning KANG, Tom RYDER, Zhenguo LI
CPC classification number: H03M7/6011, G06F7/50, G06F7/523, G06F7/72, H03M7/6005
Abstract: This application relates to the field of artificial intelligence, and discloses a data encoding method, a data decoding method, and data processing apparatuses. Both the data encoding method and the data decoding method relate to an invertible flow-based model. The invertible flow-based model includes a target invertible flow layer, a model parameter of the target invertible flow layer is used to constrain an auxiliary variable generated in an inverse transform processing process, an operation corresponding to the target invertible flow layer includes a multiplication operation and a division operation that are determined based on the model parameter, and the auxiliary variable is an increment of a product of the multiplication operation or a remainder generated through the division operation.
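One way a multiply-then-divide step can be made exactly invertible on integers is sketched below, with the remainder playing the role of an auxiliary variable. This is an illustrative assumption, not the claimed method: the function names, the parameters `a` and `b` (positive integers standing in for values derived from a model parameter), and the carried value `r` are hypothetical.

```python
def forward(x, r, a, b):
    """Multiply by a, add the carried auxiliary value r (0 <= r < a), divide by b.

    Returns the transformed value and the division remainder, which becomes
    the new auxiliary value (0 <= remainder < b).
    """
    t = a * x + r          # r is the increment added to the product
    return t // b, t % b   # quotient is the output, remainder is kept


def inverse(y, r_out, a, b):
    """Undo forward(): rebuild t from quotient and remainder, then divide by a."""
    t = b * y + r_out
    return t // a, t % a   # recovers the original x and the original r


# Round trip with illustrative parameters (assumed, not from the source).
a, b = 3, 7
y, r_out = forward(10, 2, a, b)
assert inverse(y, r_out, a, b) == (10, 2)
```

Because the remainder is retained rather than discarded, no information is lost to rounding, which is what allows the inverse to reproduce the input exactly.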
-
Publication No.: US20240231756A9
Publication date: 2024-07-11
Application No.: US18278451
Filing date: 2022-02-24
Applicant: SEMICONDUCTOR ENERGY LABORATORY CO., LTD.
Inventor: Yoshiyuki KUROKAWA, Hiromichi GODO, Kazuki TSUDA, Satoru OHSHITA, Hidefumi RIKIMARU
IPC: G06F7/523, G06F7/50, G09G3/3208, G11C11/405, H10B12/00, H10K59/121
CPC classification number: G06F7/523, G06F7/50, G09G3/3208, G11C11/405, H10B12/00, H10K59/1213, H10K59/1216
Abstract: A semiconductor device with a novel structure is provided. The semiconductor device includes a cell array performing a product-sum operation of a first layer and a product-sum operation of a second layer in an artificial neural network, a first circuit from which first data is input to the cell array, and a second circuit to which second data is output from the cell array. The cell array includes a plurality of cells. The cell array includes a first region and a second region. In a first period, the first region is supplied with the t-th (t is a natural number greater than or equal to 2) first data from the first circuit and outputs the t-th second data according to the product-sum operation of the first layer to the second circuit. In the first period, the second region is supplied with the (t+1)-th first data from the first circuit and outputs the (t+1)-th second data according to the product-sum operation of the second layer to the first circuit.
-
Publication No.: US12020001B2
Publication date: 2024-06-25
Application No.: US18130311
Filing date: 2023-04-03
Applicant: MOFFETT INTERNATIONAL CO., LIMITED
Inventor: Xiaoqian Zhang, Zhibin Xiao, Changxu Zhang, Renjie Chen
CPC classification number: G06F7/5443, G06F7/50
Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more input activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
-
Publication No.: US20240184524A1
Publication date: 2024-06-06
Application No.: US18075366
Filing date: 2022-12-05
Applicant: International Business Machines Corporation
Inventor: Sudipto Chakraborty, Rajiv Joshi
Abstract: An apparatus includes a current-mode multiply-accumulate (MAC) core with a plurality of parallel current carrying paths. Each path is configured to carry a unit current based on a state of an input variable, a weight, and a configuration vector. The plurality of current carrying paths are arranged in groups, and each group has a summation line. Also included are a plurality of current mode interfaces. Each current mode interface of the plurality of current mode interfaces is coupled to a corresponding summation line of the plurality of summation lines. A plurality of current mode comparators are coupled to the plurality of current mode interfaces and configured to compare current on the corresponding one of the plurality of summation lines to a plurality of corresponding reference currents.
-
Publication No.: US20240168713A1
Publication date: 2024-05-23
Application No.: US18485550
Filing date: 2023-10-12
Applicant: Infineon Technologies AG
Inventor: Erich Wenger
CPC classification number: G06F7/523, G06F7/50, G06F7/5443
Abstract: A processing circuit including a first multiplier to multiply least significant portions of a first and a second operand, a second multiplier to multiply a sum of a most and the least significant portion of the first operand with the sum of a most and the least significant portion of the second operand and the least significant portion of the second operand, a third multiplier to multiply the most significant portions of the first and the second operand and an output circuit to determine an output sum including the result of the first multiplier, the result of the third multiplier times two to the power of two times the bit number of the least significant portions, and, if enabled, the result of the second multiplier minus the results of the first and the third multiplier, times two to the power of the bit number of the least significant portions.
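The three-multiplier arrangement in this abstract resembles a single Karatsuba step, which the following sketch models in software. It is a reading of the abstract under stated assumptions, not the patented circuit: the function name, the parameter `n` (bit width of the least significant portions), the `enable_cross` flag standing in for "if enabled", and the restriction to non-negative operands are all hypothetical.

```python
def three_multiplier_product(a, b, n, enable_cross=True):
    """Combine three partial products of non-negative integers a and b.

    n is the assumed bit width of the least significant portions.
    """
    mask = (1 << n) - 1
    a_lo, a_hi = a & mask, a >> n
    b_lo, b_hi = b & mask, b >> n

    p1 = a_lo * b_lo                    # first multiplier: low portions
    p2 = (a_hi + a_lo) * (b_hi + b_lo)  # second multiplier: sums of portions
    p3 = a_hi * b_hi                    # third multiplier: high portions

    out = p1 + (p3 << (2 * n))          # p3 times 2^(2n)
    if enable_cross:
        out += (p2 - p1 - p3) << n      # cross term times 2^n
    return out


# With the cross term enabled, the result equals the full product.
assert three_multiplier_product(0xABCD, 0x1234, 8) == 0xABCD * 0x1234
```

With the cross term disabled, the same hardware would yield only the two independent products `a_lo * b_lo` and `a_hi * b_hi` placed in separate bit ranges.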
-
Publication No.: US20240152486A1
Publication date: 2024-05-09
Application No.: US18415958
Filing date: 2024-01-18
Applicant: Cornami, Inc.
Inventor: Paul L. Master, Steven K. Knapp, Raymond J. Andraka, Alexei Beliaev, Martin A. Franz, Rene Meessen, Frederick Curtis Furtek
IPC: G06F15/80, G06F5/01, G06F7/487, G06F7/50, G06F7/52, G06F7/523, G06F7/544, G06F9/30, G06F9/38, G06F9/48, G06F9/54, H03K19/21
CPC classification number: G06F15/80, G06F5/01, G06F7/487, G06F7/50, G06F7/52, G06F7/523, G06F7/5443, G06F9/30098, G06F9/3856, G06F9/4881, G06F9/54, H03K19/21, G06F2207/382
Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.