-
Publication number: US3328768A
Publication date: 1967-06-27
Application number: US35738764
Filing date: 1964-04-06
Applicant: IBM
Inventor: AMDAHL GENE M , CARNEVALE RICHARD J , COLLINS ARTHUR F , MARSH ELLIOTT R , VILLANTE ANTHONY E
IPC: G06F3/00 , G06F3/12 , G06F7/50 , G06F9/22 , G06F9/26 , G06F9/32 , G06F11/16 , G06F12/14 , G06F13/10 , G06F13/22 , G06F13/26
CPC classification number: G06F7/50 , G06F3/00 , G06F3/1295 , G06F9/226 , G06F9/26 , G06F9/30058 , G06F9/30094 , G06F11/10 , G06F12/1466 , G06F13/10 , G06F13/22 , G06F13/26
-
Publication number: US3189735A
Publication date: 1965-06-15
Application number: US10073561
Filing date: 1961-04-04
Applicant: NCR CO
Inventor: GUNDERSON ROBERT O , TANG TOM T
CPC classification number: G06F7/502 , G06F7/494 , G06F7/50 , G06F2207/4924
-
Publication number: US2658670A
Publication date: 1953-11-10
Application number: US11335849
Filing date: 1949-08-31
Applicant: RCA CORP
Inventor: MORTON GEORGE A , FLORY LESLIE E , SNYDER JR RICHARD L
-
Publication number: US20240361984A1
Publication date: 2024-10-31
Application number: US18325399
Filing date: 2023-05-30
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Sandesh Kanchodu
CPC classification number: G06F7/5318 , G06F7/50 , G06F7/5312 , G06F7/5443
Abstract: Embodiments herein disclose high performance modulo multiplication methods performed by circuitry of an electronic device. The method includes obtaining and summing partial products to obtain a partial multiplication result using a primary Wallace tree. The partial multiplication result is fed back in a next cycle for subsequent limb multiplication associated with the primary Wallace tree. The obtaining and summing of partial products and feeding back operations are repeated until all limbs associated with the primary Wallace tree are completed. A residual computation of a partial multiplication result associated with a final limb of the primary Wallace tree is then performed, to obtain a multiplication result using a secondary Wallace tree, where the final limb stores the partial multiplication result of a last iteration.
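The limb-by-limb accumulation described in this abstract can be pictured in software as below. This is only a minimal sketch under stated assumptions, not the patented circuit: plain integer addition stands in for the primary Wallace tree, Python's `%` operator stands in for the residual computation, and the names `split_limbs`, `limbwise_modmul`, and `limb_bits` are hypothetical.

```python
def split_limbs(x, limb_bits):
    """Split a non-negative integer into limbs, least significant first."""
    mask = (1 << limb_bits) - 1
    limbs = []
    while True:
        limbs.append(x & mask)
        x >>= limb_bits
        if x == 0:
            break
    return limbs


def limbwise_modmul(a, b, m, limb_bits=16):
    """Compute (a * b) mod m by processing one limb of b per 'cycle'.

    Assumption: integer addition models the summing of partial products,
    and the final `%` models the residual (reduction) step.
    """
    acc = 0  # partial multiplication result, fed back every cycle
    for i, limb in enumerate(split_limbs(b, limb_bits)):
        partial = (a * limb) << (i * limb_bits)  # partial products for this limb
        acc = acc + partial                      # feedback of the previous result
    return acc % m                               # residual computation at the end
```

For example, `limbwise_modmul(a, b, m)` should agree with `(a * b) % m` for any non-negative `a` and `b` and positive `m`; the limb width only changes how many iterations the loop takes.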
-
Publication number: US20240329930A1
Publication date: 2024-10-03
Application number: US18425533
Filing date: 2024-01-29
Inventor: Sai Manoj Pudukotai Dinakarrao , Houman Homayoun , Sathwika Bavikadi
Abstract: A processing-in-memory (PIM) system includes a plurality of PIM clusters interconnected by a router in one or more dynamic random-access memory (DRAM) banks. The PIM clusters include one or more multiply and accumulate (MAC) processing elements including a plurality of MAC lookup table cores operatively configured to perform arithmetic logic, and one or more special function (SF) processing elements, wherein the one or more SF processing elements include a plurality of SF lookup table cores operatively configured to perform one or more machine learning activation functions. The MAC lookup tables include a first arithmetic logic unit (ALU) lookup table core type operatively configured to perform addition or multiplication operations, and a second ALU lookup table core type operatively configured to simultaneously perform both addition and multiplication operations. The MAC lookup table cores and SF lookup table cores are configured to perform convolutional neural network acceleration.
-
Publication number: US12106206B2
Publication date: 2024-10-01
Application number: US17148432
Filing date: 2021-01-13
Applicant: Apple Inc.
Inventor: Christopher L. Mills , Sung Hee Park
CPC classification number: G06N3/063 , G06F7/24 , G06F7/50 , G06F7/523 , G06F7/5443
Abstract: Embodiments relate to a neural engine circuit of a neural network processor circuit that performs a convolution operation on input data in a first mode and a parallel sorting operation on input data in a second mode. The neural engine circuit includes a plurality of operation circuits and an accumulator circuit coupled to the plurality of operation circuits. The plurality of operation circuits receives input data. In the first mode, the plurality of operation circuits performs multiply-add operations of a convolution on the input data using a kernel. In the second mode, the plurality of operation circuits performs a portion of a parallel sorting operation on the input data. In the first mode, the accumulator circuit receives and stores first results of the multiply-add operations. In the second mode, the accumulator circuit receives and stores second results of the parallel sorting operation.
-
Publication number: US12099569B2
Publication date: 2024-09-24
Application number: US18337955
Filing date: 2023-06-20
Applicant: Applied Materials, Inc.
Inventor: Xiaofeng Zhang , She-Hwa Yen
Abstract: A method and circuit for performing vector operations may include, for each sequentially performed operation, operating a switch that corresponds to a current bit-order. Operating the switch may cause a value corresponding to an output of the operation to be stored on a capacitor corresponding to the current bit-order. A time interval during which the switch is operated may be non-uniform with respect to time intervals for other switches, and the time interval may be based at least in part on a settling time of the capacitor. The method may also include performing a bit-order weighted summation of values stored on the plurality of capacitors to generate a result of the vector operation.
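The bit-order weighted summation in this abstract has a simple digital analogue, sketched below. It is an illustration under assumptions rather than the claimed analog circuit: a list slot stands in for each capacitor, the inputs are assumed to be unsigned integers, the non-uniform switch timing is not modeled, and the name `bit_serial_dot` is hypothetical.

```python
def bit_serial_dot(xs, ws, n_bits):
    """Dot product of unsigned integers xs with weights ws, one bit-order at a time."""
    stored = [0] * n_bits  # one slot per bit-order (stands in for a capacitor)
    for b in range(n_bits):
        # Sequential operation for the current bit-order: use only bit b of each input.
        stored[b] = sum(((x >> b) & 1) * w for x, w in zip(xs, ws))
    # Bit-order weighted summation: slot b contributes with weight 2**b.
    return sum(value * (1 << b) for b, value in enumerate(stored))
```

With `xs = [5, 3, 7]`, `ws = [2, -1, 4]`, and `n_bits = 3`, the three stored values are 5, 3, and 6, and the weighted sum 5 + 2·3 + 4·6 = 35 matches the direct dot product.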
-
Publication number: US20240248683A1
Publication date: 2024-07-25
Application number: US18101038
Filing date: 2023-01-24
Applicant: CENTREON CORPORATION
Inventor: Kuo-Tseng TSENG , Parkson WONG , Benjamin OU
Abstract: A mixed-precision multiplication circuit that computes according to a second operand and a first operand is provided. The first operand includes an exponent and a mantissa, and the mixed-precision multiplication circuit includes a subset selector and a mantissa multiplier. The subset selector is configured to store the second operand and receive the exponent. The subset selector selects a subset from a plurality of subsets according to the exponent, with the plurality of subsets representing the second operand. The mantissa multiplier is coupled to the subset selector for receiving a multiplicand associated with the selected subset, and configured to receive the mantissa. The mantissa multiplier generates a product by performing a multiplication according to the multiplicand and the mantissa, and the mixed-precision multiplication circuit outputs a result according to the product.
-
Publication number: US12039433B2
Publication date: 2024-07-16
Application number: US17178563
Filing date: 2021-02-18
Applicant: Ohio University
Inventor: Avinash Karanth , Kyle Shiflett
CPC classification number: G06N3/063 , G06F5/01 , G06F7/50 , G06F7/523 , G06F7/5443 , G06F9/5027 , G06N3/048 , G06N3/067
Abstract: Processing elements for neural network accelerators, and methods of operating the processing elements. Each of a plurality of synapse lanes outputs an electrical signal indicative of a value of a synapse. Each electrical signal is received by a respective optical AND unit including an optical microring resonator that selectively couples an optical signal indicative of the value of an input neuron based at least in part on the received electrical signal. The output of each optical AND unit is provided to either an electrical multiply and accumulate unit, or a respective interferometer of a plurality of interferometers. The interferometers are arranged in series so that optical signals are sequentially summed and shifted by each interferometer. The last interferometer outputs a shifted and accumulated sum of the outputs received from the optical AND units. In either case, the accumulated sum may then be used to generate an output neuron.
-
Publication number: US20240220572A1
Publication date: 2024-07-04
Application number: US18092183
Filing date: 2022-12-30
Applicant: International Business Machines Corporation
Inventor: Shubham Jain , Geoffrey Burr , HsinYu Tsai , Yasuteru Kohda , Milos Stanisavljevic
Abstract: A compute engine is configured to perform self-attention computations by delaying performance of a division operation of a softmax computation, the performance including iteratively computing a first matrix multiplication of a given row vector of a first matrix and each column vector of a second matrix while determining a first scalar element representing a maximum value of the iterative first matrix multiplications; iteratively subtracting a corresponding determined first scalar element from a result of each computed first matrix multiplication and computing an elementwise exponential function based on a result of the subtraction operation to generate a plurality of elements of a given row vector of a fourth matrix; iteratively computing a second matrix multiplication of a given row vector of the fourth matrix and each column vector of a third matrix while summing the given row vectors of the fourth matrix; and computing a row vector of an output matrix.
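The sequence of steps in this abstract can be followed in a few lines of plain Python. The sketch below is one reading of the abstract, not the patented compute engine: it assumes the first, second, and third matrices play the roles of queries, transposed keys, and values, and the names `attention_row`, `k_cols`, and `v_cols` are hypothetical.

```python
import math


def attention_row(q, k_cols, v_cols):
    """One output row of softmax(q @ K) @ V, with the division deferred to the end.

    q      : one row of the first matrix (list of floats)
    k_cols : columns of the second matrix, each as long as q
    v_cols : columns of the third matrix, each as long as k_cols
    """
    # Step 1: dot products with each column, tracking the running maximum.
    scores = []
    row_max = -math.inf
    for col in k_cols:
        s = sum(a * b for a, b in zip(q, col))
        scores.append(s)
        row_max = max(row_max, s)

    # Step 2: subtract the maximum and exponentiate (an unnormalized softmax row).
    exps = [math.exp(s - row_max) for s in scores]

    # Step 3: multiply by each column of the third matrix and sum the exponentials.
    denom = sum(exps)
    unnormalized = [sum(e * v for e, v in zip(exps, col)) for col in v_cols]

    # Step 4: the delayed division, applied once per output element.
    return [u / denom for u in unnormalized]
```

Subtracting the row maximum before exponentiating keeps every exponent at or below zero, and deferring the division means it is applied once per output element rather than once per softmax entry.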