-
Publication No.: US20240329931A1
Publication date: 2024-10-03
Application No.: US18617685
Filing date: 2024-03-27
Abstract: A signal processing method includes: an analysis object data generation step of generating i-th analysis object data based on time-series data of a physical quantity detected by an i-th sensor; a product-sum operation step of generating product-sum operation data of template data including a signal component to be analyzed and M-th analysis object data; and a synchronous-timing detection step of detecting a synchronous timing, which is timing synchronizing with the signal component, based on the product-sum operation data. The template data is shorter than the M-th analysis object data, and a sampling rate of the M-th analysis object data is equal to a sampling rate of the template data.
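The product-sum step described in this abstract can be illustrated with a minimal sketch: a shorter template is slid along a longer signal sampled at the same rate, and one product-sum is computed per offset. Taking the offset with the largest product-sum as the synchronous timing is an assumption made here for illustration; the abstract only says the timing is detected "based on" the product-sum data. The names `product_sum` and `detect_sync_timing` are hypothetical.

```python
def product_sum(template, data):
    """Return one product-sum (sliding dot product) per offset of template in data."""
    n, m = len(data), len(template)
    return [
        sum(template[k] * data[t + k] for k in range(m))
        for t in range(n - m + 1)
    ]


def detect_sync_timing(template, data):
    """Assumed detection rule: pick the offset with the largest product-sum."""
    scores = product_sum(template, data)
    return max(range(len(scores)), key=lambda t: scores[t])


# Illustrative data only: the template pattern is embedded at offset 3.
template = [1.0, -1.0, 1.0]
data = [0.0, 0.1, 0.0, 1.0, -1.0, 1.0, 0.0, 0.1]
sync_offset = detect_sync_timing(template, data)
```

With this toy input the peak falls at the offset where the template pattern was embedded.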
-
Publication No.: US12093809B2
Publication date: 2024-09-17
Application No.: US17184432
Filing date: 2021-02-24
CPC classification: G06N3/063 , G06F7/50 , G06F7/523 , G06F7/5443 , G06F2207/4812
Abstract: An arithmetic device includes N product-sum-operation circuits, a control circuit, and an output circuit. Each product-sum-operation circuit outputs intermediate signals obtained by binarizing a product-sum-operation value obtained by product-sum-operation of M input values of M input signals and M weight values. The control circuit inverts positive/negative of each of the M weight values at a determination-timing when a given time elapses from input-timing. Based on a delay time from the determination-timing to logic finalization of the intermediate signal for each of the N product-sum-operation circuits, the output circuit outputs an output signal representing a winner-product-sum-operation circuit for which the product-sum-operation value having a sign and the largest absolute value is calculated. Each of the N product-sum-operation circuits starts the product-sum-operation from the input-timing and the determination-timing, and outputs an intermediate signal for which a propagation-delay-time from starting of the product-sum-operation to inversion of the logic corresponds to the absolute value of the product-sum-operation value.
-
Publication No.: US12086206B2
Publication date: 2024-09-10
Application No.: US17391045
Filing date: 2021-08-02
Applicant: NEUCHIPS CORPORATION
Inventors: Jian-Wen Chen , YuShan Ruan , Chih-Wei Chang , Youn-Long Lin
CPC classification: G06F17/16 , G06F7/50 , G06F7/523 , G06F7/5443 , G06N3/063
Abstract: A matrix multiplier and an operation method thereof are provided. The matrix multiplier includes a plurality of first input lines, a plurality of second input lines and a computing array. The computing array includes a plurality of multiplication accumulation (MAC) cells. A first MAC cell of the plurality of MAC cells is coupled to a first corresponding input line of the plurality of first input lines and a second corresponding input line of the plurality of second input lines to receive a first input value and a second input value to perform a multiplication accumulation operation. When at least one of the first input value and the second input value is a specified value, the multiplication accumulation operation of the first MAC cell is disabled.
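The disabling condition in this abstract can be modeled in software as a MAC cell that skips its multiply-accumulate when either operand equals a specified value. The sketch below assumes that value is zero, which the abstract does not state; the class `MacCell` and its `skipped` counter are hypothetical names used only for illustration.

```python
class MacCell:
    """Toy model of one multiply-accumulate cell with operand-based skipping."""

    def __init__(self, skip_value=0):
        self.acc = 0                  # running accumulation
        self.skip_value = skip_value  # assumed "specified value" (zero here)
        self.skipped = 0              # number of disabled operations

    def step(self, a, b):
        """Accumulate a * b unless either input equals the skip value."""
        if a == self.skip_value or b == self.skip_value:
            self.skipped += 1         # operation disabled, accumulator unchanged
            return self.acc
        self.acc += a * b
        return self.acc


cell = MacCell()
for a, b in [(2, 3), (0, 5), (4, 0), (1, 7)]:
    cell.step(a, b)
```

When the skip value is zero the accumulated result is unchanged by skipping, since the skipped products would have been zero anyway; the hardware benefit would come from the avoided switching activity.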
-
Publication No.: US20240296010A1
Publication date: 2024-09-05
Application No.: US18591349
Filing date: 2024-02-29
Applicant: Graphcore Limited
Inventor: Thomas BROWN
CPC classification: G06F7/49915 , G06F7/523 , G06F7/556
Abstract: A processing unit is provided with circuitry enabling quick evaluation of an exponential function. A multiplier circuit is used to multiply the input operand by log2(e), such that a result for the exponential function may be determined by evaluating 2^(i+f), where i is an integer part of a fixed-point number and f is a fractional part of the fixed-point number. A lookup table is used for providing an estimate for 2^f based on the l MSBs of f. The lookup entries are provided according to a function such that the estimates for 2^f are provided without bias towards either zero or infinity in the result. In other words, the maximum multiplicative error for each entry of the lookup table is the same in both negative and positive directions. In this way, statistical errors in the evaluation of a large number of exponential functions may be avoided.
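A rough software sketch of the scheme in this abstract: multiply x by log2(e), split the product into an integer part i and a fractional part f, look up an estimate of 2^f from the top bits of f, and scale by 2^i. The table function used here, the geometric midpoint of each bin, is an assumption chosen so that the worst-case multiplicative error is equal in both directions; the abstract does not give the actual function. `L_BITS`, `LUT`, and `fast_exp` are hypothetical names, and floating point stands in for the fixed-point hardware format.

```python
import math

L_BITS = 6                 # assumed number of MSBs of f used as the table index
N = 1 << L_BITS
# Assumed entry function: geometric midpoint of each bin [k/N, (k+1)/N).
LUT = [2.0 ** ((k + 0.5) / N) for k in range(N)]


def fast_exp(x):
    """Approximate e**x as 2**i * LUT[top bits of f], where i + f = x * log2(e)."""
    y = x * math.log2(math.e)
    i = math.floor(y)            # integer part
    f = y - i                    # fractional part, nominally in [0, 1)
    k = min(int(f * N), N - 1)   # index from the top L_BITS bits of f
    return math.ldexp(LUT[k], i)  # LUT[k] * 2**i
```

With the midpoint entries, the ratio of the estimate to the true value stays within a factor of 2^(0.5/N) above or below 1, so over many evaluations the errors are not systematically skewed in one direction.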
-
Publication No.: US20240289092A1
Publication date: 2024-08-29
Application No.: US17620583
Filing date: 2020-10-13
Inventors: Yao ZHANG , Shaoli LIU
IPC classification: G06F7/523
CPC classification: G06F7/523
Abstract: The present disclosure relates to a multiplier, a method, an integrated circuit chip, and a computation apparatus for a floating-point computation. The computation apparatus is included in a combined processing apparatus. The combined processing apparatus further includes a universal interconnection interface and other processing apparatus. The computation apparatus interacts with other processing apparatus to jointly complete computing operations specified by the user. The combined processing apparatus also includes a storage apparatus. The storage apparatus is respectively connected to the computation apparatus and other processing apparatus and is used for storing data of the computation apparatus and other processing apparatus. Solutions of the present disclosure are widely used in various floating-point data computations.
-
Publication No.: US12072952B2
Publication date: 2024-08-27
Application No.: US17214779
Filing date: 2021-03-26
IPC classification: G06F17/16 , G06F7/523 , G06F7/544 , H03K19/173
CPC classification: G06F17/16 , G06F7/523 , G06F7/5443 , H03K19/1737
Abstract: A processing device is provided which comprises memory configured to store data and a processor. The processor comprises a plurality of MACs configured to perform matrix multiplication of elements of a first matrix and elements of a second matrix. The processor also comprises a plurality of logic devices configured to sum values of bits of product exponent values of the elements of the first matrix and second matrix and determine keep bit values for product exponent values to be kept for matrix multiplication. The processor also comprises a plurality of multiplexor arrays each configured to receive bits of the elements of the first matrix and the second matrix and the keep bit values and provide data for selecting which elements of the first matrix and the second matrix values are provided to the MACs for matrix multiplication.
-
Publication No.: US12056530B2
Publication date: 2024-08-06
Application No.: US18014635
Filing date: 2020-11-03
Inventors: Bing Xu , Nangeng Zhang
CPC classification: G06F9/5027 , G06F7/50 , G06F7/523
Abstract: A dilated convolution acceleration calculation method and apparatus. The method comprises: decomposing a dilated convolution computation of R×S into S sub-dilated convolution computations of R×1 (301); for each sub-dilated convolution computation, caching a plurality of weight values in parallel to a plurality of calculation units in a calculation unit array (302); determining, from input image data, a plurality of input data streams respectively corresponding to the plurality of weight values, and inputting the plurality of input data streams in parallel into the plurality of calculation units (303); within the plurality of calculation units, executing a sliding window operation and a multiplication operation on the basis of the cached weight values and the input data streams, and executing an accumulation operation between the plurality of calculation units, so as to output an intermediate result of the sub-dilated convolution computation (304); and superimposing intermediate results of the S sub-dilated convolution computations of R×1, so as to obtain a convolution result of the dilated convolution computation (305). By using the method, a dilated convolution operation is accelerated with a relatively low complexity, and the function of Im2col does not need to be separately realized, thereby reducing the complexity.
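The decomposition in steps (301) and (305) can be checked numerically with a small sketch: an R×S dilated convolution is computed once directly and once as the sum of S sub-convolutions, each using one R×1 column of the kernel. This models only the arithmetic identity, not the calculation unit array, weight caching, or data streaming of steps (302) to (304). Stride 1, no padding, and a single channel are assumptions; both function names are hypothetical.

```python
def dilated_conv2d(image, kernel, d):
    """Direct R x S dilated convolution (cross-correlation form), dilation d."""
    R, S = len(kernel), len(kernel[0])
    out_h = len(image) - (R - 1) * d
    out_w = len(image[0]) - (S - 1) * d
    return [
        [
            sum(kernel[r][s] * image[y + r * d][x + s * d]
                for r in range(R) for s in range(S))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def dilated_conv2d_by_columns(image, kernel, d):
    """Same result built from S sub-convolutions, each with one R x 1 column."""
    R, S = len(kernel), len(kernel[0])
    out_h = len(image) - (R - 1) * d
    out_w = len(image[0]) - (S - 1) * d
    out = [[0] * out_w for _ in range(out_h)]
    for s in range(S):
        column = [kernel[r][s] for r in range(R)]  # R x 1 sub-kernel
        for y in range(out_h):
            for x in range(out_w):
                # Intermediate result of this sub-convolution, superimposed on out.
                out[y][x] += sum(column[r] * image[y + r * d][x + s * d]
                                 for r in range(R))
    return out


# Illustrative integer data so the two results can be compared exactly.
image = [[(3 * y + 5 * x) % 7 for x in range(6)] for y in range(5)]
kernel = [[1, -2, 3], [0, 4, -1]]
direct = dilated_conv2d(image, kernel, 2)
decomposed = dilated_conv2d_by_columns(image, kernel, 2)
```

The two outputs agree element by element, because summing over the kernel columns one at a time only reorders the terms of the original double sum.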
-
Publication No.: US20240232286A9
Publication date: 2024-07-11
Application No.: US18076407
Filing date: 2022-12-07
Applicant: NEUCHIPS CORPORATION
Inventors: Chiung-Liang Lin , YuShan Ruan , Huan Jan Chou
CPC classification: G06F17/16 , G06F7/50 , G06F7/523 , G06F7/5443 , G06F7/78
Abstract: A matrix computing device and an operation method for the matrix computing device are provided. The matrix computing device includes a storage unit, a control circuit, and a computing circuit. The storage unit includes a weight matrix. The control circuit re-orders an arrangement order of weights in the weight matrix according to a shape of an output matrix to determine a weight readout order of the weights. The computing circuit receives the weights based on the weight readout order, and performs a matrix computation on the weights and an input matrix to generate a computing matrix. The control circuit performs a reshape transformation on the computing matrix to generate the output matrix, and writes the output matrix to the storage unit.
-
Publication No.: US20240231758A1
Publication date: 2024-07-11
Application No.: US18417868
Filing date: 2024-01-19
Applicant: Apple Inc.
Inventors: Shahzad Nazar , Bharan Giridhar , Mohamed H. Abu-Rahma , Ajay Bhatia , Mayur V. Joshi , Yildiz Sinangil , Aravind Kandala
CPC classification: G06F7/5443 , G06F7/523 , G06F17/15 , H03M1/46 , G06N20/00
Abstract: A compute-memory circuit included in a computer system includes multiple data storage cells and multiplier circuits. The data storage cells store weight values associated with a first operand. The multiplier circuits are coupled to a global bit line and receive the weight values via local bit lines coupled to the data storage cells. Using the received weight values and activation signals indicative of a second operand, the multiplier circuits modify a voltage level of the global bit line. The resultant voltage level on the global bit line is indicative of a product of the first and second operands, and can be converted to a digital value using an analog-to-digital converter circuit. By performing computation on global rather than local bit lines, standard data storage cells can be employed, improving the area efficiency of the compute-memory circuit.
-
Publication No.: US12032926B2
Publication date: 2024-07-09
Application No.: US17334887
Filing date: 2021-05-31
Inventors: Martin Kraemer , Ryan Boesch , Wei Xiong
CPC classification: G06F7/5443 , G06F7/523 , H03K19/21 , H03M1/462
Abstract: An architecture for a chopper stabilized multiplier-accumulator (MAC) uses a chop clock and common Unit Element (UE), the MAC formed as a plurality of MAC UEs receiving X and W values and a sign bit exclusive ORed with the chop clock, a plurality of Bias UEs receiving an E value and a sign bit exclusive ORed with the chop clock, and a plurality of Analog to Digital Conversion (ADC) UEs which collectively perform a scalable MAC operation and generate a binary result. Each MAC UE, Bias UE and ADC UE comprises groups of NAND gates with complementary outputs arranged in NAND-groups, each NAND gate coupled to a differential charge transfer bus through a binary weighted charge transfer capacitor. The analog charge transfer bus is coupled to groups of ADC UEs with an ADC controller which enables and disables the ADC UEs using successive approximation to determine the accumulated multiplication result.
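The successive approximation mentioned at the end of this abstract is, in its generic textbook form, a bit-by-bit binary search from the most significant bit down. The sketch below shows only that generic search on an idealized single-ended input; it does not model the chop clock, the NAND-based unit elements, or the differential charge transfer bus. The function `sar_convert` and its parameters are hypothetical.

```python
def sar_convert(v_in, v_ref, bits):
    """Generic successive-approximation search: returns the largest code whose
    ideal threshold code * v_ref / 2**bits does not exceed v_in."""
    code = 0
    for b in reversed(range(bits)):          # most significant bit first
        trial = code | (1 << b)              # tentatively enable this bit
        if v_in >= trial * v_ref / (1 << bits):
            code = trial                     # keep the bit
        # otherwise the bit stays disabled and the search moves on
    return code


result = sar_convert(0.6, 1.0, 4)
```

Each iteration resolves one output bit, so an n-bit result takes n comparisons.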