-
Publication No.: US20190212983A1
Publication date: 2019-07-11
Application No.: US16299337
Filing date: 2019-03-12
CPC classification: G06F7/4915, G06F7/4876, G06F7/5443
Abstract: A method to produce a final product from a multiplicand and a multiplier is provided. The method is executed by a parallel decimal multiplication hardware architecture, which includes a 3× generator, at least one additional generator, a multiplier recoder, a partial product tree, and a decimal adder. The 3× generator, the at least one additional generator, and the multiplier recoder generate decimal partial products from the multiplicand and the multiplier. The partial product tree executes a reduction of the decimal partial products to produce two corresponding partial product accumulations. The decimal adder adds the two corresponding partial product accumulations of the decimal partial products to produce the final product.
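The partial-product flow in this abstract can be illustrated in software. The sketch below is only a rough model: the abstract does not state the recoding scheme, so the signed-digit recoding into the range -4..5 is an assumption, as are the function names `recode_signed_digits` and `decimal_multiply`. The carry-save reduction into two accumulations is collapsed into a plain sum.

```python
def recode_signed_digits(multiplier: int) -> list[int]:
    """Recode a non-negative decimal multiplier into signed digits in -4..5.

    Assumed scheme: any digit above 5 becomes (digit - 10) with a carry into
    the next position, so only multiples 1x..5x of the multiplicand are needed.
    """
    digits = []
    carry = 0
    while multiplier > 0 or carry:
        d = multiplier % 10 + carry
        multiplier //= 10
        if d > 5:
            d -= 10
            carry = 1
        else:
            carry = 0
        digits.append(d)
    return digits or [0]


def decimal_multiply(multiplicand: int, multiplier: int) -> int:
    """Multiply by summing one signed, shifted multiple per recoded digit."""
    # Stand-in for the multiple generators (3x is the one named in the abstract).
    multiples = {k: k * multiplicand for k in range(6)}
    partial_products = []
    for position, d in enumerate(recode_signed_digits(multiplier)):
        multiple = multiples[abs(d)]
        signed = multiple if d >= 0 else -multiple
        partial_products.append(signed * 10 ** position)
    # Hardware would reduce these in a tree to two accumulations and then add;
    # here a plain sum stands in for both steps.
    return sum(partial_products)
```

For example, `recode_signed_digits(6789)` yields the least-significant-first digits `[-1, -1, -2, -3, 1]`, which still represent 6789.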
-
Publication No.: US20190205094A1
Publication date: 2019-07-04
Application No.: US15857998
Filing date: 2017-12-29
Applicant: Facebook, Inc.
IPC classification: G06F7/523
CPC classification: G06F7/523, G06F7/5443, G06F2207/382, G06N3/0481, G06N3/063, G06N3/08, G06N5/022
Abstract: The disclosed method may include (1) receiving a precision level of each weight associated with each input of a node of a computational model, (2) identifying, for each weight, one of a plurality of multiplier groups, where each multiplier group may include a plurality of hardware multipliers of a corresponding bit width, and where the corresponding bit width of the plurality of hardware multipliers of the one of the plurality of multiplier groups may be sufficient to multiply the weight by the associated input, and (3) multiplying each weight by its associated input using an available hardware multiplier of the one of the plurality of multiplier groups identified for the weight. Various other processing elements, methods, and systems are also disclosed.
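The three steps in this abstract can be mimicked in a few lines. This is a minimal sketch under stated assumptions: the group widths `(4, 8, 16)`, the rule of picking the narrowest sufficient group, and the names `pick_group` and `multiply_node_inputs` are all illustrative and do not come from the patent. Scheduling onto an "available" multiplier within a group is not modeled.

```python
def pick_group(precision_bits: int, group_widths: tuple[int, ...]) -> int:
    """Return the narrowest multiplier width that can hold the weight."""
    for width in sorted(group_widths):
        if width >= precision_bits:
            return width
    raise ValueError(f"no multiplier group is wide enough for {precision_bits} bits")


def multiply_node_inputs(weights, precisions, inputs, group_widths=(4, 8, 16)):
    """Multiply each weight by its input and record which group was chosen."""
    results = []
    for weight, bits, value in zip(weights, precisions, inputs):
        width = pick_group(bits, group_widths)  # step (2): identify a group
        results.append((width, weight * value))  # step (3): multiply
    return results
```

Calling `multiply_node_inputs([3, 100, 1000], [2, 7, 10], [2, 2, 2])` assigns the three weights to the 4-bit, 8-bit, and 16-bit groups respectively.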
-
Publication No.: US20190026076A1
Publication date: 2019-01-24
Application No.: US16039221
Filing date: 2018-07-18
Inventors: Cong Leng, Hao Li, Zesheng Dou, Shenghuo ZHU, Rong JIN
CPC classification: G06F5/01, G06F5/08, G06F7/5443, G06F7/556, G06F15/78, G06F2207/4824, G06N3/02, G06N3/063, G06N3/08
Abstract: A method including receiving, by a processor, a computing instruction for a neural network, wherein the computing instruction for the neural network includes a computing rule for the neural network and a connection weight of the neural network, and the connection weight is a power of 2; and inputting, for a multiplication operation in the computing rule for the neural network, a source operand corresponding to the multiplication operation to a shift register, and performing a shift operation based on a connection weight corresponding to the multiplication operation, wherein the shift register outputs a target result operand as a result of the multiplication operation. The neural network uses a shift operation, and a neural network computing speed is increased.
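The shift-for-multiply idea in this abstract is easy to show on integers. The sketch below assumes integer operands and stores each weight as its exponent, so a weight of 2^k is passed as `k`. The names `shift_multiply` and `neuron_output` are illustrative, and weight signs are not modeled.

```python
def shift_multiply(operand: int, exponent: int) -> int:
    """Multiply an integer by 2**exponent using only shifts.

    A negative exponent becomes a right shift, which rounds toward
    negative infinity for Python integers.
    """
    if exponent >= 0:
        return operand << exponent
    return operand >> -exponent


def neuron_output(inputs: list[int], exponents: list[int]) -> int:
    """Weighted sum where every weight is a power of two."""
    return sum(shift_multiply(x, e) for x, e in zip(inputs, exponents))
```

For instance, `shift_multiply(5, 3)` gives 40, the same as 5 × 8, without invoking a multiplier.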
-
Publication No.: US20190012296A1
Publication date: 2019-01-10
Application No.: US16017813
Filing date: 2018-06-25
Inventors: Pei-Wen HSIEH, Chen-Chu HSU, Tsung-Liang CHEN
CPC classification: G06F17/16, G06F5/01, G06F7/5443, G06N3/04, G06N3/0445, G06N3/0454, G06N3/063, H03M7/30, H03M7/3082, H03M7/6023
Abstract: A method for matrix by vector multiplication, applied in an artificial neural network system, is disclosed. The method comprises: compressing a plurality of weight values in a weight matrix and indices of an input vector into a compressed main stream; storing M sets of synapse values in M memory devices; and, performing reading and MAC operations according to the M sets of synapse values and the compressed main stream to obtain a number M of output vectors. The step of compressing comprises: dividing the weight matrix into a plurality of N×L blocks; converting entries of a target block and corresponding indices of the input vector into a working block and an index matrix; removing zero entries in the working block; shifting non-zero entries row-by-row to one of their left and right sides in the working block; and, respectively shifting corresponding entries in the index matrix.
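The compression step in this abstract (drop zeros, shift the survivors to one side, keep a matching index matrix) can be sketched for a single block. This is a simplified illustration: it shifts to the left only, handles one block rather than a stream of N×L blocks, and ignores the M memory devices. The names `compress_block` and `mac_rows` are assumptions.

```python
def compress_block(block: list[list[int]], column_indices: list[int]):
    """Remove zero weights from each row and left-pack what remains.

    Returns a working block of non-zero weights and an index matrix holding,
    for each kept weight, the input-vector index it must be multiplied with.
    """
    working, index = [], []
    for row in block:
        kept = [(w, j) for w, j in zip(row, column_indices) if w != 0]
        working.append([w for w, _ in kept])
        index.append([j for _, j in kept])
    return working, index


def mac_rows(working, index, vector):
    """Multiply-accumulate each compressed row against the input vector."""
    return [
        sum(w * vector[j] for w, j in zip(weights, indices))
        for weights, indices in zip(working, index)
    ]
```

With the block `[[0, 2, 0], [1, 0, 3]]` and column indices `[0, 1, 2]`, the working block becomes `[[2], [1, 3]]` and the index matrix `[[1], [0, 2]]`, so only three multiplications are needed instead of six.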
-
Publication No.: US10073676B2
Publication date: 2018-09-11
Application No.: US15272231
Filing date: 2016-09-21
Applicant: Altera Corporation
Inventor: Martin Langhammer
CPC classification: G06F7/485, G06F7/4876, G06F7/49957, G06F7/5443, G06F9/30014, G06F17/16, G06F2207/382, G06F2207/483
Abstract: The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.
-
Publication No.: US20180203667A1
Publication date: 2018-07-19
Application No.: US15406910
Filing date: 2017-01-16
CPC classification: G06F7/483, G06F5/012, G06F7/5443
Abstract: A floating-point unit, configured to implement a fused-multiply-add operation on three 128 bit wide operands is provided, which includes a 113×113-bit multiplier; a left shifter; a right shifter; a select circuit including a 3-to-2 compressor; an adder connected to the dataflow from the select circuit; a first feedback path connecting a carry output of the adder to the select circuit; a second feedback path connecting the output of the adder to the shifters for passing an intermediate wide result through the shifters.
-
Publication No.: US10019234B2
Publication date: 2018-07-10
Application No.: US14875323
Filing date: 2015-10-05
Applicant: Altera Corporation
IPC classification: G06F7/544
CPC classification: G06F7/5443, G06F2207/3868
Abstract: An integrated circuit may have specialized processing blocks that are configurable to operate as arithmetic operators that may implement, amongst other functions, multiplication and multiply-accumulation operations in a first mode. In a second mode, a sequencer circuit may provide data signals and control signals to the specialized processing blocks such that the specialized processing block operates as a signal processing device that handles signals in a given sequence. For example, the sequencer circuit may control the signal arrival at the specialized processing block and the configuration of the configurable circuitry in the specialized processing block. In certain embodiments, the sequencer circuit and the specialized processing block may implement finite impulse response (FIR) filters.
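Since this abstract names FIR filters as one use of sequenced multiply-accumulate operations, a direct-form FIR filter gives a feel for the workload. The sketch below is a generic textbook formulation, not the patented circuit. The inner loop plays the role the abstract assigns to the sequencer (feeding one sample and one coefficient per step), and the name `fir_filter` is illustrative.

```python
def fir_filter(samples: list[float], taps: list[float]) -> list[float]:
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k].

    Samples before the start of the input are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        accumulator = 0
        for k, coefficient in enumerate(taps):
            if n - k >= 0:
                # One multiply-accumulate step per (sample, coefficient) pair.
                accumulator += coefficient * samples[n - k]
        output.append(accumulator)
    return output
```

A two-tap filter with coefficients `[1, 1]` applied to `[1, 2, 3, 4]` returns `[1, 3, 5, 7]`, each output being the sum of the current and previous sample.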
-
Publication No.: US10019230B2
Publication date: 2018-07-10
Application No.: US14748956
Filing date: 2015-06-24
Inventor: Thomas Elmer
CPC classification: G06F7/483, G06F7/485, G06F7/4876, G06F7/49957, G06F7/5443, G06F9/3001, G06F9/30014, G06F9/3017, G06F9/30185, G06F9/38, G06F9/3893, G06F17/16
Abstract: An arithmetic operation is performed using a first instruction execution unit to generate an intermediate result vector and a plurality of calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The intermediate result vector and the plurality of calculation control indicators are stored in memory external to the instruction execution unit, and later read by a second instruction execution unit to complete the arithmetic operation.
-
Publication No.: US10019227B2
Publication date: 2018-07-10
Application No.: US14547180
Filing date: 2014-11-19
Inventors: Oliver Draese, Michael M. Skubowius, Knut Stolze
CPC classification: G06F7/483, G06F7/485, G06F7/49905, G06F7/5443
Abstract: A method for enhancing an accuracy of a sum of a plurality of floating-point numbers. The method receives a floating-point number and generates a plurality of provisional numbers with a value of zero. The method further generates a surjective map from the values of an exponent and a sign of a mantissa to the provisional numbers in the plurality of provisional numbers. The method further maps a value of the exponent and the sign of the mantissa to a first provisional number with the surjective map. The method further generates a test number from the first provisional number and if the test number exceeds a limit, modifies a second provisional number by using at least part of the test number. The method further equates the first provisional number to the test number if the test number does not exceed the limit. The method further sums the plurality of provisional numbers.
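One loose reading of this abstract is a binned accumulator: values of similar magnitude and the same sign are summed together, so small terms are not swallowed by large ones. The sketch below follows that reading only. The bucket size, the headroom limit, the spill rule (move the whole test number to the next bucket), and the names `bin_key` and `binned_sum` are assumptions rather than details taken from the patent. Inputs are assumed to be finite.

```python
import math

BUCKET_BITS = 8     # assumed: eight consecutive exponents share one bin
HEADROOM_BITS = 20  # assumed: how far a bin may grow before it spills


def bin_key(x: float) -> tuple[int, bool]:
    """Map (exponent, sign) to a bin; many pairs land on the same bin."""
    _, exponent = math.frexp(x)
    return exponent // BUCKET_BITS, x < 0.0


def binned_sum(values) -> float:
    """Sum finite floats through per-magnitude, per-sign provisional bins."""
    bins: dict[tuple[int, bool], float] = {}  # missing bins count as zero
    for x in values:
        if x == 0.0:
            continue
        bucket, negative = bin_key(x)
        test = bins.get((bucket, negative), 0.0) + x  # the "test number"
        # Cap the exponent so math.ldexp cannot overflow for huge inputs.
        limit_exp = min((bucket + 1) * BUCKET_BITS + HEADROOM_BITS, 1023)
        if abs(test) > math.ldexp(1.0, limit_exp):
            # Over the limit: hand the test number to the next bin up.
            upper = (bucket + 1, negative)
            bins[upper] = bins.get(upper, 0.0) + test
            bins[(bucket, negative)] = 0.0
        else:
            bins[(bucket, negative)] = test
    return sum(bins[key] for key in sorted(bins))
```

On the input `[1e16, 1.0, -1e16, 1.0]`, left-to-right addition loses one of the `1.0` terms, while the binned version keeps the two small terms together and returns 2.0.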
-
Publication No.: US20180181406A1
Publication date: 2018-06-28
Application No.: US15782881
Filing date: 2017-10-13
Applicant: FUJITSU LIMITED
Inventor: Masahiro Kuramoto
CPC classification: G06F9/3887, G06F7/5443, G06F9/3001, G06F9/30032, G06F15/8015, G06F17/15, G06N3/063, G06N3/084
Abstract: Each of product-sum arithmetic units 501 to 503 acquires, from a register file 410, different pieces of first element data included in a first predetermined row of first data that forms a matrix; acquires, from a register file 420, same pieces of second element data included in a second predetermined row of second data that forms a matrix; performs a row portion operation that is an operation performed on the first data by an amount corresponding to a single row by performing a process of performing an operation using the acquired first element data and the second element data; and performs an operation by using the first data and the second data based on the result of the row portion operation.
-