-
Publication No.: US11010131B2
Publication Date: 2021-05-18
Application No.: US15704313
Application Date: 2017-09-14
Applicant: Intel Corporation
Inventor: Martin Langhammer , Bogdan Pasca
IPC: G06F7/485 , G06F5/01 , G06F7/487 , H03K19/17736 , H03K19/17724 , G06F7/499 , H03K19/1776
Abstract: An integrated circuit may include a floating-point adder. The adder may be implemented using a dual-path adder architecture having a near path and a far path. The near path may include a leading zero anticipator (LZA), a comparison circuit for comparing an exponent value to an LZA count, and associated circuitry for handling subnormal numbers. The far path may include a subtraction circuit for computing the difference between a received exponent value and a minimum exponent value, at least two shifters for shifting far greater and far lesser mantissa values in parallel, and associated circuitry for handling subnormal numbers. The adder may be dynamically configured to support a first mode that processes FP16 at inputs and outputs, a second mode that processes modified FP16′ inputs, and a third mode that processes FP16′ at inputs and outputs.
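As a rough illustration of why a near path would compare the exponent against the leading-zero count, the Python sketch below normalizes a subtraction result for an FP16-like format. The function names, the 11-bit significand width, and the direct counting of leading zeros (hardware would anticipate the count in parallel with the subtraction) are assumptions made for illustration, not details taken from the patent.

```python
SIG_BITS = 11  # assumed FP16-like significand width, including the hidden bit


def normalize_near_path(sig, exp):
    """Normalize a near-path subtraction result (illustrative sketch only).

    sig: unsigned significand after subtraction, 0 <= sig < 2**SIG_BITS
    exp: biased exponent of the larger operand, exp >= 1
    Returns (sig, exp_field); exp_field == 0 marks a subnormal or zero result.
    """
    if sig == 0:
        return 0, 0
    lz = SIG_BITS - sig.bit_length()  # leading zeros; an LZA would predict this
    if lz < exp:
        # Enough exponent headroom: move the leading one into the hidden-bit slot.
        return sig << lz, exp - lz
    # Exponent too small: shift only down to the minimum exponent (subnormal).
    return sig << (exp - 1), 0


def scaled_value(sig, exp_field):
    """Scaled magnitude used for checking; subnormals use effective exponent 1."""
    return sig * 2 ** max(exp_field, 1)
```

When the leading-zero count is smaller than the exponent, the result stays normal; otherwise the shift is capped and the exponent field becomes zero, which is the subnormal case the abstract refers to.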
-
Publication No.: US20200142671A1
Publication Date: 2020-05-07
Application No.: US16231170
Application Date: 2018-12-21
Applicant: Intel Corporation
Inventor: Bogdan Pasca , Martin Langhammer , Sergey Gribok , Gregg William Baeckler
IPC: G06F7/523 , H03K19/177
Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.
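The idea of packing two small multiplications into one wide multiplier and correcting for overlapping partial products can be sketched in Python. The layout below (second operand placed at bit 12 of each 18-bit input, unsigned 6-bit operands, parity-based carry correction) is an assumption chosen to make the arithmetic easy to follow; it is not the circuit described in the patent.

```python
FIELD = 12    # assumed bit offset of the second operand inside an 18-bit input
MASK6 = 0x3F  # unsigned 6-bit operand range


def two_6x6_in_one_18x18(a1, b1, a2, b2):
    """Return (a1*b1, a2*b2) from a single wide multiplication (illustrative)."""
    for v in (a1, b1, a2, b2):
        if not 0 <= v <= MASK6:
            raise ValueError("operands must be unsigned 6-bit values")
    x = a1 | (a2 << FIELD)  # 18-bit packed operand
    y = b1 | (b2 << FIELD)  # 18-bit packed operand
    p = x * y               # the one 18x18 multiplication
    # a1*b1 fits in 12 bits and every other term starts at bit 12 or higher.
    low = p & 0xFFF
    # Bits 24 and up hold a2*b2 plus at most one carry from the cross terms
    # (a1*b2 + a2*b1) << 12, which can reach bit 24.
    high = p >> 24
    # The true LSB of a2*b2 is a2[0] AND b2[0]; a parity mismatch reveals the carry.
    carry = (high & 1) ^ (a2 & b2 & 1)
    return low, high - carry
```

For example, `two_6x6_in_one_18x18(5, 7, 9, 11)` yields the two products 35 and 99 from one multiplication; the subtraction of `carry` plays the role of the correction step.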
-
Publication No.: US20200026494A1
Publication Date: 2020-01-23
Application No.: US16585857
Application Date: 2019-09-27
Applicant: Intel Corporation
Inventor: Martin Langhammer , Bogdan Pasca , Sergey Gribok , Gregg William Baeckler , Andrei Hagiescu
Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.
-
Publication No.: US20190079728A1
Publication Date: 2019-03-14
Application No.: US15704313
Application Date: 2017-09-14
Applicant: Intel Corporation
Inventor: Martin Langhammer , Bogdan Pasca
IPC: G06F7/485 , H03K19/177 , G06F5/01
Abstract: An integrated circuit may include a floating-point adder. The adder may be implemented using a dual-path adder architecture having a near path and a far path. The near path may include a leading zero anticipator (LZA), a comparison circuit for comparing an exponent value to an LZA count, and associated circuitry for handling subnormal numbers. The far path may include a subtraction circuit for computing the difference between a received exponent value and a minimum exponent value, at least two shifters for shifting far greater and far lesser mantissa values in parallel, and associated circuitry for handling subnormal numbers. The adder may be dynamically configured to support a first mode that processes FP16 at inputs and outputs, a second mode that processes modified FP16′ inputs, and a third mode that processes FP16′ at inputs and outputs.
-
Publication No.: US20190018673A1
Publication Date: 2019-01-17
Application No.: US15842343
Application Date: 2017-12-14
Applicant: Intel Corporation
Inventor: Martin Langhammer , Gregg William Baeckler , Bogdan Pasca
Abstract: Adder trees may be constructed for efficient packing of arithmetic operators into an integrated circuit. The operands of the trees may be truncated to pack an integer number of nodes per logic array block. As a result, arithmetic operations may pack more efficiently onto the integrated circuit while providing increased precision and performance.
-
Publication No.: US20180321910A1
Publication Date: 2018-11-08
Application No.: US15633792
Application Date: 2017-06-27
Applicant: Intel Corporation
Inventor: Martin Langhammer , Bogdan Pasca
Abstract: The present embodiments relate to integrated circuits with circuitry that implements floating-point trigonometric functions. The circuitry may include an approximation circuit that generates an approximation of the output of the trigonometric functions, a storage circuit that stores predetermined output values of the trigonometric functions, and a selector circuit that selects between different possible output values based on a control signal from a control circuit. In some embodiments, the circuitry may include a mapping circuit and a restoration circuit. The mapping circuit may map an input value from an original quadrant of the trigonometric circle to a predetermined input interval, and the restoration circuit may map the output value selected by the selection circuit back to the original quadrant of the trigonometric circle. If desired, the circuitry may be implemented in specialized processing blocks.
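The mapping and restoration steps can be illustrated with a small Python sketch for sine. It covers only the quadrant mapping and the restoration of the result; the stored-value table and selector are omitted, and the Taylor polynomial is a stand-in for whatever approximation the circuitry actually uses. All function names are hypothetical.

```python
import math

HALF_PI = math.pi / 2


def approx_sin_first_quadrant(r):
    """Degree-9 Taylor polynomial for sin(r); a stand-in approximation on [0, pi/2]."""
    r2 = r * r
    return r * (1 - r2 / 6 * (1 - r2 / 20 * (1 - r2 / 42 * (1 - r2 / 72))))


def sin_by_quadrant(x):
    """Map x into the first quadrant, approximate, then restore the result."""
    t = math.fmod(x, 2 * math.pi)
    if t < 0:
        t += 2 * math.pi
    quadrant = min(int(t // HALF_PI), 3)  # clamp guards against rounding at 2*pi
    r = t - quadrant * HALF_PI
    # Mapping step: odd quadrants are mirrored so the input stays in [0, pi/2].
    mapped = r if quadrant % 2 == 0 else HALF_PI - r
    y = approx_sin_first_quadrant(mapped)
    # Restoration step: quadrants 2 and 3 carry a negative sign.
    return y if quadrant < 2 else -y
```

Only the first-quadrant interval needs an accurate approximation, which is why a single short polynomial suffices for the whole circle in this sketch.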
-