Patent search ap:("Intel Corporation") AND inv:"Bogdan Pasca" Page 1

1.

Patent application
Techniques For Increasing Activation Sparsity In Artificial Neural Networks (Active)

Publication No.: US20230021396A1

Publication date: 2023-01-26

Application No.: US17953637

Filing date: 2022-09-27

Applicant: Intel Corporation

Inventor: Nihat Tunali, Arnab Raha, Bogdan Pasca, Martin Langhammer, Michael Wu, Deepak Mathaikutty

IPC: G06N3/04, G06N3/08

Abstract: A method for implementing an artificial neural network in a computing system that comprises performing a compute operation using an input activation and a weight to generate an output activation, and modifying the output activation using a noise value to increase activation sparsity.
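The abstract does not say how the noise value is applied. As a rough illustration only, the sketch below assumes the noise is a non-negative offset subtracted from the output activation before a ReLU, so that small positive activations are pushed to zero. The function names, the ReLU, and the Gaussian noise source are all assumptions, not details from the patent.

```python
import random

def relu(x):
    """Standard rectifier: negative values become zero."""
    return x if x > 0.0 else 0.0

def output_activation(inputs, weights):
    """Compute operation: dot product of input activations and weights."""
    return sum(i * w for i, w in zip(inputs, weights))

def sparsify(pre_activation, noise):
    """Hypothetical modification: subtract a non-negative noise value, then apply ReLU."""
    return relu(pre_activation - abs(noise))

def sparsity(values):
    """Fraction of activations that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable
inputs = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(1000)]
weights = [rng.uniform(-1, 1) for _ in range(8)]

pre = [output_activation(row, weights) for row in inputs]
plain = [relu(p) for p in pre]
noisy = [sparsify(p, rng.gauss(0.0, 0.5)) for p in pre]

print(f"sparsity without noise: {sparsity(plain):.3f}")
print(f"sparsity with noise:    {sparsity(noisy):.3f}")
```

Because the assumed offset is never negative, every activation that was zero stays zero, so sparsity cannot decrease under this particular reading.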

2.

Granted patent
Integrated circuits with modular multiplication circuitry (Active)

Publication No.: US11249726B2

Publication date: 2022-02-15

Application No.: US16566059

Filing date: 2019-09-10

Applicant: Intel Corporation

Inventor: Martin Langhammer, Bogdan Pasca

IPC: G06F7/72, G06F7/499

Abstract: An integrated circuit is provided with a modular multiplication circuit. The modular multiplication circuit includes an input multiplier for computing the product of two input signals, truncated multipliers for computing another product based on a modulus value and the product, and a subtraction circuit for computing a difference between the two products. An error correction circuit uses the difference to look up an estimated quotient value and to subtract out an integer multiple of the modulus value from the difference in a single step, wherein the integer multiple is equal to the estimated quotient value. A final adjustment stage is used to remove any remaining residual estimation error.
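The flow in this abstract (multiply, estimate a quotient from the modulus, subtract, then correct) resembles Barrett reduction. The sketch below is a software analogue under that assumption. The abstract does not name the algorithm, and the names `make_barrett` and `modmul` are hypothetical. The lookup-based error correction and the final adjustment stage are collapsed here into a simple conditional subtraction.

```python
def make_barrett(modulus, k):
    """Precompute floor(2^(2k) / modulus) for a modulus below 2^k."""
    assert 1 < modulus < (1 << k)
    return (1 << (2 * k)) // modulus

def modmul(a, b, modulus, k, mu):
    """Return (a * b) mod modulus for 0 <= a, b < modulus, without dividing."""
    assert 0 <= a < modulus and 0 <= b < modulus
    product = a * b                      # input multiplier
    q_est = (product * mu) >> (2 * k)    # quotient estimate from the precomputed constant
    diff = product - q_est * modulus     # second product, then subtraction
    # The estimate never exceeds the true quotient and is at most one too small,
    # so diff lies in [0, 2 * modulus); the loop runs at most once.
    while diff >= modulus:
        diff -= modulus
    return diff

K = 16
M = 65521                                # example modulus (assumed, fits in 16 bits)
MU = make_barrett(M, K)
print(modmul(12345, 54321, M, K, MU), (12345 * 54321) % M)
```

In hardware the two multiplications by constants would be truncated to save area, which is why a residual estimation error remains to be corrected.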

3.

Granted patent
Implementation of floating-point trigonometric functions in an integrated circuit device (Active)

Publication No.: US10942706B2

Publication date: 2021-03-09

Application No.: US15633792

Filing date: 2017-06-27

Applicant: Intel Corporation

Inventor: Martin Langhammer, Bogdan Pasca

IPC: G06F7/483, G06F7/548, G06F7/499, G06F5/01, G06F7/50, G06F7/523

Abstract: The present embodiments relate to integrated circuits with circuitry that implements floating-point trigonometric functions. The circuitry may include an approximation circuit that generates an approximation of the output of the trigonometric functions, a storage circuit that stores predetermined output values of the trigonometric functions, and a selector circuit that selects between different possible output values based on a control signal from a control circuit. In some embodiments, the circuitry may include a mapping circuit and a restoration circuit. The mapping circuit may map an input value from an original quadrant of the trigonometric circle to a predetermined input interval, and the restoration circuit may map the output value selected by the selection circuit back to the original quadrant of the trigonometric circle. If desired, the circuitry may be implemented in specialized processing blocks.
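The mapping and restoration steps described here can be illustrated for sine. The sketch below assumes a first-quadrant input interval of [0, pi/2] and a degree-7 Taylor polynomial as the approximation. Both are choices made for this illustration, not details from the patent, and the stored-value and selector circuits are omitted.

```python
import math

HALF_PI = math.pi / 2

def sin_first_quadrant(x):
    """Approximate sin(x) on [0, pi/2] with a degree-7 Taylor polynomial (Horner form)."""
    x2 = x * x
    return x * (1 - x2 / 6 * (1 - x2 / 20 * (1 - x2 / 42)))

def sin_approx(x):
    """Map x to the first quadrant, approximate, then restore the original quadrant."""
    turns = math.floor(x / HALF_PI)
    quadrant = turns % 4                 # 0..3, also for negative x
    offset = x - turns * HALF_PI         # position inside the quadrant
    if quadrant in (1, 3):
        offset = HALF_PI - offset        # mirror: sin(pi/2 + t) == sin(pi/2 - t)
    value = sin_first_quadrant(offset)
    return -value if quadrant >= 2 else value   # lower half-circle is negative

for angle in (0.5, 2.0, 4.0, -1.0):
    print(angle, sin_approx(angle), math.sin(angle))
```

Restricting the approximation to one interval keeps the polynomial short. The quadrant logic only mirrors the argument and flips the sign.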

4.

Granted patent
Machine learning training architecture for programmable devices (Active)

Publication No.: US11210063B2

Publication date: 2021-12-28

Application No.: US16585857

Filing date: 2019-09-27

Applicant: Intel Corporation

Inventor: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu

IPC: G06F7/487, G06F7/501, H03M7/24, G06F9/30, G06F17/16

Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.
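To make the number formats concrete, the sketch below models a dot product with BFLOAT16 inputs and single-precision accumulation. It does not model the hybrid hard and hard/soft data paths or the custom intermediate format, and conversion by truncation (rather than rounding) is an assumption made to keep the example short.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def to_bfloat16(x):
    """Keep the top 16 bits of the float32 encoding: sign, 8 exponent bits, 7 fraction bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def dot_bf16(a, b):
    """Dot product with BFLOAT16 inputs and single-precision accumulation."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two 8-bit significands is exact in single precision.
        acc = to_float32(acc + to_bfloat16(x) * to_bfloat16(y))
    return acc

print(to_bfloat16(3.14159))
print(dot_bf16([1.0, 2.0], [3.0, 4.0]))
```

BFLOAT16 keeps the float32 exponent range but only about three decimal digits of precision, which is why the accumulation is carried out in a wider format.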

5.

Granted patent
Methods and apparatus for performing fixed-point normalization using floating-point functional blocks (Pending, published)

Publication No.: US10671345B2

Publication date: 2020-06-02

Application No.: US15422966

Filing date: 2017-02-02

Applicant: Intel Corporation

Inventor: Bogdan Pasca

IPC: G06F5/01, G06F7/483, G06F7/499

Abstract: An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.
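One way to read this abstract is as the well-known trick of placing the fixed-point bits in the fraction field of a float with a fixed exponent, then subtracting the implicit leading one so that the floating-point subtractor performs the normalization. The sketch below follows that reading for 23-bit unsigned inputs in single precision. The field widths and the name `normalize_fixed` are assumptions.

```python
import struct

MANTISSA_BITS = 23
EXP_BIAS = 127

def normalize_fixed(x):
    """Return (exponent, fraction) such that x == (1 + fraction / 2**23) * 2**exponent."""
    assert 0 < x < (1 << MANTISSA_BITS)
    # Floating-point generation: exponent field encodes 2^23 and x fills the
    # fraction field, so the created number equals 2^23 + x.
    bits = ((EXP_BIAS + MANTISSA_BITS) << MANTISSA_BITS) | x
    created = struct.unpack(">f", struct.pack(">I", bits))[0]
    # Removing the leading one: the subtraction is exact and leaves x as a
    # normalized floating-point value.
    result = created - float(1 << MANTISSA_BITS)
    rbits = struct.unpack(">I", struct.pack(">f", result))[0]
    exponent = ((rbits >> MANTISSA_BITS) & 0xFF) - EXP_BIAS
    fraction = rbits & ((1 << MANTISSA_BITS) - 1)
    return exponent, fraction

print(normalize_fixed(5))   # 5 == 1.25 * 2**2
```

The returned exponent is the position of the leading one, so no separate leading-zero counter or barrel shifter is needed in this arrangement.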

6.

Patent application
HYPERBOLIC FUNCTIONS FOR MACHINE LEARNING ACCELERATION (Active)

Publication No.: US20220230057A1

Publication date: 2022-07-21

Application No.: US17677556

Filing date: 2022-02-22

Applicant: Intel Corporation

Inventor: Bogdan Pasca, Martin Langhammer

IPC: G06N3/063, G06N3/04, G06F7/544, G06F7/548

Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.
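The abstract does not describe the approximation itself. As a generic illustration, the sketch below approximates tanh with a small lookup table plus linear interpolation and derives sigmoid from it through the identity sigmoid(x) = (1 + tanh(x/2)) / 2. The table size, step, and saturation limit are arbitrary choices for this example.

```python
import math

STEP = 1.0 / 16          # table spacing (assumed)
LIMIT = 4.0              # beyond this magnitude tanh is treated as +/-1
TABLE = [math.tanh(i * STEP) for i in range(int(LIMIT / STEP) + 1)]

def tanh_approx(x):
    """Table lookup with linear interpolation; odd symmetry handles negative inputs."""
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag >= LIMIT:
        return sign                          # saturated region
    pos = mag / STEP
    idx = int(pos)
    frac = pos - idx
    return sign * (TABLE[idx] + frac * (TABLE[idx + 1] - TABLE[idx]))

def sigmoid_approx(x):
    """Reuse the tanh approximation: sigmoid(x) = (1 + tanh(x / 2)) / 2."""
    return 0.5 * (1.0 + tanh_approx(0.5 * x))

for v in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(v, tanh_approx(v), sigmoid_approx(v))
```

Sharing one table between both functions is a common way to save resources when an RNN cell needs sigmoid and tanh side by side.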

7.

Granted patent
Methods for using a multiplier circuit to support multiple sub-multiplications using bit correction and extension (Active)

Publication No.: US10732932B2

Publication date: 2020-08-04

Application No.: US16231170

Filing date: 2018-12-21

Applicant: Intel Corporation

Inventor: Bogdan Pasca, Martin Langhammer, Sergey Gribok, Gregg William Baeckler

IPC: G06F7/523, H03K19/177

Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.
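The shared-operand case can be illustrated in software. The sketch below packs two unsigned 8-bit values into one 18-bit operand, multiplies once by the shared 8-bit operand, and then separates the two overlapping 16-bit products using a small 6-bit side computation. This is one hypothetical arrangement for illustration; the exact split between multiplier, correction, extension, and merging logic in the patent may differ.

```python
def packed_shared_multiply(a, b, c):
    """Return (a * c, b * c) for unsigned 8-bit a, b, c using one wide multiplication."""
    assert all(0 <= v < 256 for v in (a, b, c))
    packed = (a << 10) | b                # fits in 18 bits
    wide = packed * c                     # equals (a*c << 10) + b*c; bits 10..15 overlap
    # Small side logic: the low 6 bits of a*c depend only on the low 6 bits of a and c.
    low6_ac = ((a & 0x3F) * (c & 0x3F)) & 0x3F
    # Correction: remove the contribution of a*c from the overlapping bits.
    bc_high = ((wide >> 10) - low6_ac) & 0x3F
    # Merging: the low 10 bits of b*c come straight from the wide product.
    bc = (bc_high << 10) | (wide & 0x3FF)
    ac = (wide - bc) >> 10
    return ac, bc

print(packed_shared_multiply(200, 100, 50))
```

The overlap exists because two 16-bit products need 26 bits when placed 10 bits apart, while avoiding overlap would need an operand wider than 18 bits.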

8.

Patent application
REDUCED LATENCY MULTIPLIER CIRCUITRY FOR VERY LARGE NUMBERS (Pending, published)

Publication No.: US20190310828A1

Publication date: 2019-10-10

Application No.: US16450555

Filing date: 2019-06-24

Applicant: Intel Corporation

Inventor: Martin Langhammer, Bogdan Pasca

IPC: G06F7/544, H03K19/177

Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.
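The abstract does not name the decomposition scheme. The sketch below uses Karatsuba, a well-known recursive decomposition that halves operand width at each level, purely to show what a multi-level decomposition of a very wide multiplication looks like. The merged adder pipeline and prefix-network bypassing described in the abstract are hardware details that are not modelled.

```python
def karatsuba(x, y, bits, base_bits=64):
    """Multiply non-negative integers of at most `bits` bits by recursive halving."""
    assert base_bits >= 3                 # guarantees the recursion shrinks
    if bits <= base_bits:
        return x * y                      # small enough for a native multiplier
    half = bits // 2
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half
    low = karatsuba(x_lo, y_lo, half, base_bits)
    high = karatsuba(x_hi, y_hi, bits - half, base_bits)
    # One extra bit covers the carry out of the operand pre-additions.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, bits - half + 1, base_bits)
    middle = cross - low - high           # equals x_lo*y_hi + x_hi*y_lo
    return (high << (2 * half)) + (middle << half) + low

x = (1 << 2047) + 12345
y = (1 << 2048) - 1
print(karatsuba(x, y, 2048) == x * y)
```

Each level replaces one wide multiplication with three of roughly half the width plus several additions, which is why the adder structure dominates latency for operands with thousands of bits.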

9.

Patent application
HYPERBOLIC FUNCTIONS FOR MACHINE LEARNING ACCELERATION (Pending, published)

Publication No.: US20190042924A1

Publication date: 2019-02-07

Application No.: US15863544

Filing date: 2018-01-05

Applicant: Intel Corporation

Inventor: Bogdan Pasca, Martin Langhammer

IPC: G06N3/063, G06N3/04

Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.

10.

Patent application
METHODS FOR USING A MULTIPLIER TO SUPPORT MULTIPLE SUB-MULTIPLICATION OPERATIONS (Pending, published)

Publication No.: US20190042198A1

Publication date: 2019-02-07

Application No.: US16144999

Filing date: 2018-09-27

Applicant: Intel Corporation

Inventor: Martin Langhammer, Gregg William Baeckler, Sergey Gribok, Dmitry N. Denisenko, Bogdan Pasca

IPC: G06F7/544, G06F7/483

Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit (e.g., an 18×18 or 18×19 multiplier circuit) may be used to support two or more smaller multiplication operations sharing one or two sets of multiplier operands, a complex multiplication, and a sum of two multiplications. If the multiplier products overflow and interfere with one another, correction operations can be performed. Partial products from two or more larger multiplier circuits can be used to combine decomposed partial products. A large multiplier circuit can also be used to support two floating-point mantissa multipliers.
