-
Publication No.: US20210397414A1
Publication Date: 2021-12-23
Application No.: US17358868
Filing Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Arnab Raha , Mark A. Anders , Martin Power , Martin Langhammer , Himanshu Kaul , Debabrata Mohapatra , Gautham Chinya , Cormac Brick , Ram Krishnamurthy
Abstract: Systems, apparatuses and methods may provide for multi-precision multiply-accumulate (MAC) technology that includes a plurality of arithmetic blocks, wherein each of the arithmetic blocks contains multiple multipliers, and wherein logic is to combine multipliers within each arithmetic block, across multiple arithmetic blocks, or both. In one example, one or more intermediate multipliers are of a size less than the precisions supported by the arithmetic blocks containing them.
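The combining of smaller multipliers into a higher-precision one, as described in the abstract above, can be modeled in software by summing shifted partial products. A minimal Python sketch, assuming 4-bit multipliers combined into an 8x8 multiply (the widths and function names are illustrative, not taken from the patent):

```python
def mul4(a, b):
    # stand-in for one 4-bit hardware multiplier inside an arithmetic block
    assert 0 <= a < 16 and 0 <= b < 16
    return a * b

def mul8_from_mul4(a, b):
    # split each 8-bit operand into high and low nibbles
    ah, al = a >> 4, a & 0xF
    bh, bl = b >> 4, b & 0xF
    # combine four 4-bit partial products with shifts to form the 8x8 product
    return ((mul4(ah, bh) << 8)
            + ((mul4(ah, bl) + mul4(al, bh)) << 4)
            + mul4(al, bl))
```

The same shift-and-add composition generalizes to combining multipliers across blocks for still wider precisions.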
-
Publication No.: US12169643B2
Publication Date: 2024-12-17
Application No.: US18465560
Filing Date: 2023-09-12
Applicant: Intel Corporation
Inventor: Niall Hanrahan , Martin Power , Kevin Brady , Martin-Thomas Grymel , David Bernard , Gary Baugh , Cormac Brick
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
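The reuse pattern described above amounts to holding one context resident in its buffer while a pointer walks every context in the other buffer. A hedged Python sketch of that loop structure (the lists and the multiply stand in for real buffered contexts and MAC hardware):

```python
def mac_with_reuse(first_type_buffer, second_type_buffer):
    """Model of the buffer-reuse control flow: each first-type context
    stays resident while the second buffer's pointer iterates fully."""
    results = []
    for a in first_type_buffer:      # first-type context maintained in buffer
        for b in second_type_buffer:  # pointer advances through second buffer
            results.append(a * b)     # one MAC-style operation per pairing
    return results
```

Each first-type context is fetched once but used against every second-type context, which is the data-reuse gain the abstract describes.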
-
Publication No.: US20240134786A1
Publication Date: 2024-04-25
Application No.: US18539955
Filing Date: 2023-12-14
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel , David Bernard , Niall Hanrahan , Martin Power , Kevin Brady , Gary Baugh , Cormac Brick
CPC classification number: G06F12/0207 , G06F12/0292 , G06N3/10
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
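The first compression stage described above can be modeled as building a per-element sparsity bitmap and keeping only the nonzero values, with the bitmap making the compression losslessly reversible. A minimal Python sketch (a flat list stands in for a tensor storage element; names are illustrative):

```python
def compress(storage_element):
    # first compression: record a sparsity map and drop the zero points
    sparsity_map = [1 if x != 0 else 0 for x in storage_element]
    nonzeros = [x for x in storage_element if x != 0]
    return sparsity_map, nonzeros

def decompress(sparsity_map, nonzeros):
    # rebuild the original element by re-inserting zeros per the map
    it = iter(nonzeros)
    return [next(it) if bit else 0 for bit in sparsity_map]
```

The second compression stage in the abstract would then pack such compressed elements contiguously in memory rather than at fixed strides.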
-
Publication No.: US20230376274A1
Publication Date: 2023-11-23
Application No.: US18362529
Filing Date: 2023-07-31
Applicant: Intel Corporation
Inventor: Mark Anders , Arnab Raha , Amit Agarwal , Steven Hsu , Deepak Abraham Mathaikutty , Ram K. Krishnamurthy , Martin Power
CPC classification number: G06F7/5443 , G06F7/4876 , G06F7/485 , G06F5/012
Abstract: A fused dot-product multiply-accumulate (MAC) circuit may support variable precisions of floating-point data elements to perform computations (e.g., MAC operations) in deep learning operations. An operation mode of the circuit may be selected based on the precision of an input element. The operation mode may be a FP16 mode or a FP8 mode. In the FP8 mode, product exponents may be computed based on exponents of floating-point input elements. A maximum exponent may be selected from the product exponents. A global maximum exponent may be selected from a plurality of maximum exponents. A product mantissa may be computed and aligned with another product mantissa based on a difference between the global maximum exponent and a corresponding maximum exponent. An adder tree may accumulate the aligned product mantissas and compute a partial sum mantissa. The partial sum mantissa may be normalized using the global maximum exponent.
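The exponent-alignment scheme above can be illustrated with a simplified model in which each element is an (exponent, integer mantissa) pair: product exponents are sums of input exponents, and each product mantissa is right-shifted by its gap to the maximum exponent before the adder tree accumulates. A hedged Python sketch, ignoring signs, normalization, and rounding (all names are assumptions, not from the patent):

```python
def fp_dot(a_elems, b_elems):
    """Toy aligned dot product over (exponent, mantissa) pairs."""
    # product exponent = sum of input exponents; product mantissa = product
    prod_exps = [ea + eb for (ea, _), (eb, _) in zip(a_elems, b_elems)]
    prod_mants = [ma * mb for (_, ma), (_, mb) in zip(a_elems, b_elems)]
    max_exp = max(prod_exps)  # maximum exponent across the products
    # align each product mantissa to max_exp, then accumulate (adder tree)
    partial_sum = sum(m >> (max_exp - e)
                      for e, m in zip(prod_exps, prod_mants))
    return max_exp, partial_sum
```

A real circuit would further select a global maximum across several such maxima and normalize the partial sum mantissa against it, as the abstract describes.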
-
Publication No.: US20230059976A1
Publication Date: 2023-02-23
Application No.: US18047415
Filing Date: 2022-10-18
Applicant: Intel Corporation
Inventor: Deepak Abraham Mathaikutty , Arnab Raha , Raymond Jit-Hung Sung , Martin Power , Umer Iftikhar Cheema , David Thomas Bernard
IPC: G06N3/08
Abstract: A DNN accelerator may include a PE array performing MAC operations. The PE array may include PEs capable of MAC operations on quantized values. A PE may include subtractors for subtracting zero points from quantized activations and quantized weights to generate intermediate activations and intermediate weights. The intermediate activations and intermediate weights may be stored in data storage units in the PE and may be used by a MAC unit in the PE. The subtractors may be placed outside the MAC unit but inside the PE. The MAC unit may perform sequential cycles of MAC operations. The MAC unit may include a plurality of multipliers. The intermediate activations and intermediate weights stored in the data storage units may be reused by different multipliers in different cycles of MAC operations. An output of the MAC unit or of the PE may be multiplied with a quantization scale to produce a floating-point value.
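Numerically, the scheme above performs zero-point subtraction once, outside the MAC loop, so the multipliers only ever see the intermediate values. A minimal Python sketch of that dataflow (names, zero points, and scale are illustrative assumptions):

```python
def quantized_mac(q_acts, q_wgts, zp_act, zp_wgt, scale):
    # subtractors outside the MAC unit: remove zero points once,
    # producing the intermediate activations and weights
    acts = [a - zp_act for a in q_acts]
    wgts = [w - zp_wgt for w in q_wgts]
    acc = 0
    for a, w in zip(acts, wgts):  # sequential MAC cycles reuse stored values
        acc += a * w
    return acc * scale            # quantization scale yields a float output
```

Hoisting the subtraction out of the inner loop is what lets the stored intermediates be reused across cycles without repeating the zero-point work.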
-
Publication No.: US20210406164A1
Publication Date: 2021-12-30
Application No.: US17359217
Filing Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel , David Bernard , Niall Hanrahan , Martin Power , Kevin Brady , Gary Baugh , Cormac Brick
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
-
Publication No.: US20210319317A1
Publication Date: 2021-10-14
Application No.: US17357924
Filing Date: 2021-06-24
Applicant: Intel Corporation
Inventor: Martin Power , Kevin Brady , Niall Hanrahan , Martin-Thomas Grymel , David Bernard , Gary Baugh
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to perform machine-learning model operations on sparse accelerators. An example apparatus includes first circuitry, second circuitry to generate sparsity data based on an acceleration operation, and third circuitry to instruct one or more data buffers to provide at least one of activation data or weight data based on the sparsity data to the first circuitry, the first circuitry to execute the acceleration operation based on the at least one of the activation data or the weight data.
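The control flow described above can be reduced to gating: the buffers supply an activation/weight pair only when the sparsity data says the work is needed. A hedged Python sketch of that idea (the boolean list stands in for generated sparsity data; names are illustrative):

```python
def sparse_dispatch(activations, weights, sparsity):
    """Accumulate only the pairs the sparsity data marks as nonzero work."""
    acc = 0
    for a, w, keep in zip(activations, weights, sparsity):
        if keep:          # control circuitry instructs the buffers to provide data
            acc += a * w  # execution circuitry runs the acceleration operation
    return acc
```

Skipped positions cost neither a buffer read nor a multiply, which is the benefit of driving the dispatch from sparsity data.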
-