Patent search ap:("Intel Corporation") AND inv:"Martin-Thomas Grymel" Page 1

1.

Invention application
HALO TRANSFER FOR CONVOLUTION WORKLOAD PARTITION (status: in force)

Publication (announcement) number: US20230116629A1

Publication (announcement) date: 2023-04-13

Application number: US18046256

Application date: 2022-10-13

Applicant: Intel Corporation

Inventor: Martin-Thomas Grymel, David Thomas Bernard, Niall Hanrahan

IPC: G06N3/048, G06F9/50

Abstract: A DNN accelerator includes multiple compute tiles for sharing a workload of running a convolution. A halo pipeline in a compute tile can facilitate replications of halo data from the compute tile where the halo data is generated into another compute tile. The halo pipeline may receive a memory transaction for writing a data block. The halo pipeline may determine that the data block falls into a halo region in an input tensor of the convolution. The halo pipeline may generate a remote address for storing the data block in a memory of the other compute tile, e.g., based on a local address of the data block in a memory of the compute tile. The halo pipeline may adjust the remote address, e.g., based on a difference in dimensions of a tensor to be used by the compute tile and a tensor to be used by the other compute tile.
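
The address handling described in this abstract can be illustrated with a small software sketch. It is not the patented hardware: it assumes a row-major tensor split by rows between two tiles, with the halo being the last `halo_rows` rows of the local tensor and landing in the first rows of the remote tensor. `TileLayout`, `halo_remote_address`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TileLayout:
    """Hypothetical description of how one tile stores its tensor."""
    base: int       # base address of the tensor in the tile's memory
    rows: int       # number of rows stored by the tile
    row_bytes: int  # bytes per stored row (row pitch)


def halo_remote_address(local_addr: int, local: TileLayout,
                        remote: TileLayout, halo_rows: int) -> Optional[int]:
    """Return the remote address for a local write, or None if the
    written block is outside the halo region (assumed layout, see above)."""
    offset = local_addr - local.base
    row, col = divmod(offset, local.row_bytes)
    if not 0 <= row < local.rows:
        raise ValueError("address is outside the local tensor")

    first_halo_row = local.rows - halo_rows
    if row < first_halo_row:
        return None  # not halo data, nothing to replicate

    # Step 1: derive a remote address from the local address
    # (same offset, but relative to the remote tensor's base).
    remote_addr = remote.base + offset

    # Step 2: adjust for the difference in tensor dimensions: the halo
    # row index and the row pitch differ between the two tiles.
    remote_row = row - first_halo_row
    remote_addr += remote_row * remote.row_bytes - row * local.row_bytes
    return remote_addr
```

With an 8-row local tensor and a 2-row halo, a write to local row 6 maps to remote row 0, while a write to local row 3 is not replicated.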

2.

Invention application
METHODS AND APPARATUS FOR PERFORMING A MACHINE LEARNING OPERATION USING STORAGE ELEMENT POINTERS (status: in force)

Publication (announcement) number: US20220108135A1

Publication (announcement) date: 2022-04-07

Application number: US17554970

Application date: 2021-12-17

Applicant: Intel Corporation

Inventor: Kevin Brady, Martin Power, Martin-Thomas Grymel, Alessandro Palla, David Bernard, Niall Hanrahan

IPC: G06K9/62, G06N3/04

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for performing a machine learning operation using storage element pointers. An example computer readable medium comprises instructions that, when executed, cause at least one processor to, in response to a determination that a machine learning operation is to be performed, create first and second storage element pointers based on a type of machine learning operation to be performed, remap input tensor data of an input tensor based on the first storage element pointer without movement of the input tensor data in memory, cause execution of the machine learning operation with the remapped input tensor data to create intermediate tensor data, remap the intermediate tensor data based on the second storage element pointer without movement of the intermediate tensor data in memory, and provide the remapped intermediate tensor data as an output tensor.
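
The idea of remapping through pointers instead of moving data can be sketched in a few lines. This is an illustration under assumptions, not the claimed implementation: memory is modeled as a flat Python list, a storage element is a fixed-size slice of it, and the operation types `"identity"` and `"reverse"` as well as the names `create_pointers`, `view`, and `run` are hypothetical.

```python
from typing import Callable, List, Tuple


def create_pointers(op_type: str, count: int, size: int) -> Tuple[List[int], List[int]]:
    """Return (input_pointers, output_pointers) as element start offsets,
    chosen from the (hypothetical) operation type."""
    in_order = [i * size for i in range(count)]
    if op_type == "identity":
        return in_order, in_order
    if op_type == "reverse":
        return in_order[::-1], in_order
    raise ValueError(f"unsupported operation type: {op_type}")


def view(memory: List[int], pointers: List[int], size: int) -> List[List[int]]:
    """Read storage elements in pointer order; `memory` is not modified."""
    return [memory[p:p + size] for p in pointers]


def run(memory: List[int], op_type: str, size: int,
        op: Callable[[int], int]) -> List[List[int]]:
    count = len(memory) // size
    in_ptrs, out_ptrs = create_pointers(op_type, count, size)
    # Remap the input by reading through the first pointer list.
    remapped_input = view(memory, in_ptrs, size)
    # Execute the operation element-wise to get flat intermediate data.
    intermediate = [op(v) for element in remapped_input for v in element]
    # Remap the intermediate data through the second pointer list.
    return view(intermediate, out_ptrs, size)
```

The design point is that only the pointer lists change between operation types; the underlying buffer is read in place and never rewritten.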

3.

Invention application
SYSTEMS, APPARATUS, AND METHODS TO DEBUG ACCELERATOR HARDWARE (status: in force)

Publication (announcement) number: US20220012164A1

Publication (announcement) date: 2022-01-13

Application number: US17483521

Application date: 2021-09-23

Applicant: Intel Corporation

Inventor: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady

IPC: G06F11/36, G06F11/30, G06F11/277, G06N3/04

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
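
The stop-and-dump behavior in this abstract can be mimicked in software for illustration. The sketch below is only an analogy to the debug circuitry: it assumes a model is an ordered list of named layer functions and that a breakpoint is identified by a layer name. `run_with_breakpoint` and the returned dictionary keys are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Layer = Tuple[str, Callable[[List[float]], List[float]]]


def run_with_breakpoint(layers: List[Layer], data: List[float],
                        breakpoint_layer: Optional[str] = None) -> Dict[str, object]:
    """Execute layers in order; if the breakpoint layer is reached, stop
    after it and report its input, its output, and the breakpoint name."""
    current = data
    for name, fn in layers:
        layer_input = current
        current = fn(layer_input)
        if name == breakpoint_layer:
            # Breakpoint triggered: halt and emit the debug record.
            return {"breakpoint": name, "input": layer_input,
                    "output": current, "completed": False}
    # No breakpoint hit: the whole model ran to completion.
    return {"breakpoint": None, "input": data,
            "output": current, "completed": True}
```

Calling it with a breakpoint on a middle layer returns that layer's input and output without running the remaining layers.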

4.

Invention application
METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO INCREASE DATA REUSE FOR MULTIPLY AND ACCUMULATE (MAC) OPERATIONS (status: in force)

Publication (announcement) number: US20220012058A1

Publication (announcement) date: 2022-01-13

Application number: US17484780

Application date: 2021-09-24

Applicant: Intel Corporation

Inventor: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick

IPC: G06F9/30, G06F9/38, G06F7/544

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
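
One possible reading of the reuse scheme is a loop that holds a context of one type in place while a pointer walks the buffer of the other type. The sketch below is a loose software illustration under that assumption, with contexts modeled as equal-length lists of numbers; `mac_all_pairs` and the load counter are hypothetical and do not reflect the actual control logic.

```python
from typing import Dict, List, Tuple


def mac_all_pairs(first_contexts: List[List[int]],
                  second_contexts: List[List[int]]
                  ) -> Tuple[Dict[Tuple[int, int], int], int]:
    """Multiply-accumulate every (first, second) context pair while
    loading each first-type context only once."""
    results: Dict[Tuple[int, int], int] = {}
    first_loads = 0
    for i, held in enumerate(first_contexts):
        first_loads += 1  # the first-type context is loaded once and held
        ptr = 0           # pointer into the second buffer
        while ptr < len(second_contexts):
            other = second_contexts[ptr]
            # MAC: sum of element-wise products of the two contexts.
            results[(i, ptr)] = sum(a * b for a, b in zip(held, other))
            ptr += 1      # advance to the next position in the second buffer
    return results, first_loads
```

For two first-type and three second-type contexts, six MAC results are produced from only two first-buffer loads, which is the kind of reuse the abstract aims at.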

5.

Invention publication
SYSTEMS, APPARATUS, AND METHODS TO DEBUG ACCELERATOR HARDWARE (status: under examination, published)

Publication (announcement) number: US20240118992A1

Publication (announcement) date: 2024-04-11

Application number: US18487490

Application date: 2023-10-16

Applicant: Intel Corporation

Inventor: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady

IPC: G06F11/36, G06F11/277, G06F11/30, G06N3/04

CPC classification number: G06F11/3652, G06F11/277, G06F11/3075, G06F11/3656, G06N3/04

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.

6.

Invention grant
Methods and apparatus for sparse tensor storage for neural network accelerators (status: in force)

Publication (announcement) number: US11940907B2

Publication (announcement) date: 2024-03-26

Application number: US17359217

Application date: 2021-06-25

Applicant: Intel Corporation

Inventor: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick

IPC: G06F12/00, G06F12/02, G06N3/10

CPC classification number: G06F12/0207, G06F12/0292, G06N3/10

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
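
The two compression steps in this abstract can be sketched on a flat list. This is a simplified illustration, not the patented circuitry: it assumes a sparsity-map bit of 1 marks a non-zero data point, that storage elements are fixed-size chunks of the flattened tensor, and that an offset table records where each compressed element starts. `compress`, `decompress`, and the return layout are hypothetical.

```python
from typing import List, Tuple


def compress(tensor: List[int], element_size: int
             ) -> Tuple[List[int], List[int], List[int]]:
    """Return (sparsity_map, packed_data, element_offsets)."""
    # Sparsity map: 1 where the data point is non-zero, 0 where it is zero.
    sparsity_map = [1 if v != 0 else 0 for v in tensor]

    packed: List[int] = []
    offsets: List[int] = []
    for start in range(0, len(tensor), element_size):
        element = tensor[start:start + element_size]
        # Second step: each compressed element starts right after the
        # previous one, so the packed data is contiguous.
        offsets.append(len(packed))
        # First step: drop the zero points of this storage element.
        packed.extend(v for v in element if v != 0)
    return sparsity_map, packed, offsets


def decompress(packed: List[int], sparsity_map: List[int]) -> List[int]:
    """Rebuild the dense tensor from the packed data and sparsity map."""
    values = iter(packed)
    return [next(values) if bit else 0 for bit in sparsity_map]
```

The offset table is an added convenience in this sketch so that a single storage element can be located without scanning the whole sparsity map.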

7.

Invention grant
Methods, apparatus, and articles of manufacture to increase data reuse for multiply and accumulate (MAC) operations (status: in force)

Publication (announcement) number: US11789646B2

Publication (announcement) date: 2023-10-17

Application number: US17484780

Application date: 2021-09-24

Applicant: Intel Corporation

Inventor: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick

IPC: G06F3/06, G06F7/544

CPC classification number: G06F3/0656, G06F3/0613, G06F3/0625, G06F3/0679, G06F7/5443

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.

8.

Invention publication
METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO INCREASE DATA REUSE FOR MULTIPLY AND ACCUMULATE (MAC) OPERATIONS (status: under examination, published)

Publication (announcement) number: US20240036763A1

Publication (announcement) date: 2024-02-01

Application number: US18465560

Application date: 2023-09-12

Applicant: Intel Corporation

Inventor: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick

IPC: G06F3/06, G06F7/544

CPC classification number: G06F3/0656, G06F7/5443, G06F3/0625, G06F3/0679, G06F3/0613

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.

9.

Invention grant
Systems, apparatus, and methods to debug accelerator hardware (status: in force)

Publication (announcement) number: US11829279B2

Publication (announcement) date: 2023-11-28

Application number: US17483521

Application date: 2021-09-23

Applicant: Intel Corporation

Inventor: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady

IPC: G06F11/00, G06F11/36, G06N3/04, G06F11/277, G06F11/30

CPC classification number: G06F11/3652, G06F11/277, G06F11/3075, G06F11/3656, G06N3/04

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.

10.

Invention application
METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO INCREASE UTILIZATION OF NEURAL NETWORK (NN) ACCELERATOR CIRCUITRY FOR SHALLOW LAYERS OF AN NN BY REFORMATTING ONE OR MORE TENSORS (status: in force)

Publication (announcement) number: US20220012578A1

Publication (announcement) date: 2022-01-13

Application number: US17484661

Application date: 2021-09-24

Applicant: Intel Corporation

Inventor: Kevin Brady, Martin Power, Niall Hanrahan, Alessandro Palla, Martin-Thomas Grymel, David Bernard

IPC: G06N3/063

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase utilization of neural network (NN) accelerator circuitry for shallow layers of an NN by reformatting one or more tensors. An example apparatus includes parameter determining circuitry to determine a width of a weight kernel and to determine a depth of a first tensor. The example apparatus also includes storage control circuitry to, starting at a first XY location of the first tensor, copy one or more Z values, up to the depth of the first tensor, of consecutive XY locations that overlap the width of the weight kernel and to load the one or more Z values consecutively in a first XY location of a second tensor.
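
The reformatting step in this abstract can be pictured as packing the Z values of neighboring X positions into a single, deeper XY location. The sketch below assumes a tensor stored as nested lists `tensor[y][x][z]`, a stride of 1, and no padding; these choices and the name `reformat_for_kernel_width` are hypothetical and are not taken from the patent.

```python
from typing import List

Tensor3D = List[List[List[int]]]  # indexed as tensor[y][x][z]


def reformat_for_kernel_width(tensor: Tensor3D, kernel_width: int) -> Tensor3D:
    """For each XY location, concatenate the Z values of the
    `kernel_width` consecutive X locations starting there."""
    out: Tensor3D = []
    for row in tensor:
        out_row: List[List[int]] = []
        # Only start positions where the kernel fits fully inside the row.
        for x in range(len(row) - kernel_width + 1):
            packed: List[int] = []
            for dx in range(kernel_width):
                packed.extend(row[x + dx])  # copy all Z values of this location
            out_row.append(packed)
        out.append(out_row)
    return out
```

A 1x4 tensor of depth 2 reformatted for a kernel of width 3 becomes a 1x2 tensor of depth 6, so each location of the second tensor carries three times as many values as a location of the shallow original.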
