-
Publication number: US12169643B2
Publication date: 2024-12-17
Application number: US18465560
Filing date: 2023-09-12
Applicant: Intel Corporation
Inventor: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
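As a loose software analogy for the reuse pattern this abstract describes, the sketch below holds one context of the first type stationary while a pointer steps through the contexts held in the second buffer. The names `mac` and `process_with_reuse`, and the choice of activations as the first type and weights as the second, are assumptions made for illustration and are not taken from the patent.

```python
def mac(activation, weight):
    """Multiply-accumulate over two equal-length vectors."""
    acc = 0
    for a, w in zip(activation, weight):
        acc += a * w
    return acc


def process_with_reuse(first_contexts, second_buffer):
    """Hold each first-type context while a pointer walks the second buffer.

    Returns the MAC results and the number of times a first-type context
    was loaded into the (hypothetical) first buffer.
    """
    results = []
    first_loads = 0
    for context in first_contexts:
        first_buffer = context      # loaded once, then maintained
        first_loads += 1
        row = []
        pointer = 0                 # position in the second buffer
        while pointer < len(second_buffer):
            row.append(mac(first_buffer, second_buffer[pointer]))
            pointer += 1            # iterate to the next position
        results.append(row)
    return results, first_loads


# Hypothetical data: two activation contexts, three weight contexts.
activations = [[1, 2], [3, 4]]
weights = [[1, 0], [0, 1], [1, 1]]
results, first_loads = process_with_reuse(activations, weights)
print(results)
print(first_loads)
```

In this sketch each first-type context is loaded once and reused for every entry of the second buffer, which is the data-reuse effect the abstract is aiming at; the actual circuitry may order the contexts differently.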
-
Publication number: US20240134786A1
Publication date: 2024-04-25
Application number: US18539955
Filing date: 2023-12-14
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
CPC classification number: G06F12/0207, G06F12/0292, G06N3/10
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
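The two compression steps in this abstract can be illustrated with a minimal sketch. It assumes a flat list as the tensor, fixed-size storage elements, and a sparsity map in which 1 marks a non-zero value; the function names, the `element_size` parameter, and the per-element offset table are hypothetical and not taken from the patent.

```python
def compress(tensor, element_size):
    """Return (sparsity_map, packed_values, offsets) for a flat tensor."""
    # Sparsity map: 1 where the data point is non-zero, 0 where it is zero.
    sparsity_map = [1 if v != 0 else 0 for v in tensor]

    # Divide the tensor into fixed-size storage elements.
    elements = [tensor[i:i + element_size]
                for i in range(0, len(tensor), element_size)]

    packed = []   # second compression: elements stored back to back
    offsets = []  # where each compressed element starts in `packed`
    for element in elements:
        offsets.append(len(packed))
        # First compression: drop the zero points of this element.
        packed.extend(v for v in element if v != 0)
    return sparsity_map, packed, offsets


def decompress(sparsity_map, packed):
    """Rebuild the dense tensor from the map and the packed values."""
    values = iter(packed)
    return [next(values) if bit else 0 for bit in sparsity_map]


tensor = [0, 5, 0, 0, 7, 8, 0, 0]
sparsity_map, packed, offsets = compress(tensor, element_size=4)
print(sparsity_map, packed, offsets)
print(decompress(sparsity_map, packed) == tensor)
```

The offset table is one possible way to locate a compressed storage element once the elements are stored contiguously; the abstract does not say how the hardware tracks this.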
-
Publication number: US20210406164A1
Publication date: 2021-12-30
Application number: US17359217
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel, David Bernard, Niall Hanrahan, Martin Power, Kevin Brady, Gary Baugh, Cormac Brick
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
-
Publication number: US20210319317A1
Publication date: 2021-10-14
Application number: US17357924
Filing date: 2021-06-24
Applicant: Intel Corporation
Inventor: Martin Power, Kevin Brady, Niall Hanrahan, Martin-Thomas Grymel, David Bernard, Gary Baugh
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to perform machine-learning model operations on sparse accelerators. An example apparatus includes first circuitry, second circuitry to generate sparsity data based on an acceleration operation, and third circuitry to instruct one or more data buffers to provide at least one of activation data or weight data based on the sparsity data to the first circuitry, the first circuitry to execute the acceleration operation based on the at least one of the activation data or the weight data.
-
Publication number: US10960880B2
Publication date: 2021-03-30
Application number: US15937931
Filing date: 2018-03-28
Applicant: Intel Corporation
Inventor: Kevin Brady, Jelle Sels, William Rafferty, Diarmaid O'Cualain, Keyssy Guerra Perez
Abstract: Herein is disclosed a slack distribution system comprising one or more sensors, configured to deliver sensor data to one or more processors in a first vehicle; a wireless communication circuit, configured to wirelessly transmit to a second vehicle; one or more processors, configured to determine from at least the sensor data, during first vehicle deceleration, a slack distance between the first vehicle and the second vehicle; and when the slack distance is less than a predetermined threshold, to cause the wireless communication circuit to transmit to the second vehicle a slack request message, wherein the slack request message is a request to change the slack distance.
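The decision described in this abstract can be sketched as a single check. The function name, the message fields, and the use of metres are assumptions for illustration; the patent abstract does not specify a message format.

```python
def slack_request(slack_distance_m, threshold_m, decelerating):
    """Return a slack request message, or None if no request is needed.

    The message layout here is hypothetical.
    """
    if decelerating and slack_distance_m < threshold_m:
        return {
            "type": "slack_request",
            "current_slack_m": slack_distance_m,
            "threshold_m": threshold_m,
        }
    return None


# First vehicle is braking and the gap to the second vehicle is too small.
message = slack_request(slack_distance_m=8.0, threshold_m=12.0,
                        decelerating=True)
print(message)

# Gap is large enough, so nothing is transmitted.
print(slack_request(slack_distance_m=20.0, threshold_m=12.0,
                    decelerating=True))
```

In a real system the returned message would be handed to the wireless communication circuit for transmission to the second vehicle; that step is left out here.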
-
Publication number: US20220108135A1
Publication date: 2022-04-07
Application number: US17554970
Filing date: 2021-12-17
Applicant: Intel Corporation
Inventor: Kevin Brady, Martin Power, Martin-Thomas Grymel, Alessandro Palla, David Bernard, Niall Hanrahan
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for performing a machine learning operation using storage element pointers. An example computer readable medium comprises instructions that when executed, cause at least one processor to select, in response to a determination that a machine learning operation is to be performed, create first and second storage element pointers based on a type of machine learning operation to be performed, remap input tensor data of the input tensor based on the first storage element pointer without movement of the input tensor data in memory, cause execution of the machine learning operation with the remapped input tensor data to create intermediate tensor data, remap the intermediate tensor data based on the second storage element pointer without movement of the intermediate tensor data in memory, and provide the remapped intermediate tensor data as an output tensor.
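The idea of remapping tensor data through storage element pointers, without moving the data, can be illustrated with a minimal sketch. It assumes a byte buffer split into equal-size storage elements and a pointer list of start offsets; the `PointerView` class and its methods are hypothetical and not taken from the patent.

```python
class PointerView:
    """A view of a buffer as storage elements addressed through pointers."""

    def __init__(self, buffer, pointers, element_size):
        self._mem = memoryview(buffer)   # no copy of the underlying data
        self.pointers = list(pointers)   # start offset of each element
        self.element_size = element_size

    def element(self, index):
        """Return a zero-copy view of the element at a logical index."""
        start = self.pointers[index]
        return self._mem[start:start + self.element_size]

    def remap(self, order):
        """Return a new view whose pointers are reordered; data stays put."""
        new_pointers = [self.pointers[i] for i in order]
        return PointerView(self._mem, new_pointers, self.element_size)


data = bytearray(range(8))                      # four 2-byte elements
original = PointerView(data, [0, 2, 4, 6], element_size=2)
reversed_view = original.remap([3, 2, 1, 0])    # only the pointers change

print(bytes(reversed_view.element(0)))          # last element comes first
print(data == bytearray(range(8)))              # buffer is untouched

data[6] = 99                                    # a write to the buffer...
print(reversed_view.element(0)[0])              # ...is visible via the view
```

Because `memoryview` slices share memory with the original buffer, the reordering costs only a new pointer list, which mirrors the "without movement of the input tensor data in memory" wording of the abstract.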
-
Publication number: US20220012164A1
Publication date: 2022-01-13
Application number: US17483521
Filing date: 2021-09-23
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel, David Bernard, Martin Power, Niall Hanrahan, Kevin Brady
IPC: G06F11/36, G06F11/30, G06F11/277, G06N3/04
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and to compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, the data output, and the breakpoint for debugging the hardware accelerator.
-
Publication number: US20220012058A1
Publication date: 2022-01-13
Application number: US17484780
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventor: Niall Hanrahan, Martin Power, Kevin Brady, Martin-Thomas Grymel, David Bernard, Gary Baugh, Cormac Brick
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
-
Publication number: US20190047564A1
Publication date: 2019-02-14
Application number: US15937931
Filing date: 2018-03-28
Applicant: Intel Corporation
Inventor: Kevin Brady, Jelle Sels, William Rafferty, Diarmaid O'Cualain, Keyssy Guerra Perez
Abstract: Herein is disclosed a slack distribution system comprising one or more sensors, configured to deliver sensor data to one or more processors in a first vehicle; a wireless communication circuit, configured to wirelessly transmit to a second vehicle; one or more processors, configured to determine from at least the sensor data, during first vehicle deceleration, a slack distance between the first vehicle and the second vehicle; and when the slack distance is less than a predetermined threshold, to cause the wireless communication circuit to transmit to the second vehicle a slack request message, wherein the slack request message is a request to change the slack distance.