-
Publication No.: US20240403616A1
Publication date: 2024-12-05
Application No.: US18500229
Filing date: 2023-11-02
Applicant: Intel Corporation
Inventor: Umer Iftikhar Cheema , Kevin Brady , Robert Simofi , Colm O Faolain , Deepak Abraham Mathaikutty , Arnab Raha , Dinakar Kondru , Gary Baugh , Darren Crews , Fergal Connor
IPC: G06N3/048
Abstract: An activation function in a neural network may be approximated by one or more linear functions. A linear function may correspond to a segment of the input range of the activation function, e.g., a linear segment. A programmable look-up table may store slopes and intercepts of linear functions. A post processing engine (PPE) array executing the activation function may determine that an input data element of the activation function falls into the linear segment and compute an output of the linear function using the input data element. The output of the linear function may be used as the approximated output of the activation function. Alternatively, the PPE array may determine that the input data element is in a saturation segment and use a fixed value associated with the saturation segment as the approximated output of the activation function.
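The lookup scheme described in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware design: the choice of tanh, the range [-3, 3], the 16 equal-width segments, and every name below are assumptions made for illustration only.

```python
import bisect
import math

# Assumed parameters (not from the patent): tanh on [-3, 3], 16 segments.
LOW, HIGH = -3.0, 3.0            # inputs outside this range count as saturated
SAT_LOW, SAT_HIGH = -1.0, 1.0    # fixed outputs for the two saturation segments
NUM_SEGMENTS = 16


def build_table(fn, low, high, n):
    """Build a table of segment edges, slopes and intercepts for n linear segments."""
    edges = [low + (high - low) * i / n for i in range(n + 1)]
    slopes, intercepts = [], []
    for x0, x1 in zip(edges, edges[1:]):
        m = (fn(x1) - fn(x0)) / (x1 - x0)   # chord slope across the segment
        slopes.append(m)
        intercepts.append(fn(x0) - m * x0)
    return edges, slopes, intercepts


EDGES, SLOPES, INTERCEPTS = build_table(math.tanh, LOW, HIGH, NUM_SEGMENTS)


def approx_activation(x):
    """Return the fixed saturation value, or slope * x + intercept of x's segment."""
    if x <= LOW:
        return SAT_LOW
    if x >= HIGH:
        return SAT_HIGH
    i = bisect.bisect_right(EDGES, x) - 1   # index of the segment containing x
    return SLOPES[i] * x + INTERCEPTS[i]
```

Rebuilding the table with a different function or segment count changes the approximation without touching `approx_activation`, which is the sense in which the table is programmable in this sketch.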
-
Publication No.: US20240036763A1
Publication date: 2024-02-01
Application No.: US18465560
Filing date: 2023-09-12
Applicant: Intel Corporation
Inventor: Niall Hanrahan , Martin Power , Kevin Brady , Martin-Thomas Grymel , David Bernard , Gary Baugh , Cormac Brick
CPC classification number: G06F3/0656 , G06F7/5443 , G06F3/0625 , G06F3/0679 , G06F3/0613
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
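As a loose software analogy for the data-reuse idea, the sketch below holds one operand in place while a pointer walks through the other buffer. It is only one possible reading of the abstract, not the claimed control logic; treating a context as a flat list of numbers, and the names `mac` and `run_with_reuse`, are assumptions.

```python
def mac(a, b):
    """Multiply-accumulate over two equally sized contexts (flat lists here)."""
    return sum(x * y for x, y in zip(a, b))


def run_with_reuse(first_buffer, second_buffer):
    """Pair every context in first_buffer with every context in second_buffer.

    Each first-buffer context is fetched once and kept while a pointer steps
    through second_buffer. Returns the MAC results and the number of fetches
    from first_buffer.
    """
    results = []
    loads_first = 0
    for ctx_a in first_buffer:
        loads_first += 1          # fetched once, then held for reuse
        ptr = 0                   # pointer into the second buffer
        while ptr < len(second_buffer):
            results.append(mac(ctx_a, second_buffer[ptr]))
            ptr += 1              # move from the current position to the next
    return results, loads_first
```

With two contexts in each buffer, four MAC results are produced from only two first-buffer fetches.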
-
Publication No.: US11829279B2
Publication date: 2023-11-28
Application No.: US17483521
Filing date: 2021-09-23
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel , David Bernard , Martin Power , Niall Hanrahan , Kevin Brady
IPC: G06F11/00 , G06F11/36 , G06N3/04 , G06F11/277 , G06F11/30
CPC classification number: G06F11/3652 , G06F11/277 , G06F11/3075 , G06F11/3656 , G06N3/04
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
-
Publication No.: US20230229910A1
Publication date: 2023-07-20
Application No.: US17937592
Filing date: 2022-10-03
Applicant: Intel Corporation
Inventor: Kevin Brady , Sudheendra Kadri , Niall Hanrahan
CPC classification number: G06N3/08 , G06N3/0481 , G06F13/28 , G06F2213/28
Abstract: A compute block includes a DMA engine that reads data from an external memory and writes the data into a local memory of the compute block. An MAC array in the compute block may use the data to perform convolutions. The external memory may store weights of one or more filters in a memory layout that comprises a sequence of sections for each filter. Each section may correspond to a channel of the filter and may store all the weights in the channel. The DMA engine may convert the memory layout to a different memory layout, which includes a sequence of new sections for each filter. Each new section may include a weight vector that includes a sequence of weights, each of which is from a different channel. The DMA engine may also compress the weights, e.g., by removing zero valued weights, before the conversion of the memory layout.
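The layout conversion in this abstract amounts to a transpose of each filter from channel-major sections to per-position weight vectors. The sketch below shows that step and a simple zero-removal step in software. The function names and the boolean mask used to remember which positions were zero are assumptions, not details taken from the patent.

```python
def channel_major_to_interleaved(filter_weights):
    """Convert one filter from per-channel sections to per-position vectors.

    filter_weights: list of channels, each a flat list of that channel's
    weights (all channels the same length).
    Returns a list of weight vectors; vector k holds the k-th weight of
    every channel, in channel order.
    """
    return [list(vec) for vec in zip(*filter_weights)]


def compress(weights):
    """Drop zero-valued weights; the mask records which positions were kept."""
    mask = [w != 0 for w in weights]
    return [w for w in weights if w != 0], mask
```

For a filter with two channels of three weights each, `[[1, 2, 3], [4, 5, 6]]` becomes three vectors of two weights, one weight per channel.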
-
Publication No.: US20220012578A1
Publication date: 2022-01-13
Application No.: US17484661
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventor: Kevin Brady , Martin Power , Niall Hanrahan , Alessandro Palla , Martin-Thomas Grymel , David Bernard
IPC: G06N3/063
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase utilization of neural network (NN) accelerator circuitry for shallow layers of an NN by reformatting one or more tensors. An example apparatus includes parameter determining circuitry to determine a width of a weight kernel and to determine a depth of a first tensor. The example apparatus also includes storage control circuitry to, starting at a first XY location of the first tensor, copy one or more Z values, up to the depth of the first tensor, of consecutive XY locations that overlap the width of the weight kernel and to load the one or more Z values consecutively in a first XY location of a second tensor.
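The tensor reformatting described here can be sketched in software as packing the Z values of neighbouring X positions into one deeper output position. The sketch assumes a `tensor[y][x][z]` nested-list layout, stride 1, and no padding; none of these details, nor the function name, come from the patent.

```python
def reformat_for_shallow_layer(tensor, kernel_width):
    """Pack Z values of kernel_width consecutive X positions into one location.

    tensor: nested lists indexed as tensor[y][x][z].
    Returns a tensor whose depth is kernel_width times the original depth and
    whose width shrinks to width - kernel_width + 1.
    """
    height = len(tensor)
    width = len(tensor[0])
    out_width = width - kernel_width + 1
    out = []
    for y in range(height):
        row = []
        for x in range(out_width):
            packed = []
            for dx in range(kernel_width):
                # Copy all Z values of the overlapped location, in order.
                packed.extend(tensor[y][x + dx])
            row.append(packed)
        out.append(row)
    return out
```

A shallow input (for example depth 3 for an RGB image) thus becomes a deeper tensor, which is the property the abstract ties to better accelerator utilization.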
-
Publication No.: US20240118992A1
Publication date: 2024-04-11
Application No.: US18487490
Filing date: 2023-10-16
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel , David Bernard , Martin Power , Niall Hanrahan , Kevin Brady
IPC: G06F11/36 , G06F11/277 , G06F11/30 , G06N3/04
CPC classification number: G06F11/3652 , G06F11/277 , G06F11/3075 , G06F11/3656 , G06N3/04
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to debug a hardware accelerator such as a neural network accelerator for executing Artificial Intelligence computational workloads. An example apparatus includes a core with a core input and a core output to execute executable code based on a machine-learning model to generate a data output based on a data input, and debug circuitry coupled to the core. The debug circuitry is configured to detect a breakpoint associated with the machine-learning model and compile executable code based on at least one of the machine-learning model or the breakpoint. In response to the triggering of the breakpoint, the debug circuitry is to stop the execution of the executable code and output data such as the data input, data output and the breakpoint for debugging the hardware accelerator.
-
Publication No.: US11940907B2
Publication date: 2024-03-26
Application No.: US17359217
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Martin-Thomas Grymel , David Bernard , Niall Hanrahan , Martin Power , Kevin Brady , Gary Baugh , Cormac Brick
CPC classification number: G06F12/0207 , G06F12/0292 , G06N3/10
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
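A minimal software sketch of the two compression steps named in this abstract follows. It assumes a flat list as the tensor, a sparsity map with 1 for nonzero and 0 for zero, fixed-size storage elements, and an offsets list for locating each element in the packed memory; all of these, and the function names, are illustrative assumptions rather than the patented circuitry.

```python
def make_sparsity_map(tensor):
    """One bit per data point: 1 if nonzero, 0 if zero (assumed encoding)."""
    return [1 if v != 0 else 0 for v in tensor]


def split_into_elements(tensor, element_size):
    """Divide the flat tensor into fixed-size storage elements."""
    return [tensor[i:i + element_size] for i in range(0, len(tensor), element_size)]


def compress(tensor, element_size):
    """Remove zeros from each storage element, then pack the results contiguously.

    Returns (sparsity_map, memory, offsets), where offsets[i] is the start of
    compressed element i inside memory.
    """
    smap = make_sparsity_map(tensor)
    memory, offsets = [], []
    for element in split_into_elements(tensor, element_size):
        offsets.append(len(memory))                   # element starts where the last one ended
        memory.extend(v for v in element if v != 0)   # zero removal
    return smap, memory, offsets


def decompress(smap, memory):
    """Rebuild the dense tensor from the sparsity map and the packed values."""
    values = iter(memory)
    return [next(values) if bit else 0 for bit in smap]
```

Round-tripping a tensor through `compress` and `decompress` returns the original values, which is a quick check that the sparsity map carries enough information to restore the removed zeros.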
-
Publication No.: US11789646B2
Publication date: 2023-10-17
Application No.: US17484780
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventor: Niall Hanrahan , Martin Power , Kevin Brady , Martin-Thomas Grymel , David Bernard , Gary Baugh , Cormac Brick
CPC classification number: G06F3/0656 , G06F3/0613 , G06F3/0625 , G06F3/0679 , G06F7/5443
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase data reuse for multiply and accumulate (MAC) operations. An example apparatus includes a MAC circuit to process a first context of a set of a first type of contexts stored in a first buffer and a first context of a set of a second type of contexts stored in a second buffer. The example apparatus also includes control logic circuitry to, in response to determining that there is an additional context of the second type to be processed in the set of the second type of contexts, maintain the first context of the first type in the first buffer. The control logic circuitry is also to, in response to determining that there is an additional context of the first type to be processed in the set of the first type of contexts, maintain the first context of the second type in the second buffer and iterate a pointer of the second buffer from a first position to a next position in the second buffer.
-
Publication No.: US10700782B2
Publication date: 2020-06-30
Application No.: US15869525
Filing date: 2018-01-12
Applicant: Intel Corporation
Inventor: Kevin Brady , Ruben Trejo Calle , William Anthony Rafferty , Diarmaid O'Cualain , Jelle Sels , David Relihan , Tommy Grealy , Michael O'Reilly , Keissy Guerra Perez
IPC: B60Q1/44 , B60Q1/52 , G08G1/16 , H04B10/116
Abstract: Various systems and methods for implementing an anti-collision mechanism are described herein. A system for a lead vehicle to provide a visible light communication (VLC) message to a trailing vehicle behind the lead vehicle includes a vehicle controller subsystem of the lead vehicle, to: receive from a sensor array interface, sensor data from a forward-facing sensor incorporated into the lead vehicle; determine, using a processor, from the sensor data that a hazard exists; initiate the application of brakes with a braking force; and initiate, via a light controller, a VLC message to the trailing vehicle, the VLC message including the braking force.
-
Publication No.: US10269239B2
Publication date: 2019-04-23
Application No.: US15661801
Filing date: 2017-07-27
Applicant: Intel Corporation
Inventor: Kevin Brady , Ruben Trejo Calle , Jelle Sels , Diarmaid O'Cualain , Aziz Bahri
IPC: G08G1/005
Abstract: Apparatuses, methods and storage media associated with controlling a pedestrian crossing or traffic light are disclosed herein. In embodiments, an apparatus may include a control unit to extend a duration of a pedestrian crossing state of the pedestrian crossing or traffic light in response to receipt of sensor data that convey detection of at least one commence crossing event of the pedestrian, while the pedestrian crossing or traffic light is in a pedestrian crossing state, but yet to receive sensor data that convey detection of all corresponding end of crossing event or events of the one or more pedestrians, prior to expiration of the duration of the pedestrian crossing state. The controller may extend the duration of the pedestrian crossing state until receipt of sensor data that convey receipt of all corresponding end of crossing event/events of the one or more pedestrians, or until a timeout threshold is reached.
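The extension rule in this abstract can be sketched as a small timing function. It is a simplified reading under several assumptions: times are seconds measured from the start of the pedestrian crossing state, each pedestrian has at most one commence-crossing and one end-of-crossing event, only pedestrians who commenced before the base duration expired are considered, and `timeout` is at least `base_duration`. The function and parameter names are hypothetical.

```python
import math


def crossing_state_end(base_duration, timeout, start_times, end_times):
    """Return when the pedestrian crossing state should end.

    start_times / end_times: dicts mapping a pedestrian id to the time of the
    commence-crossing / end-of-crossing event. A pedestrian with a start event
    but no end event keeps the state extended until the timeout.
    """
    latest_end = base_duration
    for pid, started in start_times.items():
        if started > base_duration:
            continue                              # did not commence in time (assumption)
        ended = end_times.get(pid, math.inf)      # no end event seen yet
        latest_end = max(latest_end, ended)
    return min(latest_end, timeout)               # never extend past the timeout
```

With a base duration of 20 s and a timeout of 45 s, a pedestrian who starts at 18 s and finishes at 27 s extends the state to 27 s, while a pedestrian whose end event never arrives holds it until the 45 s timeout.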
-