-
Publication No.: US11080611B2
Publication date: 2021-08-03
Application No.: US15853457
Application date: 2017-12-22
Applicant: Intel Corporation
Inventor: Ajit Singh, Bharat Daga, Michael Behar
Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The compute logic additionally includes a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural network data from the memory buffer, encode the neural network data via the encode unit, write encoded neural network data to a memory device coupled with the processing apparatus, write metadata for the encoded neural network data to the memory device coupled with the processing apparatus, and decode encoded neural network data via the decode unit in response to a request from the compute logic.
-
12.
Publication No.: US10762685B2
Publication date: 2020-09-01
Application No.: US16670749
Application date: 2019-10-31
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
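Illustrative note (not part of the patent record): the abstract does not say which computations are moved to the frequency domain. One common candidate is convolution, because the convolution theorem turns a circular convolution into a pointwise product of spectra. The sketch below assumes that reading; the function names `dft`, `circular_conv_direct`, and `circular_conv_freq` are hypothetical, and the naive O(n^2) transform is used only to keep the example self-contained.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); the inverse divides by n."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_conv_direct(a, b):
    """Circular convolution computed directly in the spatial/time domain."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def circular_conv_freq(a, b):
    """Same result via the frequency domain: transform, multiply pointwise, invert."""
    fa, fb = dft(a), dft(b)
    prod = [p * q for p, q in zip(fa, fb)]
    return [v.real for v in dft(prod, inverse=True)]

signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0, 0.0]
direct = circular_conv_direct(signal, kernel)
via_freq = circular_conv_freq(signal, kernel)
```

Both paths agree up to floating-point error. With a fast transform, the frequency-domain path can be cheaper for large kernels, which is one plausible reason for selecting a sub-graph to run there.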
-
13.
Publication No.: US20200143579A1
Publication date: 2020-05-07
Application No.: US16670749
Application date: 2019-10-31
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
-
14.
Publication No.: US20180293777A1
Publication date: 2018-10-11
Application No.: US15482724
Application date: 2017-04-08
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
-
Publication No.: US20250086445A1
Publication date: 2025-03-13
Application No.: US18888744
Application date: 2024-09-18
Applicant: Intel Corporation
Inventor: Ehud Cohen, Moshe Maor, Ashutosh Parkhi, Michael Behar, Yaniv Fais
Abstract: A convolutional neural network (CNN) accelerator, including: a CNN circuit for performing a multiple-layer CNN computation, wherein the multiple layers are to receive an input feature according to an input feature map (IFM) and a weight matrix per output feature, wherein an output of a first layer provides an input for a next layer; and a mapping circuit to access a three-dimensional input matrix stored as a Z-major matrix; wherein the CNN circuit is to perform an inner-product direct convolution on the Z-major matrix, wherein the direct convolution lacks a lowering operation.
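Illustrative note (not part of the patent record): a minimal sketch of direct convolution over a Z-major input, assuming "Z-major" means the channel (depth) index varies fastest in memory. Each kernel tap then becomes an inner product over a contiguous run of channel values, and no lowering step (such as im2col) rearranges the input first. The helper names, the single output feature, stride 1, and the absence of padding are all assumptions made for the example.

```python
def z_major_index(y, x, z, width, channels):
    """Flat offset of element (y, x, z) when the channel index z varies fastest."""
    return (y * width + x) * channels + z

def direct_conv_z_major(ifm, height, width, channels, weights, k):
    """Valid, stride-1 convolution for one output feature.

    ifm     : flat list of height*width*channels values in Z-major order
    weights : flat list of k*k*channels values in the same Z-major order
    """
    out_h, out_w = height - k + 1, width - k + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            acc = 0.0
            for ky in range(k):
                for kx in range(k):
                    # Contiguous channel run for this input pixel and kernel tap.
                    base = z_major_index(oy + ky, ox + kx, 0, width, channels)
                    wbase = (ky * k + kx) * channels
                    for z in range(channels):
                        acc += ifm[base + z] * weights[wbase + z]
            out[oy][ox] = acc
    return out

# 3x3 input with 2 channels, values 0..17 in Z-major order; 2x2 kernel of ones.
ifm = [float(v) for v in range(18)]
weights = [1.0] * (2 * 2 * 2)
ofm = direct_conv_z_major(ifm, 3, 3, 2, weights, 2)
```

The input is read in place through index arithmetic alone, which is the property the phrase "lacks a lowering operation" points to under this reading.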
-
Publication No.: US12131250B2
Publication date: 2024-10-29
Application No.: US15720982
Application date: 2017-09-29
Applicant: Intel Corporation
Inventor: Ehud Cohen, Moshe Maor, Ashutosh Parkhi, Michael Behar, Yaniv Fais
CPC classification number: G06N3/063, G06F16/17, G06F18/21, G06N3/045, G06N3/08, G06V10/454, G06V10/82, G06V10/955
Abstract: A convolutional neural network (CNN) accelerator, including: a CNN circuit for performing a multiple-layer CNN computation, wherein the multiple layers are to receive an input feature according to an input feature map (IFM) and a weight matrix per output feature, wherein an output of a first layer provides an input for a next layer; and a mapping circuit to access a three-dimensional input matrix stored as a Z-major matrix; wherein the CNN circuit is to perform an inner-product direct convolution on the Z-major matrix, wherein the direct convolution lacks a lowering operation.
-
Publication No.: US20240112033A1
Publication date: 2024-04-04
Application No.: US18514069
Application date: 2023-11-20
Applicant: Intel Corporation
Inventor: Amit Bleiweiss, Itamar Ben-Ari, Michael Behar, Guy Jacob, Gal Leibovich, Jacob Subag, Lev Faivishevsky, Yaniv Fais, Tomer Schwartz
CPC classification number: G06N3/082, G06F8/52, G06F9/44552, G06N3/04, G06N3/105, G06N5/04, G06N3/084
Abstract: In an example, an apparatus comprises at least one execution platform; and logic, at least partially including hardware logic, to receive a trained neural network model in a model optimizer and convert the trained neural network model to an optimized model comprising parameters that are fit to the at least one execution platform. Other embodiments are also disclosed and claimed.
-
Publication No.: US20230394305A1
Publication date: 2023-12-07
Application No.: US18325744
Application date: 2023-05-30
Applicant: Intel Corporation
Inventor: Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Jeremie Dreyfuss, Amit Bleiweiss, Tomer Schwartz, Raanan Yonatan Yehezkel Rohekar, Michael Behar, Amitai Armon, Uzi Sarel
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to receive a plurality of data inputs for training a neural network, wherein the data inputs comprise training data and weight inputs; represent the data inputs in a first form; and represent the weight inputs in a second form. Other embodiments are also disclosed and claimed.
-
Publication No.: US20230333913A1
Publication date: 2023-10-19
Application No.: US18309650
Application date: 2023-04-28
Applicant: Intel Corporation
Inventor: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam
IPC: G06F9/50, G06F16/901, G06N3/044, G06N3/045
CPC classification number: G06F9/5083, G06F16/9024, G06N3/044, G06N3/045
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to configure heterogenous components in an accelerator. An example apparatus includes a graph compiler to identify a workload node in a workload and generate a selector for the workload node, and the selector to identify an input condition and an output condition of a compute building block, wherein the graph compiler is to, in response to obtaining the identified input condition and output condition from the selector, map the workload node to the compute building block.
-
20.
Publication No.: US20210049804A1
Publication date: 2021-02-18
Application No.: US17006253
Application date: 2020-08-28
Applicant: Intel Corporation
Inventor: Uzi Sarel, Ehud Cohen, Tomer Schwartz, Amitai Armon, Yahav Shadmiy, Itamar Ben-Ari, Amit Bleiweiss, Lev Faivishevsky, Tomer Bar-On, Yaniv Fais, Jacob Subag, Michael Behar, Guy Jacob, Gal Leibovich, Jeremie Dreyfuss
Abstract: In an example, an apparatus comprises a plurality of execution units; and logic, at least partially including hardware logic, to determine a sub-graph of a network that can be executed in a frequency domain and apply computations in the sub-graph in the frequency domain. Other embodiments are also disclosed and claimed.
-