-
1.
Publication No.: US20240127031A1
Publication Date: 2024-04-18
Application No.: US18394307
Filing Date: 2023-12-22
Applicant: Intel Corporation
Inventor: Hamza Yous , Ian Hunter , Alessandro Palla
Abstract: A graph neural network (GNN) model is used in a scheduling process for compiling a deep neural network (DNN). The DNN, and parameter options for scheduling the DNN, are represented as a graph, and the GNN predicts a set of parameters that is expected to have a low cost. Using the GNN-based model, a compiler can produce a schedule for compiling the DNN in a relatively short and predictable amount of time, even for DNNs with many layers and/or many parameter options. For example, the GNN-based model reduces the overhead of exploring every parameter combination and, unlike prior heuristic-based approaches, does not exclude combinations from consideration.
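A minimal sketch of the idea behind this abstract, assuming a plain message-passing network over a toy layer graph; the layer features, candidate tile parameters, and (untrained) weights below are invented placeholders, not Intel's model.

```python
# Hypothetical sketch of a GNN-style cost predictor for schedule parameters.
import numpy as np

rng = np.random.default_rng(0)

# Toy DNN graph: 4 layers in a chain, each with an 8-dim feature vector
# (standing in for encoded layer type, tensor shapes, etc.).
num_layers = 4
features = rng.normal(size=(num_layers, 8))
adjacency = np.eye(num_layers)
for i in range(num_layers - 1):                     # layer i feeds layer i + 1
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

# Candidate scheduling parameters shared by all layers (e.g. tile sizes).
candidates = [(1, 4), (2, 2), (4, 1)]

# Untrained random weights stand in for a trained GNN-based cost model.
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(10, 1)) * 0.1

def predicted_cost(params):
    """One round of message passing, then a scalar cost summed over layers."""
    degree = adjacency.sum(axis=1, keepdims=True)
    hidden = np.tanh(((adjacency @ features) / degree) @ w1)  # aggregate neighbours
    tiled = np.repeat(np.array(params, dtype=float)[None, :], num_layers, axis=0)
    per_layer = np.maximum(np.concatenate([hidden, tiled], axis=1) @ w2, 0.0)
    return float(per_layer.sum())

# The compiler-side search: pick the candidate with the lowest predicted cost
# instead of measuring every parameter combination.
best = min(candidates, key=predicted_cost)
print("selected schedule parameters:", best)
```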
-
2.
Publication No.: US20220108135A1
Publication Date: 2022-04-07
Application No.: US17554970
Filing Date: 2021-12-17
Applicant: Intel Corporation
Inventor: Kevin Brady , Martin Power , Martin-Thomas Grymel , Alessandro Palla , David Bernard , Niall Hanrahan
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for performing a machine learning operation using storage element pointers. An example computer readable medium comprises instructions that, when executed, cause at least one processor to, in response to a determination that a machine learning operation is to be performed, create first and second storage element pointers based on a type of machine learning operation to be performed, remap input tensor data of an input tensor based on the first storage element pointer without movement of the input tensor data in memory, cause execution of the machine learning operation with the remapped input tensor data to create intermediate tensor data, remap the intermediate tensor data based on the second storage element pointer without movement of the intermediate tensor data in memory, and provide the remapped intermediate tensor data as an output tensor.
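A rough illustration of the zero-copy remapping idea, using numpy views as stand-ins for the storage element pointers; the layouts and the toy operation are assumptions, not the patented mechanism.

```python
# Hypothetical illustration: numpy views play the role of "storage element
# pointers" -- they re-describe the same buffer with a new layout, so no
# tensor data moves in memory.
import numpy as np

def remap(tensor, shape):
    """Return a zero-copy view of `tensor` with a new logical shape."""
    view = tensor.reshape(shape)       # reshape of a contiguous array is a view
    assert view.base is tensor or view.base is tensor.base  # no data movement
    return view

# Input tensor stored as NHWC; the toy operation below wants (N, H*W, C).
input_tensor = np.arange(2 * 4 * 4 * 8, dtype=np.float32).reshape(2, 4, 4, 8)

# First "pointer": remap the input for the machine learning operation.
remapped_input = remap(input_tensor, (2, 16, 8))

# The machine learning operation itself (a stand-in per-channel scale).
intermediate = remapped_input * np.float32(0.5)

# Second "pointer": remap the intermediate data into the output layout.
output_tensor = remap(intermediate, (2, 4, 4, 8))
print(output_tensor.shape, output_tensor.base is intermediate)
```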
-
3.
Publication No.: US20220391710A1
Publication Date: 2022-12-08
Application No.: US17820593
Filing Date: 2022-08-18
Applicant: Intel Corporation
Inventor: Alessandro Palla , Ian Frederick Hunter , Richard Richmond , Cormac Brick , Sebastian Eusebiu Nagy
Abstract: Systems, apparatuses and methods may provide for technology that determines a complexity of a task associated with a neural network workload and generates a hardware efficiency estimate for the task, wherein the hardware efficiency estimate is generated via a neural network based cost model if the complexity exceeds a threshold, and wherein the hardware efficiency estimate is generated via a cost function if the complexity does not exceed the threshold. In one example, the technology trains the neural network based cost model based on one or more of hardware profile data or register-transfer level (RTL) data.
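A small sketch of the complexity-based dispatch described here; the threshold, the cost formulas, and the stand-in for the trained cost model are all assumptions.

```python
# Hypothetical sketch: route a task either to a learned cost model or to a
# closed-form cost function, depending on its complexity.
from dataclasses import dataclass

@dataclass
class Task:
    macs: int          # multiply-accumulate count for the task
    num_params: int    # number of tunable parameters

COMPLEXITY_THRESHOLD = 1_000_000  # assumed cut-off, not from the patent

def complexity(task: Task) -> int:
    return task.macs * max(task.num_params, 1)

def analytical_cost(task: Task) -> float:
    """Cheap closed-form estimate used for simple tasks."""
    return task.macs / 512.0          # e.g. MACs per cycle on the accelerator

def learned_cost(task: Task) -> float:
    """Stand-in for a neural network based cost model trained on profile/RTL data."""
    return task.macs / 512.0 * 1.3    # placeholder prediction

def hardware_efficiency_estimate(task: Task) -> float:
    if complexity(task) > COMPLEXITY_THRESHOLD:
        return learned_cost(task)     # complex task: use the NN-based model
    return analytical_cost(task)      # simple task: use the cost function

print(hardware_efficiency_estimate(Task(macs=2_000_000, num_params=4)))
print(hardware_efficiency_estimate(Task(macs=10_000, num_params=2)))
```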
-
4.
Publication No.: US20230004430A1
Publication Date: 2023-01-05
Application No.: US17856968
Filing Date: 2022-07-02
Applicant: Intel Corporation
Inventor: Richard Richmond , Eric Luk , Lingdan Zeng , Lance Hacking , Alessandro Palla , Mohamed Elmalaki , Sara Almalih
Abstract: Technology for estimating neural network (NN) power profiles includes obtaining a plurality of workloads for a compiled NN model, the plurality of workloads determined for a hardware execution device, determining a hardware efficiency factor for the compiled NN model, and generating, based on the hardware efficiency factor, a power profile for the compiled NN model on one or more of a per-layer basis or a per-workload basis. The hardware efficiency factor can be determined based on a hardware efficiency measurement and a hardware utilization measurement, and can be determined on a per-workload basis. A configuration file can be provided for generating the power profile, and an output visualization of the power profile can be generated. Further, feedback information can be generated to perform one or more of selecting a hardware device, optimizing a breakdown of workloads, optimizing a scheduling of tasks, or confirming a hardware device design.
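A simplified sketch of how an efficiency factor might scale a per-workload power profile; the peak-power figure and the way efficiency and utilization are combined are assumptions, not the claimed method.

```python
# Hypothetical sketch: derive a per-workload (and hence per-layer) power
# profile from a hardware efficiency factor. All numbers are invented.
from dataclasses import dataclass

@dataclass
class Workload:
    layer: str
    efficiency: float    # measured hardware efficiency (0..1)
    utilization: float   # measured hardware utilization (0..1)
    duration_ms: float

PEAK_POWER_W = 5.0       # assumed accelerator peak power

def efficiency_factor(w: Workload) -> float:
    # Per-workload factor combining the efficiency and utilization measurements.
    return w.efficiency * w.utilization

def power_profile(workloads):
    # Power estimate per workload, scaled by its efficiency factor.
    return [(w.layer, round(PEAK_POWER_W * efficiency_factor(w), 2), w.duration_ms)
            for w in workloads]

workloads = [
    Workload("conv1", efficiency=0.82, utilization=0.91, duration_ms=1.4),
    Workload("conv2", efficiency=0.76, utilization=0.64, duration_ms=2.1),
    Workload("fc",    efficiency=0.55, utilization=0.40, duration_ms=0.3),
]
for layer, watts, ms in power_profile(workloads):
    print(f"{layer:>5}: {watts} W for {ms} ms")
```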
-
5.
Publication No.: US20220012578A1
Publication Date: 2022-01-13
Application No.: US17484661
Filing Date: 2021-09-24
Applicant: Intel Corporation
Inventor: Kevin Brady , Martin Power , Niall Hanrahan , Alessandro Palla , Martin-Thomas Grymel , David Bernard
IPC: G06N3/063
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that increase utilization of neural network (NN) accelerator circuitry for shallow layers of an NN by reformatting one or more tensors. An example apparatus includes parameter determining circuitry to determine a width of a weight kernel and to determine a depth of a first tensor. The example apparatus also includes storage control circuitry to, starting at a first XY location of the first tensor, copy one or more Z values, up to the depth of the first tensor, of consecutive XY locations that overlap the width of the weight kernel and to load the one or more Z values consecutively in a first XY location of a second tensor.
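A hypothetical sketch of the tensor reformat the abstract describes, stacking the Z values of kernel-width-many consecutive XY locations into one XY location of a deeper second tensor; the shapes and the simple X-major loop are assumptions.

```python
# Hypothetical sketch: deepen a shallow tensor so each output XY location
# carries the Z values of all XY locations overlapped by the kernel width.
import numpy as np

def reformat_shallow(tensor: np.ndarray, kernel_width: int) -> np.ndarray:
    """tensor is (X, Y, Z); returns (X - kernel_width + 1, Y, Z * kernel_width)."""
    x, y, z = tensor.shape                      # "parameter determining" step
    out_x = x - kernel_width + 1
    out = np.empty((out_x, y, z * kernel_width), dtype=tensor.dtype)
    for xo in range(out_x):
        # Copy the Z values of the consecutive XY locations that overlap the
        # kernel width and load them consecutively at one XY location of the
        # second tensor.
        out[xo] = np.concatenate([tensor[xo + k] for k in range(kernel_width)],
                                 axis=-1)
    return out

first = np.arange(6 * 4 * 3).reshape(6, 4, 3)   # shallow first tensor, Z = 3
second = reformat_shallow(first, kernel_width=3)
print(first.shape, "->", second.shape)          # (6, 4, 3) -> (4, 4, 9)
```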
-