-
Publication No.: US12288141B2
Publication date: 2025-04-29
Application No.: US17358654
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Yamini Nimmagadda , Mustafa Cavus , Surya Siddharth Pemmaraju , Srinivasa Manohar Karlapalem
IPC: G06N20/00 , G06F12/0811 , G06N5/04
Abstract: Systems, apparatuses and methods provide technology for model generation with intermediate stage caching and re-use, including generating, via a model pipeline, a multi-level set of intermediate stages for a model, caching each of the set of intermediate stages, and responsive to a change in the model pipeline, regenerating an executable for the model using a first one of the cached intermediate stages to bypass regeneration of at least one of the intermediate stages. The multi-level set of intermediate stages can correspond to a hierarchy of processing stages in the model pipeline, where using the first one of the cached intermediate stages results in bypassing regeneration of a corresponding intermediate stage and of all intermediate stages preceding the corresponding intermediate stage in the hierarchy. Further, regenerating an executable for the model can include regenerating one or more intermediate stages following the corresponding intermediate stage in the hierarchy.
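Illustrative sketch (not taken from the patent text): one way to picture the stage re-use described in this abstract is a cache keyed by the configuration of every stage up to a given level. After a change, the build resumes from the deepest cached stage whose key still matches and regenerates only the stages after it. The names `stage_key` and `build`, and the use of SHA-256 over stage settings, are assumptions made for this example.

```python
import hashlib


def stage_key(stage_names, configs, source, upto):
    """Hash the source plus the config of every stage up to index `upto`."""
    h = hashlib.sha256()
    h.update(repr(source).encode())
    for name in stage_names[: upto + 1]:
        h.update(name.encode())
        h.update(repr(configs[name]).encode())
    return h.hexdigest()


def build(stage_names, stage_fns, configs, source, cache):
    """Run the pipeline, resuming after the deepest cached stage still valid.

    Returns (final_output, names_of_stages_actually_rerun).
    """
    start, value = 0, source
    # Search from the last stage backwards for a reusable cached result.
    for i in reversed(range(len(stage_names))):
        key = stage_key(stage_names, configs, source, i)
        if key in cache:
            start, value = i + 1, cache[key]
            break
    rerun = []
    # Regenerate only the stages that follow the reused one, caching each.
    for i in range(start, len(stage_names)):
        name = stage_names[i]
        value = stage_fns[name](value, configs[name])
        cache[stage_key(stage_names, configs, source, i)] = value
        rerun.append(name)
    return value, rerun
```

In this sketch, changing only the settings of the last stage leaves the keys of all earlier stages intact, so a second call to `build` with the same `cache` reruns the last stage alone.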
-
Publication No.: US12086290B2
Publication date: 2024-09-10
Application No.: US17406755
Filing date: 2021-08-19
Applicant: Intel Corporation
Inventor: Yamini Nimmagadda , Akhila Vidiyala , Suryaprakash Shanmugam
CPC classification number: G06F21/64 , G06F8/41 , G06N5/04 , H04L9/3236 , H04L9/3247
Abstract: Systems, apparatuses and methods include technology that generates a signature based on one or more characteristics of an artificial intelligence (AI) model. The AI model is in a source code. The technology generates a compiled blob based on the AI model and embeds an identifier based on the signature into a metadata field of the compiled blob.
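Illustrative sketch (not taken from the patent text): the signature-and-embed flow in this abstract could look roughly like the following. The characteristics dictionary, the `model_id` metadata key, the 16-character truncation, and the dictionary standing in for a compiled blob are all assumptions made for this example. No real compiler is invoked.

```python
import hashlib
import json


def model_signature(characteristics: dict) -> str:
    """Derive a signature from model characteristics (canonical JSON, SHA-256)."""
    canonical = json.dumps(characteristics, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


def compile_with_identifier(model_source: str, characteristics: dict) -> dict:
    """Produce a stand-in 'compiled blob' with an identifier in its metadata."""
    payload = model_source.encode()  # placeholder for real compilation output
    identifier = model_signature(characteristics)[:16]
    return {"metadata": {"model_id": identifier}, "payload": payload}


def blob_matches_model(blob: dict, characteristics: dict) -> bool:
    """Check whether a blob's embedded identifier matches the given model."""
    return blob["metadata"]["model_id"] == model_signature(characteristics)[:16]
```

Sorting the keys before hashing keeps the signature stable when the same characteristics are supplied in a different order.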
-
Publication No.: US11640326B2
Publication date: 2023-05-02
Application No.: US17392917
Filing date: 2021-08-03
Applicant: Intel Corporation
Inventor: N Maajid Khan , Yamini Nimmagadda , Surya Siddharth Pemmaraju
Abstract: Systems, apparatuses and methods may provide for technology that identifies telemetry data associated with an execution of a cluster of artificial intelligence (AI) operations on an accelerated backend system, wherein the telemetry data includes one or more of temperature classifier data, compute classifier data or failure data, and determines whether to send a current instance of the cluster of AI operations to the accelerated backend system or a default backend system based on the telemetry data.
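Illustrative sketch (not taken from the patent text): a routing decision of the kind this abstract describes might be reduced to a few threshold checks. The telemetry keys (`temperature_c`, `utilization`, `failures`) and the threshold values are hypothetical. The abstract speaks of classifier data without specifying how it is evaluated.

```python
def choose_backend(telemetry, temp_limit_c=90.0, util_limit=0.95):
    """Return "accelerated" or "default" for the next run of a cluster.

    `telemetry` is a dict with hypothetical keys 'temperature_c',
    'utilization' (0.0 to 1.0) and 'failures' (count of recent failures).
    """
    if telemetry.get("failures", 0) > 0:
        return "default"  # recent failure on the accelerator
    if telemetry.get("temperature_c", 0.0) >= temp_limit_c:
        return "default"  # accelerator is too hot
    if telemetry.get("utilization", 0.0) >= util_limit:
        return "default"  # accelerator is saturated
    return "accelerated"
```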
-
Publication No.: US20210319298A1
Publication date: 2021-10-14
Application No.: US17357340
Filing date: 2021-06-24
Applicant: Intel Corporation
Abstract: Systems, apparatuses and methods provide technology for efficient subgraph partitioning, including generating a first set of subgraphs based on supported nodes of a model graph, wherein the supported nodes have operators that are supported by a hardware backend device, evaluating a compute efficiency of each subgraph of the first set of subgraphs with respect to the hardware backend device and to a default CPU associated with a default runtime, and selecting, from the first set of subgraphs, a second set of subgraphs to be run on the hardware backend device based on the evaluated compute efficiency. The technology can include calculating a backend performance factor for each subgraph for the hardware backend device, calculating a default performance factor for each subgraph for the default CPU, and comparing, for each respective subgraph of the first set of subgraphs, the backend performance factor and the default performance factor.
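Illustrative sketch (not taken from the patent text): the comparison of performance factors can be pictured with a toy cost model in which the backend is faster per operation but pays a fixed transfer cost. The formulas, rates, and function names below are assumptions for this example. The abstract does not define how the factors are calculated.

```python
def backend_factor(ops, backend_rate=10.0, transfer_cost=5.0):
    """Effective throughput on the backend, including a fixed transfer cost."""
    return ops / (ops / backend_rate + transfer_cost)


def default_factor(ops, cpu_rate=2.0):
    """Effective throughput on the default CPU (no transfer cost)."""
    return cpu_rate


def select_for_backend(subgraph_ops):
    """Pick the subgraphs whose backend factor beats the default factor.

    `subgraph_ops` maps a subgraph name to its (positive) operation count.
    """
    return [
        name
        for name, ops in subgraph_ops.items()
        if backend_factor(ops) > default_factor(ops)
    ]
```

Under these assumed numbers a 4-operation subgraph stays on the CPU because the transfer cost dominates, while a 100-operation subgraph is offloaded.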
-
Publication No.: US12242973B2
Publication date: 2025-03-04
Application No.: US17402114
Filing date: 2021-08-13
Applicant: Intel Corporation
Inventor: Chandrakant Khandelwal , Ritesh Kumar Rajore , Laxmi Ganesan , Sai Jayanthi , Yamini Nimmagadda
IPC: G06N3/10 , G06F8/41 , G06F9/445 , G06F16/901
Abstract: Systems, apparatuses and methods may provide for technology that parses, at runtime, a deep learning graph in topological order to identify a plurality of nodes, marks a first set of nodes in the plurality of nodes as unsupported by target hardware, and marks a second set of nodes in the plurality of nodes as supported by the target hardware, wherein the first set of nodes and the second set of nodes are marked based on one or more attributes defining operation functionality, and wherein the one or more attributes include one or more of an input node parameter, a dimension, or a shape.
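Illustrative sketch (not taken from the patent text): walking a graph in topological order and marking each node from its attributes could be written as follows, using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The support rule (a fixed operator set and a maximum tensor rank of 4) and the attribute keys `op` and `shape` are hypothetical.

```python
from graphlib import TopologicalSorter

SUPPORTED_OPS = {"Conv", "Relu"}  # hypothetical operator set for the target


def is_supported(attrs):
    """Hypothetical rule: known operator and tensor rank of at most 4."""
    return attrs["op"] in SUPPORTED_OPS and len(attrs["shape"]) <= 4


def mark_nodes(graph, attrs):
    """Visit nodes in topological order and split them by hardware support.

    `graph` maps each node to the set of its predecessors.
    Returns (supported_nodes, unsupported_nodes), each in visit order.
    """
    supported, unsupported = [], []
    for node in TopologicalSorter(graph).static_order():
        (supported if is_supported(attrs[node]) else unsupported).append(node)
    return supported, unsupported
```

The rule checks shape as well as operator type, so the same operator can be supported for one node and unsupported for another.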
-
Publication No.: US12182616B2
Publication date: 2024-12-31
Application No.: US17484099
Filing date: 2021-09-24
Applicant: Intel Corporation
Inventor: Susanne M. Balle , Yamini Nimmagadda , Olugbemisola Oniyinde
Abstract: A platform health engine for autonomous self-healing in platforms served by an Infrastructure Processing Unit (IPU), including: an analysis processor configured to apply analytics to telemetry data received from a telemetry agent of a monitored platform managed by the IPU, and to generate relevant platform health data; a prediction processor configured to predict, based on the relevant platform health data, a future health status of the monitored platform; and a dispatch processor configured to dispatch a workload of the monitored platform to another platform managed by the IPU if the predicted future health status of the monitored platform is failure.
-
Publication No.: US12106154B2
Publication date: 2024-10-01
Application No.: US17406711
Filing date: 2021-08-19
Applicant: Intel Corporation
Inventor: Yamini Nimmagadda , Akhila Vidiyala , Suryaprakash Shanmugam , Divya Prakash
CPC classification number: G06F9/505 , G06F9/5016 , G06F9/5072 , G06N3/08
Abstract: Systems, apparatuses and methods include technology that analyzes an input stream and an artificial intelligence (AI) model graph to generate a workload characterization. The workload characterization characterizes one or more of compute resources or memory resources, and the one or more of the compute resources or the memory resources is associated with execution of the AI model graph based on the input stream. The technology partitions the AI model graph into subgraphs based on the workload characterization. The technology selects a plurality of hardware devices to execute the subgraphs.
-
Publication No.: US20210390460A1
Publication date: 2021-12-16
Application No.: US17459141
Filing date: 2021-08-27
Applicant: Intel Corporation
Inventor: Yamini Nimmagadda , Suryaprakash Shanmugam , Akhila Vidiyala , Divya Prakash
Abstract: Systems, apparatuses and methods include technology that converts an artificial intelligence (AI) model graph into an intermediate representation. The technology partitions the intermediate representation of the AI model graph into a plurality of subgraphs based on computations associated with the AI model graph, each subgraph being associated with one or more memory resources and one or more of a plurality of hardware devices. The technology determines whether to readjust the plurality of subgraphs based on the memory resources associated with the plurality of subgraphs and memory capacities of the plurality of hardware devices.
-
Publication No.: US20210318908A1
Publication date: 2021-10-14
Application No.: US17358751
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Mustafa Cavus , Yamini Nimmagadda
IPC: G06F9/48 , G06F16/901 , G06N3/08 , G06N3/04
Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.
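Illustrative sketch (not taken from the patent text): the clustering, dependency, and scheduling steps in this abstract can be approximated as below. As a simplification, clusters are formed from consecutive nodes of the same kind in a topological order, which keeps the cluster dependencies acyclic. Treating "scheduling based on batch size" as one run per batch for batched clusters and one run per sample for non-batched clusters is an assumption, as are all function names.

```python
from graphlib import TopologicalSorter


def make_clusters(order, is_batched):
    """Group consecutive nodes of the same kind (batched or not) into clusters."""
    clusters = []
    for node in order:
        kind = is_batched[node]
        if clusters and clusters[-1]["batched"] == kind:
            clusters[-1]["nodes"].append(node)
        else:
            clusters.append({"batched": kind, "nodes": [node]})
    return clusters


def cluster_dependencies(clusters, preds):
    """Map each cluster index to the cluster indices it depends on."""
    owner = {n: i for i, c in enumerate(clusters) for n in c["nodes"]}
    deps = {i: set() for i in range(len(clusters))}
    for node, node_preds in preds.items():
        for p in node_preds:
            if owner[p] != owner[node]:
                deps[owner[node]].add(owner[p])
    return deps


def schedule(clusters, deps, batch_size):
    """Return (cluster_index, number_of_runs) pairs in dependency order."""
    plan = []
    for i in TopologicalSorter(deps).static_order():
        runs = 1 if clusters[i]["batched"] else batch_size
        plan.append((i, runs))
    return plan
```

Here `preds` maps each node to the set of its predecessors, and `order` can be obtained from `TopologicalSorter(preds).static_order()`.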
-
Publication No.: US11941437B2
Publication date: 2024-03-26
Application No.: US17358751
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Mustafa Cavus , Yamini Nimmagadda
IPC: G06F9/50 , G06F9/48 , G06F16/901 , G06N3/04 , G06N3/08
CPC classification number: G06F9/4881 , G06F9/5038 , G06F16/9024 , G06N3/04 , G06N3/08
Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.