OPTIMIZING CONCURRENT EXECUTION USING NETWORKED PROCESSING UNITS

    Publication No.: US20230136612A1

    Publication Date: 2023-05-04

    Application No.: US18090749

    Filing Date: 2022-12-29

    IPC Classification: G06F9/48 G06F9/54

    Abstract: Various approaches for managing distributed compute operations for workload execution of concurrent tasks, including with the use of infrastructure processing units (IPUs) and similar networked processing units, are disclosed. An example method may include: identifying multiple tasks of a computing workload, where the workload provides processing dependencies among the tasks and uses concurrent execution with one or more of the tasks; monitoring an execution time for each of the tasks, relative to an execution time threshold for each of the tasks; identifying the execution time of a particular task as exceeding the execution time threshold for the particular task; determining a remediation based on the particular task and the identified execution time, with the remediation including use of other compute resources in the distributed computing environment for the workload; and applying the remediation to increase speed of execution of the workload.
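
The monitoring and remediation steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `TaskTiming` record, the function names, the node names, and the simple "offload to the first spare node" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskTiming:
    name: str
    elapsed_s: float    # observed execution time, in seconds
    threshold_s: float  # per-task execution time threshold, in seconds


def find_slow_tasks(timings):
    """Return the tasks whose observed time exceeds their own threshold."""
    return [t for t in timings if t.elapsed_s > t.threshold_s]


def choose_remediation(task, spare_nodes):
    """Hypothetical policy: offload the slow task to the first spare node."""
    if not spare_nodes:
        return None  # no other compute resources available
    return {"task": task.name, "action": "offload", "target": spare_nodes[0]}


timings = [
    TaskTiming("decode", elapsed_s=0.8, threshold_s=1.0),
    TaskTiming("infer", elapsed_s=2.5, threshold_s=1.5),
]
slow = find_slow_tasks(timings)
plans = [choose_remediation(t, ["ipu-2", "ipu-3"]) for t in slow]
```

Here only `infer` exceeds its threshold, so it is the only task that receives a remediation plan.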

MANAGEMENT OF WORKLOAD PROCESSING USING DISTRIBUTED NETWORKED PROCESSING UNITS

    Publication No.: US20230135645A1

    Publication Date: 2023-05-04

    Application No.: US18090764

    Filing Date: 2022-12-29

    IPC Classification: G06F9/50

    Abstract: Various approaches for deploying and controlling distributed compute operations with the use of infrastructure processing units (IPUs) and similar networked processing units are disclosed. A system that includes a networked processing unit may perform workload processing with operations that: receive, from another networked processing unit, workload information for a workload having respective tasks to be processed among distributed computing entities; perform an analysis of network conditions for a predicted execution of the workload, based on the workload information, to analyze network availability among the distributed computing entities; perform an analysis of compute conditions for the predicted execution of the workload, based on the workload information, to analyze processing availability among the distributed computing entities; and identify locations of the distributed computing entities to deploy the workload, based on the analysis of network conditions and the analysis of compute conditions.
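
One way to picture the placement decision in this abstract is as a filter over candidate entities followed by a ranking. The sketch below is only an assumption-laden illustration: the field names (`bandwidth_mbps`, `free_cores`), the node names, the thresholds, and the ranking rule are hypothetical and do not come from the patent.

```python
def select_locations(candidates, needed, min_bandwidth_mbps, min_free_cores):
    """Pick up to `needed` entities that pass both the network and compute checks."""
    eligible = [
        c for c in candidates
        if c["bandwidth_mbps"] >= min_bandwidth_mbps  # network availability
        and c["free_cores"] >= min_free_cores         # processing availability
    ]
    # Hypothetical ranking: most free cores first, then highest bandwidth.
    eligible.sort(key=lambda c: (-c["free_cores"], -c["bandwidth_mbps"]))
    return [c["node"] for c in eligible[:needed]]


candidates = [
    {"node": "edge-a", "bandwidth_mbps": 900, "free_cores": 4},
    {"node": "edge-b", "bandwidth_mbps": 100, "free_cores": 16},
    {"node": "edge-c", "bandwidth_mbps": 500, "free_cores": 8},
    {"node": "edge-d", "bandwidth_mbps": 700, "free_cores": 1},
]
locations = select_locations(candidates, needed=2,
                             min_bandwidth_mbps=400, min_free_cores=2)
```

In this example `edge-b` fails the network check and `edge-d` fails the compute check, so the workload would be placed on `edge-c` and `edge-a`.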

MECHANISM FOR SECURE AND RESILIENT CONFIGURATION UPGRADES

    Publication No.: US20220012042A1

    Publication Date: 2022-01-13

    Application No.: US17484455

    Filing Date: 2021-09-24

    Abstract: Various systems and methods for providing secure and resilient configuration upgrades are described herein. A system includes a processor and memory to store instructions which, when executed by the processor, cause the system to: receive, at a resilient security island (RSI) partition of a first network node, an update from a source, the first network node hosting the RSI partition and a host partition, the RSI comprising reserved hardware resources of the first network node; verify, by the RSI, provenance of the update; apply, by the RSI, the update to modify a configuration of the RSI or the host partition; test, by the RSI, the modified configuration of the RSI or the host partition; and provide a cryptographic proof that the test was completed and an update status to an update coordinator.
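
The verify, apply, test, and report sequence in this abstract can be sketched in Python as follows. This is a simplified illustration under several assumptions: HMAC with shared demo keys stands in for whatever provenance and proof mechanism the patent actually uses, the key names and `process_update` function are hypothetical, and keeping the previous configuration when the test fails is an assumed behavior rather than something the abstract states.

```python
import hashlib
import hmac
import json

SOURCE_KEY = b"demo-source-key"  # hypothetical secret shared with the update source
RSI_KEY = b"demo-rsi-key"        # hypothetical key held only by the RSI


def sign(key, payload):
    """Return a hex HMAC-SHA256 tag over the payload bytes."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def process_update(config, update, tag, self_test):
    """Verify provenance, apply the update, test it, and build a status report."""
    payload = json.dumps(update, sort_keys=True).encode()
    if not hmac.compare_digest(sign(SOURCE_KEY, payload), tag):
        return config, {"status": "rejected", "proof": None}

    candidate = {**config, **update}  # apply the update to a copy
    passed = self_test(candidate)     # test the modified configuration
    status = "applied" if passed else "test_failed"

    # Stand-in for the cryptographic proof sent to the update coordinator.
    report = json.dumps({"status": status, "update": update}, sort_keys=True)
    proof = sign(RSI_KEY, report.encode())
    # Assumption: keep the old configuration when the test fails.
    return (candidate if passed else config), {"status": status, "proof": proof}


update = {"mtu": 9000}
tag = sign(SOURCE_KEY, json.dumps(update, sort_keys=True).encode())
new_config, result = process_update({"mtu": 1500}, update, tag,
                                    self_test=lambda c: c["mtu"] <= 9000)
```

A real deployment would more plausibly rely on asymmetric signatures and hardware-backed attestation; the symmetric keys here only keep the example self-contained.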

DYNAMIC PARALLEL PROCESSING IN AN EDGE COMPUTING SYSTEM

    Publication No.: US20240126606A1

    Publication Date: 2024-04-18

    Application No.: US18397807

    Filing Date: 2023-12-27

    IPC Classification: G06F9/50

    CPC Classification: G06F9/5027

    Abstract: Data that is to be processed by a particular service executed by a first edge computing device in an application is analyzed to determine characteristics of the data. An opportunity to replicate the particular service on a plurality of edge computing devices is determined based on the characteristics of the data. A second edge computing device is determined to be available to execute a replicated instance of the particular service. Replication of the particular service is initiated on a plurality of edge computing devices including the second edge computing device. An output of an instance of the particular service executed on the first edge computing device and an output of the replicated instance of the particular service executed on the second edge computing device are combined to form a single output for the particular service.
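
The replicate-and-combine pattern in this abstract resembles a split, parallel map, and merge. The Python sketch below illustrates that pattern only: worker threads stand in for edge computing devices, the `service` function is a placeholder, and the size-based `can_replicate` check is an assumed example of a data characteristic, not the patent's criterion.

```python
from concurrent.futures import ThreadPoolExecutor


def service(chunk):
    """Placeholder for the particular service: squares each value."""
    return [x * x for x in chunk]


def can_replicate(data, min_items=4):
    """Assumed data characteristic: enough items to be worth splitting."""
    return len(data) >= min_items


def split(data, parts):
    """Split data into at most `parts` contiguous chunks."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_service(data, replicas=2):
    """Run the service on one instance, or on replicas whose outputs are merged."""
    if not can_replicate(data):
        return service(data)  # single instance on the first device
    chunks = split(data, replicas)
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        outputs = list(pool.map(service, chunks))  # map preserves chunk order
    # Combine the per-replica outputs into a single output.
    return [y for out in outputs for y in out]


combined = run_service([1, 2, 3, 4, 5], replicas=2)
```

Because `pool.map` returns results in input order, the combined output matches what a single instance would have produced for the same data.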