SERVICE-LEVEL OBJECTIVE (SLO) AWARE EXECUTION OF CONCURRENCE INFERENCE REQUESTS ON A FOG-CLOUD NETWORK

    Publication No.: EP4398102A1

    Publication date: 2024-07-10

    Application No.: EP23217957.2

    Filing date: 2023-12-19

    IPC classification: G06F9/50

    Abstract: Cloud and fog computing are complementary technologies used for complex Internet of Things (IoT) based application deployments. With the increase in the number of internet-connected devices, the volume of data generated and processed at high speed has increased substantially. Serving large volumes of data and workloads for real-time predictive decisions using fog computing without Service-Level Objective (SLO) violations is a challenge. The present disclosure provides systems and methods for inference management in which a suitable execution workflow is automatically generated to execute machine learning (ML)/deep learning (DL) inference requests using fog together with various types of instances (e.g., Function-as-a-Service (FaaS) instances, Machine Learning-as-a-Service (MLaaS) instances, and the like) provided by cloud vendors/platforms. The generated workflow minimizes the cost of deployment as well as SLO violations.

    TECHNIQUE FOR DETERMINING A LOAD OF AN APPLICATION

    Publication No.: EP3408742A1

    Publication date: 2018-12-05

    Application No.: EP16702504.8

    Filing date: 2016-01-26

    IPC classification: G06F9/50 G06F9/54 H04L29/08

    Abstract: A technique for determining a load of an application in a cloud computing environment (100) is disclosed. The application is executed with one or more application instances (102) in the cloud computing environment (100), wherein each of the one or more application instances (102) obtains input data from a respective input queue (104). A method implementation for supporting the technique comprises determining a wait indicator for at least one of the one or more application instances (102), the wait indicator for an application instance (102) indicating a relation between empty states of the input queue (104) of the application instance (102) and non-empty states of the input queue (104) of the application instance (102), and triggering forwarding of the wait indicator determined for the at least one of the one or more application instances (102) to a load determination component (106).
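The wait indicator described in this abstract can be illustrated with a minimal sketch. The abstract only says the indicator expresses "a relation" between empty and non-empty queue states, so the definition used here (the fraction of polls that found the queue empty) is an assumption. The names `QueueObserver`, `record`, `wait_indicator`, and `forward` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class QueueObserver:
    """Counts how often an application instance finds its input queue empty or non-empty."""

    empty: int = 0
    non_empty: int = 0

    def record(self, queue_length: int) -> None:
        # Called each time the instance polls its input queue.
        if queue_length == 0:
            self.empty += 1
        else:
            self.non_empty += 1

    def wait_indicator(self) -> float:
        # Assumed definition: share of observations in which the queue was empty.
        total = self.empty + self.non_empty
        return self.empty / total if total else 0.0


def forward(indicator: float, sink: list) -> None:
    # Stand-in for forwarding to the load determination component;
    # a plain list replaces whatever transport a real system would use.
    sink.append(indicator)


observer = QueueObserver()
for length in [0, 3, 0, 1, 0]:  # hypothetical queue lengths seen on five polls
    observer.record(length)

received = []
forward(observer.wait_indicator(), received)
```

Under this assumed definition, a value near 1 suggests the instance is mostly idle and a value near 0 suggests it is saturated. A load determination component could use such values in scaling decisions, although the abstract does not specify how.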