HARDWARE ADAPTIVE MULTI-MODEL SCHEDULING
    Published patent application

    Publication number: US20240185587A1

    Publication date: 2024-06-06

    Application number: US18556636

    Filing date: 2021-08-16

    CPC classification number: G06V10/776 G06V10/82 G06V10/955

    Abstract: Modern deep neural network (DNN) models have many layers, and a single layer may involve large matrix multiplications. Such heavy computation makes it challenging to deploy these DNN models on a single edge device, which has relatively limited computation resources. Therefore, multiple and even heterogeneous edge devices may be required for applications with stringent latency requirements. Disclosed in the present patent document are embodiments of a model scheduling framework that schedules multiple models on a heterogeneous platform. Two different approaches, model first scheduling (MFS) and hardware first scheduling (HFS), are presented to allocate a group of models for a service to corresponding heterogeneous edge devices, including CPU, VPU and GPU. Experimental results prove the effectiveness of the MFS and HFS methods for improving the inference speed of single and multiple AI-based services.
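
The abstract does not spell out how MFS or HFS make their decisions, so the following is only a minimal sketch of the general problem it describes: placing a group of models onto heterogeneous devices so that the whole service finishes sooner. It uses a plain greedy earliest-finish heuristic, which is a stand-in and not the patented method. The model names, device names, and latency figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    busy_until: float = 0.0          # time (ms) at which the device becomes free
    assigned: list = field(default_factory=list)


def greedy_schedule(models, devices, latency_ms):
    """Assign each model to the device on which it would finish earliest.

    latency_ms[(model, device_name)] is an assumed, pre-profiled latency.
    Returns the per-device assignment and the overall makespan in ms.
    """
    # Place the most expensive models first (ranked by their best-case latency).
    order = sorted(
        models,
        key=lambda m: min(latency_ms[(m, d.name)] for d in devices),
        reverse=True,
    )
    for m in order:
        best = min(devices, key=lambda d: d.busy_until + latency_ms[(m, d.name)])
        best.busy_until += latency_ms[(m, best.name)]
        best.assigned.append(m)
    makespan = max(d.busy_until for d in devices)
    return {d.name: d.assigned for d in devices}, makespan


# Hypothetical service made of three models, with made-up latencies in ms.
latency_ms = {
    ("detector", "CPU"): 90, ("detector", "VPU"): 40, ("detector", "GPU"): 20,
    ("classifier", "CPU"): 30, ("classifier", "VPU"): 15, ("classifier", "GPU"): 10,
    ("tracker", "CPU"): 12, ("tracker", "VPU"): 25, ("tracker", "GPU"): 8,
}
devices = [Device("CPU"), Device("VPU"), Device("GPU")]
plan, makespan = greedy_schedule(["detector", "classifier", "tracker"], devices, latency_ms)
print(plan, makespan)
```

With these made-up numbers the heaviest model goes to the GPU and the other two are spread over the VPU and CPU, so the service is bounded by the single slowest placement rather than by the sum of all three latencies.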

    Self-optimizing video analytics pipelines

    Publication number: US12001513B2

    Publication date: 2024-06-04

    Application number: US17522226

    Filing date: 2021-11-09

    CPC classification number: G06F18/217 G06F9/5027 G06N3/08 G06V10/94 G06V20/46

    Abstract: A method for implementing a self-optimized video analytics pipeline is presented. The method includes decoding video files into a sequence of frames, extracting features of objects from one or more frames of the sequence of frames of the video files, employing an adaptive resource allocation component based on reinforcement learning (RL) to dynamically balance resource usage of different microservices included in the video analytics pipeline, employing an adaptive microservice parameter tuning component to balance accuracy and performance of a microservice of the different microservices, applying a graph-based filter to minimize redundant computations across the one or more frames of the sequence of frames, and applying a deep-learning-based filter to remove unnecessary computations resulting from mismatches between the different microservices in the video analytics pipeline.
