-
Publication No.: US20240412096A1
Publication Date: 2024-12-12
Application No.: US18208173
Filing Date: 2023-06-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anand PADMANABHA IYER, Ganesh ANANTHANARAYANAN, Yiwen ZHANG
IPC: G06N20/00
Abstract: Optimizing ML pipeline deployment using an ML pipeline management system. A method includes receiving an indication of an input data source and of the input data type from that source. An indication of a plurality of filters to be included in the pipeline, an ML model, and predetermined performance criteria is also received. The method includes determining a physical topology of the ML pipeline and a configuration of the filters or the ML model. The determined physical topology includes the placement of the filters and the model, along with their configuration, and satisfies the performance criteria. The filters and ML model are placed across an infrastructure comprising a plurality of tiers according to the determined physical topology.
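The topology-determination step described above can be sketched as a search over tier assignments. The following is a minimal illustration, not the patented method: it assumes a linear pipeline (two filters feeding an ML model), three tiers, and invented per-stage compute and transfer costs, then exhaustively picks the lowest-latency placement that satisfies a latency criterion.

```python
from itertools import product

# Illustrative tiers and costs; names and numbers are assumptions, not from the patent.
TIERS = ["camera", "edge", "cloud"]
STAGE_COST_MS = {  # stage -> {tier: estimated compute latency in ms}
    "filter_a": {"camera": 5, "edge": 2, "cloud": 1},
    "filter_b": {"camera": 8, "edge": 3, "cloud": 1},
    "ml_model": {"camera": 50, "edge": 12, "cloud": 4},
}
TRANSFER_MS = {  # one-way transfer cost between tiers
    ("camera", "edge"): 10, ("edge", "cloud"): 25, ("camera", "cloud"): 35,
}

def transfer(a, b):
    """Transfer cost between two tiers (zero when the stages share a tier)."""
    if a == b:
        return 0
    pair = tuple(sorted((a, b), key=TIERS.index))
    return TRANSFER_MS[pair]

def pipeline_latency(stages, placement):
    """End-to-end latency: data originates at the camera tier, then flows
    through each stage, paying a transfer cost on every tier change."""
    total, prev = 0, "camera"
    for stage, tier in zip(stages, placement):
        total += transfer(prev, tier) + STAGE_COST_MS[stage][tier]
        prev = tier
    return total

def choose_topology(stages, max_latency_ms):
    """Exhaustively search tier assignments and return the lowest-latency
    placement that satisfies the performance criterion, or None."""
    best = None
    for placement in product(TIERS, repeat=len(stages)):
        lat = pipeline_latency(stages, placement)
        if lat <= max_latency_ms and (best is None or lat < best[1]):
            best = (placement, lat)
    return best
```

With these made-up costs, the search places all three stages on the edge tier; a real system would also weigh filter/model configuration and accuracy, as the abstract notes.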
-
Publication No.: US20230342278A1
Publication Date: 2023-10-26
Application No.: US17725825
Filing Date: 2022-04-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anand PADMANABHA IYER, Swapnil Sunilkumar GANDHI
CPC classification number: G06F11/3442, G06F9/505, G06N5/043, G06N20/00
Abstract: The present disclosure relates to methods and systems for providing inferences using machine learning systems. The methods and systems receive a load forecast for requests to be processed by a machine learning model and split the machine learning model into a plurality of machine learning model portions based on the load forecast. The methods and systems determine a batch size for the requests to the machine learning model portions, then use one or more available resources to execute the plurality of machine learning model portions to process the requests and generate inferences for them.
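The splitting and batching steps can be illustrated with a toy sketch. This is a stand-in for the patented logic, under invented assumptions: a model is a list of per-layer latencies, splitting greedily balances consecutive layers into equal-cost portions, and the batch size is the largest candidate whose batch-fill delay plus compute time fits a latency target.

```python
def split_model(layer_costs, num_portions):
    """Greedily partition consecutive layers into portions of roughly equal
    total cost (a simple stand-in for load-forecast-driven splitting)."""
    target = sum(layer_costs) / num_portions
    portions, current, acc = [], [], 0.0
    for i, cost in enumerate(layer_costs):
        current.append(i)
        acc += cost
        if acc >= target and len(portions) < num_portions - 1:
            portions.append(current)
            current, acc = [], 0.0
    portions.append(current)
    return portions

def pick_batch_size(load_rps, slo_ms, per_request_ms, candidates=(1, 2, 4, 8, 16)):
    """Choose the largest batch size whose batch-fill delay plus compute time
    fits the latency target, assuming batch latency grows sublinearly
    (an invented 0.7 factor per extra request)."""
    best = 1
    for b in candidates:
        queue_delay = (b - 1) / load_rps * 1000  # ms waiting to fill the batch
        compute = per_request_ms * (1 + 0.7 * (b - 1))
        if queue_delay + compute <= slo_ms:
            best = b
    return best

# Six layers with made-up costs, split into three portions:
portions = split_model([4, 6, 3, 7, 5, 5], num_portions=3)
batch = pick_batch_size(load_rps=100, slo_ms=50, per_request_ms=10)
```

At higher forecast load the batch-fill delay shrinks, so larger batches become feasible; this captures the abstract's link between the load forecast and the chosen batch size.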
-
Publication No.: US20240370781A1
Publication Date: 2024-11-07
Application No.: US18332589
Filing Date: 2023-06-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anand PADMANABHA IYER, Jayashree MOHAN, Ranjita BHAGWAN, Nagarajan NATARAJAN, Venkata N. PADMANABHAN, Rohit MALLIKARJUNA PUSHPA, Divyam ANSHUMAAN
IPC: G06N20/20
Abstract: Computer-assisted configuration of compute resources to perform tasks of a given inference task type. For each of multiple model combinations, the computing system estimates 1) the compute level needed to perform tasks of the given inference task type using the model combination, and 2) the accuracy of the model combination in performing tasks of that type. The computing system then selects a model combination for the given inference task type based on its estimated compute level and estimated accuracy. In response to the selection, an inference component is configured to respond to task requests of the given inference task type by using the selected model combination. Scheduling by batch size and input size may further improve the accuracy and efficiency of the model combination.
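The selection step reduces to a small optimization once the per-combination estimates exist. The sketch below is illustrative only: the candidate combinations, their compute levels, and their accuracies are invented, and the policy (most accurate combination within a compute budget) is one plausible reading of "based on the estimated compute level and the estimated accuracy."

```python
CANDIDATES = [
    # (name, estimated compute units, estimated accuracy) -- all invented
    ("small-only", 10, 0.81),
    ("small+verifier", 18, 0.88),
    ("medium-only", 30, 0.90),
    ("medium+large-fallback", 45, 0.94),
    ("large-only", 80, 0.96),
]

def select_combination(budget):
    """Return the most accurate model combination whose estimated compute
    level fits within the budget, or None if none fits."""
    feasible = [c for c in CANDIDATES if c[1] <= budget]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c[2])

chosen = select_combination(budget=50)  # -> ("medium+large-fallback", 45, 0.94)
```

A scheduler could then tune batch size and input size per the abstract's final sentence, but that layer is omitted here.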
-
Publication No.: US20220383188A1
Publication Date: 2022-12-01
Application No.: US17471816
Filing Date: 2021-09-10
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ganesh ANANTHANARAYANAN, Anand PADMANABHA IYER, Yuanchao SHU, Nikolaos KARIANAKIS, Arthi Hema PADMANABHAN
Abstract: Systems and methods are provided for merging models for use in an edge server in a multi-access edge computing environment. In particular, a model merger selects a layer of a model based on a level of memory consumption in the edge server and determines sharable layers based on common properties of the selected layer. The model merger generates a merged model by generating a single instantiation of a layer that corresponds to the sharable layers. A model trainer trains the merged model on the training data of the respective models to attain a level of data-analytics accuracy above a predetermined threshold. The disclosed technology further refreshes the merged model upon observing a level of data drift that exceeds a predetermined threshold; the refresh includes detaching and/or splitting consolidated sharable layers of sub-models in the merged model. By merging models, the disclosed technology reduces the memory footprint of models used in the edge server, rectifying its memory scarcity issues.
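The single-instantiation idea can be shown with a toy deduplication pass. This is a sketch under assumptions, not the patented merger: here a "layer" is just a (type, shape) tuple, and layers from different models that match exactly are stored once and referenced by id, shrinking the combined footprint.

```python
def merge_models(models):
    """models: dict of model name -> list of (layer_type, shape) tuples.
    Returns (shared_layers, per_model_layer_ids), where identical layers
    are instantiated once and each model references them by id."""
    shared, ids = {}, {}
    for name, layers in models.items():
        ids[name] = []
        for layer in layers:
            if layer not in shared:          # first occurrence: instantiate
                shared[layer] = len(shared)
            ids[name].append(shared[layer])  # later occurrences: reuse
    return shared, ids

# Two invented models sharing their early convolutional layers:
models = {
    "detector":   [("conv", (3, 64)), ("conv", (64, 128)), ("fc", (128, 10))],
    "classifier": [("conv", (3, 64)), ("conv", (64, 128)), ("fc", (128, 5))],
}
shared, ids = merge_models(models)  # 4 unique layers instead of 6
```

The abstract's refresh step would be the inverse operation under data drift: detaching a shared layer back into per-model copies so the affected sub-model can be retrained independently.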