-
Publication No.: US20250094212A1
Publication Date: 2025-03-20
Application No.: US18961400
Application Date: 2024-11-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Muthian SIVATHANU , Srinidhi VISWANATHA , Dharma Kiritkumar SHUKLA , Nipun KWATRA , Ramachandran RAMJEE , Rimma Vladimirovna NEHME , Pankaj SHARMA , Bhalakumaaran Erode RANGANATHAN , Vaibhav SHARMA
Abstract: The disclosure herein describes platform-level checkpointing for deep learning (DL) jobs. The checkpointing is performed by capturing two kinds of state data: (i) GPU state (device state) and (ii) CPU state (host state). The GPU state includes GPU data (e.g., model parameters, optimizer state) located in GPU memory and GPU context (e.g., the default stream and the various handles created by libraries such as DNN and BLAS). Only a fraction of the GPU memory is copied because the checkpointing is done in a domain-aware manner; the “active” memory contains useful data such as model parameters. To capture this useful data, memory management is controlled to identify which parts of the memory are active. Also, to restore the destination GPU to the same context/state, a mechanism captures such state-changing events on an original GPU and replays them on a destination GPU.
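A minimal sketch of the domain-aware memory idea, assuming a PyTorch-style tensor API: a tracking allocator records which device buffers are "active" (e.g., model parameters, optimizer state) so a checkpoint copies only those regions to host memory. The TrackingAllocator class, buffer names, and snapshot format are illustrative assumptions, not the patented implementation.

```python
# Sketch: track active GPU allocations so a checkpoint copies only the
# active fraction of device memory to host memory, and restore can move
# it back onto a (possibly different) GPU.
import torch


class TrackingAllocator:
    """Wraps tensor allocation so active device buffers are known at checkpoint time."""

    def __init__(self, device="cuda"):
        self.device = device
        self.active = {}  # name -> device tensor currently in use

    def allocate(self, name, shape, dtype=torch.float32):
        tensor = torch.empty(shape, dtype=dtype, device=self.device)
        self.active[name] = tensor
        return tensor

    def free(self, name):
        # Freed buffers are no longer "active" and are skipped at checkpoint time.
        self.active.pop(name, None)

    def checkpoint(self):
        # Copy only the active buffers (device state) into host (CPU) memory.
        return {name: t.detach().cpu().clone() for name, t in self.active.items()}

    def restore(self, snapshot):
        # Move the saved host copies back onto the destination device.
        for name, host_tensor in snapshot.items():
            self.active[name] = host_tensor.to(self.device)


if __name__ == "__main__":
    alloc = TrackingAllocator(device="cuda" if torch.cuda.is_available() else "cpu")
    params = alloc.allocate("model.params", (1024, 1024))
    params.normal_()                  # simulate training updates
    snapshot = alloc.checkpoint()     # device -> host copy of active buffers only
    alloc.restore(snapshot)           # host -> destination device
```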
-
Publication No.: US20220318052A1
Publication Date: 2022-10-06
Application No.: US17361224
Application Date: 2021-06-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Muthian SIVATHANU , Atul KATIYAR , Dharma Kiritkumar SHUKLA , Rimma Vladimirovna NEHME , Shreshth SINGHAL , Pankaj SHARMA , Nipun KWATRA , Ramachandran RAMJEE
Abstract: The disclosure herein describes scheduling execution of artificial intelligence (AI) workloads in a cloud infrastructure platform. A global scheduler receives AI workloads associated with resource ticket values. The global scheduler distributes the AI workloads to nodes so as to balance resource ticket values, and local schedulers of the nodes schedule the AI workloads on resources based on those resource ticket values. Based on the scheduling, coordinator services of the local schedulers execute the distributed AI workloads on the infrastructure resources of the nodes. The disclosure further describes scheduling AI workloads based on priority tiers. A scheduler receives AI workloads, each associated with a priority tier indicative of its preemption priority while executing. The AI workloads are scheduled for execution on a distributed set of nodes based on the priority tiers and are then executed according to that schedule.
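A minimal sketch of the ticket-and-tier pattern described above: the global scheduler places each incoming workload on the node with the lowest accumulated ticket value, and a node preempts a running workload only for a higher-priority (lower tier number) arrival. The data structures, capacity model, and tie-breaking rules are assumptions for illustration, not the patented scheduler.

```python
# Sketch: ticket-balanced placement across nodes plus tier-based preemption.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Workload:
    name: str
    tickets: int   # resource ticket value (share of resources requested)
    tier: int      # preemption priority: lower number = higher priority


@dataclass
class Node:
    name: str
    running: List[Workload] = field(default_factory=list)

    @property
    def ticket_load(self) -> int:
        return sum(w.tickets for w in self.running)

    def admit(self, workload: Workload, capacity: int) -> bool:
        if self.ticket_load + workload.tickets <= capacity:
            self.running.append(workload)
            return True
        # Preempt the lowest-priority running workload if the arrival outranks it.
        victim = max(self.running, key=lambda w: w.tier, default=None)
        if victim is not None and workload.tier < victim.tier:
            self.running.remove(victim)
            self.running.append(workload)
            return True
        return False


def global_schedule(workloads: List[Workload], nodes: List[Node], capacity: int = 100):
    # Balance ticket values: try the node with the smallest ticket load first.
    for w in sorted(workloads, key=lambda w: w.tier):
        for node in sorted(nodes, key=lambda n: n.ticket_load):
            if node.admit(w, capacity):
                break


if __name__ == "__main__":
    nodes = [Node("node-0"), Node("node-1")]
    jobs = [Workload("train-a", 60, tier=1), Workload("train-b", 60, tier=2),
            Workload("tune-c", 30, tier=3)]
    global_schedule(jobs, nodes)
    for n in nodes:
        print(n.name, [w.name for w in n.running])
```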
-
Publication No.: US20220308917A1
Publication Date: 2022-09-29
Application No.: US17359553
Application Date: 2021-06-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Muthian SIVATHANU , Srinidhi VISWANATHA , Dharma Kiritkumar SHUKLA , Nipun KWATRA , Ramachandran RAMJEE , Rimma Vladimirovna NEHME , Pankaj SHARMA , Bhalakumaaran Erode RANGANATHAN , Vaibhav SHARMA
Abstract: The disclosure herein describes platform-level checkpointing for deep learning (DL) jobs. The checkpointing is performed by capturing two kinds of state data: (i) GPU state (device state) and (ii) CPU state (host state). The GPU state includes GPU data (e.g., model parameters, optimizer state) located in GPU memory and GPU context (e.g., the default stream and the various handles created by libraries such as DNN and BLAS). Only a fraction of the GPU memory is copied because the checkpointing is done in a domain-aware manner; the “active” memory contains useful data such as model parameters. To capture this useful data, memory management is controlled to identify which parts of the memory are active. Also, to restore the destination GPU to the same context/state, a mechanism captures such state-changing events on an original GPU and replays them on a destination GPU.
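A minimal sketch of the other half of the mechanism, the capture-and-replay of GPU context: calls that change device state (stream creation, library handle creation, and so on) are logged on the source GPU so that the same sequence can be replayed to rebuild an equivalent context on the destination GPU. The EventLog class, the remap hook, and the stand-in stream call are illustrative assumptions, not the patented mechanism.

```python
# Sketch: log state-changing calls on the original GPU and replay them,
# with device-specific arguments remapped, on the destination GPU.
from typing import Any, Callable, Dict, List, Tuple


class EventLog:
    """Records state-changing calls so they can be replayed on another device."""

    def __init__(self):
        self.events: List[Tuple[Callable, tuple, dict]] = []

    def record(self, fn: Callable, *args: Any, **kwargs: Any) -> Any:
        # Execute the call on the current (source) device and remember it.
        self.events.append((fn, args, kwargs))
        return fn(*args, **kwargs)

    def replay(self, remap: Callable[[Any], Any] = lambda a: a) -> List[Any]:
        # Re-issue the same calls in order; `remap` rewrites device-specific
        # arguments (e.g., "gpu-0" -> "gpu-1") for the destination GPU.
        return [fn(*[remap(a) for a in args], **kwargs)
                for fn, args, kwargs in self.events]


def create_stream(device: str) -> Dict[str, str]:
    # Stand-in for a real stream/handle creation API on the given device.
    return {"kind": "stream", "device": device}


if __name__ == "__main__":
    log = EventLog()
    log.record(create_stream, "gpu-0")                             # captured on the original GPU
    restored = log.replay(remap=lambda a: "gpu-1" if a == "gpu-0" else a)
    print(restored)                                                # context rebuilt on destination
```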
-
Publication No.: US20230236837A1
Publication Date: 2023-07-27
Application No.: US17855722
Application Date: 2022-06-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Muthian SIVATHANU , Srinidhi VISWANATHA , Bhargav GULAVANI , Dharma Kiritkumar SHUKLA , Rimma Vladimirovna NEHME , Amey AGRAWAL , Ramachandran RAMJEE , Kaustubh WELANKAR , Ravi Shreyas ANUPINDI
CPC classification number: G06F9/3893 , G06F9/448 , G06F11/1407 , G06F11/1004 , G06F9/5016
Abstract: The disclosure herein describes elastically managing the execution of workers of multi-worker workloads on accelerator devices. A first worker of a workload is executed on an accelerator device during a first time interval. A first context switch point is identified when the first worker is in a first worker state. At the identified context switch point, a first memory state of the first worker is stored in host memory and the accelerator device is configured with a second memory state of a second worker. The second worker is executed during a second time interval, and a second context switch point is identified at the end of the second time interval, when the second worker is in a state equivalent to the first worker state. During the intervals, collective communication operations between the workers are accumulated and, at the second context switch point, the accumulated operations are performed.
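A minimal sketch of time-slicing two workers of one job on a single accelerator: at each context switch point the active worker's device state is copied to host memory, the next worker's state is loaded, and collective operations issued in between are accumulated and only performed once both workers reach equivalent switch points. The Worker class, the dictionary "device state", and the averaged stand-in for an all-reduce are illustrative assumptions.

```python
# Sketch: elastic context switching of two workers with deferred collectives.
from typing import Dict, List


class Worker:
    def __init__(self, name: str):
        self.name = name
        self.host_state: Dict[str, float] = {"step": 0.0, "grad": 0.0}

    def run_interval(self, device_state: Dict[str, float], pending: List[Dict[str, float]]):
        # Run until the next context switch point, issuing one collective op.
        device_state["step"] += 1.0
        device_state["grad"] = device_state["step"] * 0.1
        pending.append({self.name: device_state["grad"]})  # deferred all-reduce input


def run_time_sliced(workers: List[Worker], intervals: int):
    device_state: Dict[str, float] = {}
    pending: List[Dict[str, float]] = []
    for _ in range(intervals):
        for w in workers:
            device_state.clear()
            device_state.update(w.host_state)      # load this worker's state onto the device
            w.run_interval(device_state, pending)
            w.host_state = dict(device_state)       # context switch: device state copied to host
        # All workers reached equivalent switch points: perform accumulated collectives.
        total = sum(v for contrib in pending for v in contrib.values())
        for w in workers:
            w.host_state["grad"] = total / len(workers)  # stand-in for an all-reduce
        pending.clear()


if __name__ == "__main__":
    workers = [Worker("worker-0"), Worker("worker-1")]
    run_time_sliced(workers, intervals=2)
    print({w.name: w.host_state for w in workers})
```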
-
Publication No.: US20190311494A1
Publication Date: 2019-10-10
Application No.: US15946580
Application Date: 2018-04-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ganesan RAMALINGAM , Ramachandran RAMJEE , Romil BHARDWAJ , Gopi Krishna TUMMALA
Abstract: This document relates to camera calibration. One example uses real-world distances and image coordinates of object features in images to determine multiple candidate camera calibrations for a camera. This example filters out at least some of the multiple candidate camera calibrations to obtain remaining calibrations, and obtains a final calibration for the camera from the remaining calibrations.
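A minimal sketch of the candidate-filtering pattern in the abstract: each candidate calibration is scored by how well it maps image-space feature distances to their known real-world distances, poorly scoring candidates are filtered out, and the final calibration is taken from the survivors. For simplicity a "calibration" here is a single pixels-per-metre scale; the actual method estimates full camera parameters, so this is an illustrative assumption of the workflow only.

```python
# Sketch: score candidate calibrations against known real-world distances,
# filter out the poor ones, and pick a final calibration from what remains.
import math
from typing import List, Tuple

# (image point a, image point b, known real-world distance in metres)
Observation = Tuple[Tuple[float, float], Tuple[float, float], float]


def pixel_distance(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def candidate_error(scale_px_per_m: float, observations: List[Observation]) -> float:
    # Mean absolute error between predicted and known real-world distances.
    errors = [abs(pixel_distance(a, b) / scale_px_per_m - dist)
              for a, b, dist in observations]
    return sum(errors) / len(errors)


def calibrate(candidates: List[float], observations: List[Observation],
              max_error_m: float = 0.25) -> float:
    # Filter out candidates whose error exceeds the threshold, then take the
    # median of the remaining calibrations as the final one.
    remaining = sorted(c for c in candidates
                       if candidate_error(c, observations) <= max_error_m)
    if not remaining:
        raise ValueError("no candidate calibration survived filtering")
    return remaining[len(remaining) // 2]


if __name__ == "__main__":
    obs: List[Observation] = [((100.0, 200.0), (300.0, 200.0), 2.0),
                              ((150.0, 400.0), (150.0, 250.0), 1.5)]
    final = calibrate(candidates=[50.0, 75.0, 100.0, 150.0], observations=obs)
    print("final pixels-per-metre:", final)
```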