TRAINING LARGE DL MODELS VIA SERVERLESS ARCHITECTURE USING CLOUD STORAGE SERVICES-BASED COMMUNICATION CHANNEL

    Publication Number: US20230409967A1

    Publication Date: 2023-12-21

    Application Number: US18140219

    Application Date: 2023-04-27

    CPC classification number: G06N20/00

    Abstract: State-of-the-art methods require that the size of the DL model, or of its gradients, be less than the maximum data item size of the storage used as a communication channel for model training on a serverless platform. Embodiments of the present disclosure provide a method and system for training large DL models via a serverless architecture using such a communication channel even when the gradients are larger than the maximum size of one data item allowed by the channel. Gradients generated by each worker during the current training instance are chunked into segments and stored in the communication channel. Corresponding segments from each worker are aggregated by aggregators and stored back. Each of the aggregated segments is read by each worker to generate an aggregated model to be used during the successive training instance. Optimization techniques are used for reading from and writing to the channel, resulting in significant improvement in the performance and cost of training.
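    The sketch below illustrates the chunk-aggregate-reassemble pattern described in the abstract. The in-memory `storage` dict, the `MAX_ITEM_BYTES` limit, and all function names are illustrative assumptions rather than the patented implementation; a real deployment would use a cloud object store accessed from serverless worker and aggregator functions.

    ```python
    # Minimal sketch (assumptions only): workers chunk gradients into segments
    # that fit the channel's per-item size limit, aggregators average the
    # corresponding segments, and workers reassemble the aggregated result.
    import numpy as np

    MAX_ITEM_BYTES = 4 * 1024          # assumed per-item size limit of the channel
    storage = {}                       # stands in for the cloud storage channel

    def chunk(grad: np.ndarray, max_bytes: int):
        """Split a flat gradient vector into segments that fit one data item each."""
        elems_per_seg = max(1, max_bytes // grad.itemsize)
        return [grad[i:i + elems_per_seg] for i in range(0, grad.size, elems_per_seg)]

    def worker_write(worker_id: int, grad: np.ndarray):
        # Each worker chunks its gradients and writes every segment to the channel.
        for seg_id, seg in enumerate(chunk(grad, MAX_ITEM_BYTES)):
            storage[f"w{worker_id}/seg{seg_id}"] = seg

    def aggregate(seg_id: int, num_workers: int):
        # An aggregator averages the corresponding segment from every worker
        # and stores the result back in the channel.
        segs = [storage[f"w{w}/seg{seg_id}"] for w in range(num_workers)]
        storage[f"agg/seg{seg_id}"] = np.mean(segs, axis=0)

    def worker_read(num_segments: int) -> np.ndarray:
        # Each worker reads all aggregated segments and reassembles the full
        # aggregated gradient for the next training instance.
        return np.concatenate([storage[f"agg/seg{s}"] for s in range(num_segments)])

    if __name__ == "__main__":
        num_workers, grad_size = 4, 10_000
        grads = [np.random.randn(grad_size).astype(np.float32) for _ in range(num_workers)]
        for w, g in enumerate(grads):
            worker_write(w, g)
        num_segments = len(chunk(grads[0], MAX_ITEM_BYTES))
        for s in range(num_segments):
            aggregate(s, num_workers)
        assert np.allclose(worker_read(num_segments), np.mean(grads, axis=0), atol=1e-6)
    ```

    The key point of the pattern is that no single object written to the channel exceeds the storage service's per-item limit, so the channel can carry gradients of arbitrary total size.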
