Publication No.: US20250148752A1
Publication Date: 2025-05-08
Application No.: US18502719
Filing Date: 2023-11-06
Applicant: QUALCOMM Incorporated
Inventor: Vibashan VISHNUKUMAR SHARMINI, Shubhankar Mangesh BORSE, Hyojin PARK, Debasmit DAS, Munawar HAYAT, Fatih Murat PORIKLI
IPC: G06V10/75, G06V30/148
Abstract: Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. In an example method, an input image is accessed, and the input image is processed using an image encoder to generate an image embedding tensor. The image embedding tensor is processed using a mask decoder machine learning model to generate a set of mask embedding tensors. A textual input is processed using a text encoder to generate a text embedding tensor. A set of augmented masks is generated based on aggregating the text embedding tensor with the set of mask embedding tensors.
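The data flow described in the abstract can be illustrated with a small NumPy sketch. This is not the claimed implementation: the abstract does not specify the encoders, the decoder architecture, or the aggregation operator. Everything here is an assumption made for illustration, including the hypothetical function names, the embedding width D, the random linear "encoders", the attention-style mask decoder, and the choice to aggregate by adding the text embedding to each mask embedding before scoring pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # hypothetical embedding width (not from the source)

def image_encoder(image, w_img):
    # Stand-in encoder: project each RGB pixel to a D-dim embedding.
    # (H, W, 3) @ (3, D) -> image embedding tensor of shape (H, W, D)
    return image @ w_img

def mask_decoder(image_emb, queries):
    # Stand-in decoder: each of K queries attends over all pixels and
    # pools them into one mask embedding. Returns shape (K, D).
    h, w, d = image_emb.shape
    pixels = image_emb.reshape(h * w, d)
    scores = queries @ pixels.T / np.sqrt(d)          # (K, H*W)
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)     # softmax over pixels
    return attn @ pixels

def text_encoder(text, table):
    # Stand-in encoder: average a per-byte embedding table. Returns (D,).
    ids = list(text.encode("utf-8"))
    return table[ids].mean(axis=0)

def augment_masks(image_emb, mask_embs, text_emb):
    # Assumed aggregation: add the text embedding to every mask embedding,
    # then score each pixel against each fused embedding.
    fused = mask_embs + text_emb[None, :]                    # (K, D)
    logits = np.einsum("kd,hwd->khw", fused, image_emb)      # (K, H, W)
    return 1.0 / (1.0 + np.exp(-logits))                     # per-pixel probabilities

# Toy inputs: a 4x5 RGB image, 3 mask queries, and a short prompt.
image = rng.random((4, 5, 3))
w_img = rng.normal(size=(3, D))
queries = rng.normal(size=(3, D))
text_table = rng.normal(size=(256, D))

image_emb = image_encoder(image, w_img)
mask_embs = mask_decoder(image_emb, queries)
text_emb = text_encoder("a red car", text_table)
masks = augment_masks(image_emb, mask_embs, text_emb)
print(masks.shape)
```

Addition is only one possible way to aggregate the two embeddings; concatenation followed by a projection, or cross-attention between text and mask embeddings, would fit the abstract's wording equally well.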