-
Publication No.: US12198453B2
Publication date: 2025-01-14
Application No.: US17587161
Filing date: 2022-01-28
Applicant: Salesforce, Inc.
Inventor: Mingfei Gao , Chen Xing
IPC: G06V20/62 , G06F40/126 , G06T1/60 , G06T9/00 , G06V10/22 , G06V10/77 , G06V10/774
Abstract: Embodiments described herein provide methods and systems for open vocabulary object detection in images. Given a pre-trained vision-language model and an image-caption pair, an activation map may be computed in the image that corresponds to an object of interest mentioned in the caption. The activation map is then converted into a pseudo bounding-box label for the corresponding object category. The open vocabulary detector is then directly supervised by these pseudo box-labels, which enables training object detectors with no human-provided bounding-box annotations.
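The conversion from activation map to pseudo bounding box can be illustrated with a minimal sketch. The abstract does not say how the conversion is done, so the rule used here (keep cells at or above a fraction of the peak activation, then take the tight box around them) and the names `activation_map_to_box` and `rel_threshold` are assumptions for illustration only, not the patented method.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max), inclusive cell indices
Box = Tuple[int, int, int, int]


def activation_map_to_box(
    act: List[List[float]], rel_threshold: float = 0.5
) -> Optional[Box]:
    """Return the tight box around cells with activation >= rel_threshold * peak.

    Assumed rule for illustration; returns None when nothing is activated.
    """
    peak = max((v for row in act for v in row), default=0.0)
    if peak <= 0.0:
        return None
    cutoff = rel_threshold * peak
    xs, ys = [], []
    for y, row in enumerate(act):
        for x, v in enumerate(row):
            if v >= cutoff:
                xs.append(x)
                ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x4 activation map for one object mentioned in a caption.
toy_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.9, 0.1],
    [0.0, 0.6, 1.0, 0.2],
    [0.0, 0.0, 0.1, 0.0],
]
pseudo_box = activation_map_to_box(toy_map)  # (1, 1, 2, 2)
```

The resulting box, paired with the object category named in the caption, would then serve as the pseudo label that supervises the detector.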
-
Publication No.: US20240070394A1
Publication date: 2024-02-29
Application No.: US18160967
Filing date: 2023-01-27
Applicant: Salesforce, Inc.
Inventor: Xiangyu Peng , Chen Xing , Prafulla Kumar Choubey , Chieng-Sheng Wu
IPC: G06F40/284 , G06F40/40
CPC classification number: G06F40/284 , G06F40/40
Abstract: Embodiments described herein provide a mechanism that ensembles trainable soft prompts to transfer knowledge from source tasks under few-shot learning settings. Specifically, given a source task input from a source task training dataset, a set of soft prompts may be trained using a frozen pre-trained language model (PLM) on the large-scale source task training dataset. The soft prompts are then each prepended to a target task input, based on which the frozen PLM generates a respective set of logits for predicting the classification of the target task input. An attention module is used to generate input-logit attention scores, which are used to compute a weighted linear combination of the logits. The weighted linear combination is used as the final logits to predict the final classification of the target task input.
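The final combination step can be sketched as follows. This is a minimal illustration that assumes the attention scores are already available as one raw score per soft prompt and that they are normalized with a softmax; the abstract states neither detail, and the function names `softmax` and `ensemble_logits` are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(
    logits_per_prompt: List[List[float]], attention_scores: List[float]
) -> List[float]:
    """Weighted linear combination of per-prompt logits.

    logits_per_prompt[k] holds the class logits produced with soft prompt k;
    attention_scores[k] is the (assumed) raw attention score for that prompt.
    """
    weights = softmax(attention_scores)
    num_classes = len(logits_per_prompt[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, logits_per_prompt))
        for c in range(num_classes)
    ]


# Two soft prompts, two classes, equal attention: the result is the plain average.
final_logits = ensemble_logits([[2.0, 0.0], [0.0, 4.0]], [0.0, 0.0])
predicted_class = max(range(len(final_logits)), key=final_logits.__getitem__)
```

With equal scores the ensemble reduces to averaging; a strongly dominant score makes the output approach the logits of that single prompt.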
-
Publication No.: US20240185035A1
Publication date: 2024-06-06
Application No.: US18162535
Filing date: 2023-01-31
Applicant: Salesforce, Inc.
Inventor: Ning Yu , Can Qin , Chen Xing , Shu Zhang , Stefano Ermon , Caiming Xiong , Ran Xu
IPC: G06N3/0455 , G06T5/00
CPC classification number: G06N3/0455 , G06T5/002 , G06T2207/20084
Abstract: Embodiments described herein provide a mechanism for replacing existing text encoders in text-to-image generation models with more powerful pre-trained language models. Specifically, a translation network is trained to map features from the pre-trained language model output into the space of the target text encoder. The training preserves the rich structure of the pre-trained language model while allowing it to operate within the text-to-image generation model. The resulting modularized text-to-image model receives a prompt and generates an image representing the features contained in the prompt.
-
Publication No.: US20240070868A1
Publication date: 2024-02-29
Application No.: US18159318
Filing date: 2023-01-25
Applicant: Salesforce, Inc.
Inventor: Ning Yu , Vibashan Vishnukumar Sharmini , Chen Xing , Juan Carlos Niebles Duque , Ran Xu
CPC classification number: G06T7/11 , G06V10/273
Abstract: Embodiments described herein provide an open-vocabulary instance segmentation framework that adopts a pre-trained vision-language model to develop a pipeline in detecting novel categories of instances.
-
Publication No.: US20230092702A1
Publication date: 2023-03-23
Application No.: US17933396
Filing date: 2022-09-19
Applicant: Salesforce, Inc.
Inventor: Yixin Mao , Tian Xie , Chaney Lin , Chen Xing , Zachary Alexander , Wenhao Liu
IPC: G06F40/35 , G06F16/35 , G06F16/383
Abstract: Database systems and methods are provided for assigning structural metadata to records and creating automations using the structural metadata. One method of assigning structural metadata to a group of records involves: determining, based on one or more fields of metadata associated with the records, a plurality of candidate names, wherein each candidate name corresponds to semantic content of the one or more fields of a respective record of the group of records; for each candidate name, assigning a name relevance score based on respective word relevance scores assigned to the words of that candidate name based on usage; selecting a candidate name in a manner that is influenced by the name relevance scores assigned to the candidate names; and automatically assigning a name to the group of records using the selected candidate name.
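The scoring and selection steps can be illustrated with a small sketch. The abstract says only that word relevance is "based on usage", so the concrete choices here are assumptions: a word's relevance is the fraction of candidate names that contain it, a name's relevance is the mean of its word scores, and the highest-scoring name wins. The function names `word_relevance` and `pick_group_name` are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def word_relevance(candidate_names: List[str]) -> Dict[str, float]:
    """Assumed usage-based score: fraction of candidate names containing the word."""
    counts: Counter = Counter()
    for name in candidate_names:
        counts.update(set(name.lower().split()))
    return {word: n / len(candidate_names) for word, n in counts.items()}


def pick_group_name(candidate_names: List[str]) -> str:
    """Select the candidate with the highest mean word relevance.

    Expects a non-empty list; ties go to the earliest candidate.
    """
    scores = word_relevance(candidate_names)

    def name_score(name: str) -> float:
        words = name.lower().split()
        if not words:
            return 0.0
        return sum(scores[w] for w in words) / len(words)

    return max(candidate_names, key=name_score)


# One candidate name per record in the group (toy data).
candidates = [
    "reset password",
    "password reset help",
    "billing question",
    "reset password",
]
group_name = pick_group_name(candidates)  # "reset password"
```

Averaging rather than summing the word scores keeps longer candidate names from winning simply because they contain more words.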
-
Publication No.: US20240330409A1
Publication date: 2024-10-03
Application No.: US18738628
Filing date: 2024-06-10
Applicant: Salesforce, Inc.
Inventor: Chen Xing , Wenhao Liu , Chu Hong Hoi , Nitish Shirish Keskar , Caiming Xiong
IPC: G06F18/214 , G06F18/21 , G06F40/00
CPC classification number: G06F18/2148 , G06F18/2163 , G06F40/00
Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines self-attention hidden states of the held-out model and the backward pass determines a loss of the held-out model. A forward pass on the main model is performed to determine self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine a loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.