-
Publication Number: US20220374595A1
Publication Date: 2022-11-24
Application Number: US17531591
Application Date: 2021-11-19
Applicant: salesforce.com, inc.
Inventor: Akhilesh Deepak Gotmare , Junnan Li , Shafiq Rayhan Joty , Chu Hong Hoi
IPC: G06F40/226 , G06F40/40 , G06F40/30 , G06F40/151
Abstract: Embodiments described herein provide a contrastive learning framework that leverages hard negative examples mined globally from the entire training corpus for a given query to improve the quality of code and natural language representations. Specifically, similar examples from the training corpus are extracted and used as hard negatives in an online manner during training, while keeping the minibatch construction random.
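A minimal NumPy sketch of the idea in this abstract: for a given query, the most similar corpus items (excluding the query's own positive) are mined as hard negatives and scored with an InfoNCE-style contrastive loss. The function names, the cosine-similarity scoring, and the temperature value are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def mine_hard_negatives(query, corpus, positive_idx, k=2):
    """Return indices of the k corpus rows most similar to the query,
    excluding the query's positive (hypothetical helper, not from the patent)."""
    sims = corpus @ query            # cosine similarity if rows are L2-normalized
    sims[positive_idx] = -np.inf     # never select the positive as a negative
    return np.argsort(sims)[-k:]

def info_nce(query, positive, negatives, temperature=0.07):
    """Standard InfoNCE loss over one positive and the mined hard negatives."""
    logits = np.concatenate([[query @ positive], negatives @ query]) / temperature
    logits -= logits.max()           # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

In a full training loop the mining would run over the entire corpus embeddings (refreshed periodically), while minibatches themselves stay randomly constructed, as the abstract describes.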
-
Publication Number: US20220156591A1
Publication Date: 2022-05-19
Application Number: US17160896
Application Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: Embodiments described herein provide an approach (referred to as the “Co-training” mechanism throughout this disclosure) that jointly learns two representations of the training data: their class probabilities and their low-dimensional embeddings. Specifically, two representations of each image sample are generated: a class probability produced by the classification head and a low-dimensional embedding produced by the projection head. The classification head is trained using memory-smoothed pseudo-labels, where pseudo-labels are smoothed by aggregating information from nearby samples in the embedding space. The projection head is trained using contrastive learning on a pseudo-label graph, where samples with similar pseudo-labels are encouraged to have similar embeddings.
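The memory-smoothing step described above can be sketched as follows: each sample's class probabilities are blended with the average predictions of its nearest neighbors in the embedding space. The function name, the neighbor count `k`, and the blending weight `alpha` are hypothetical parameters chosen for illustration.

```python
import numpy as np

def smooth_pseudo_labels(probs, embeddings, k=3, alpha=0.5):
    """Smooth each sample's class probabilities by mixing them with the mean
    probabilities of its k nearest neighbors in embedding space (sketch)."""
    sims = embeddings @ embeddings.T      # similarity between all sample pairs
    np.fill_diagonal(sims, -np.inf)       # a sample is not its own neighbor
    smoothed = np.empty_like(probs)
    for i in range(len(probs)):
        nn = np.argsort(sims[i])[-k:]     # indices of the k most similar samples
        smoothed[i] = alpha * probs[i] + (1 - alpha) * probs[nn].mean(axis=0)
    return smoothed
```

Because each smoothed row is a convex combination of probability vectors, it remains a valid probability distribution.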
-
Publication Number: US20220156530A1
Publication Date: 2022-05-19
Application Number: US17188232
Application Date: 2021-03-01
Applicant: salesforce.com, inc.
Inventor: Anthony Meng Huat Tiong , Junnan Li , Chu Hong Hoi
Abstract: An interpolative centroid contrastive learning (ICCL) framework is disclosed for learning a more discriminative representation for tail classes. Specifically, data samples, such as natural images, are projected into a low-dimensional embedding space, and class centroids for the respective classes are created as the average embeddings of the samples belonging to each class. Virtual training samples are then created by interpolating two images drawn from two samplers: a class-agnostic sampler, which returns images from both the head and tail classes with equal probability, and a class-aware sampler, which focuses more on tail-class images by sampling images from the tail class with a higher probability than images from the head class. The sampled images, e.g., an image from the class-agnostic sampler and an image from the class-aware sampler, may be interpolated to generate interpolated images.
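The two-sampler interpolation step can be sketched as below: a class-aware sampler weights classes inversely to their frequency (one common choice; the patent text does not fix the exact weighting), and virtual samples are mixup-style interpolations between a class-agnostic draw and a class-aware draw. All names are illustrative.

```python
import numpy as np

def class_aware_probs(class_counts):
    """Sampling weights inversely proportional to class frequency, so
    tail classes are drawn with higher probability (assumed weighting)."""
    inv = 1.0 / np.asarray(class_counts, dtype=float)
    return inv / inv.sum()

def interpolate_samples(x_uniform, x_tail, lam):
    """Mixup-style interpolation between a class-agnostic sample and a
    class-aware (tail-heavy) sample to create a virtual training sample."""
    return lam * x_uniform + (1 - lam) * x_tail
```

With counts `[100, 10]`, the tail class (index 1) receives ten times the sampling probability of the head class, which is the behavior the abstract attributes to the class-aware sampler.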
-
Publication Number: US20220156507A1
Publication Date: 2022-05-19
Application Number: US17591121
Application Date: 2022-02-02
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework, iteratively performing an E-step, which finds prototypes through clustering, and an M-step, which optimizes the network on a contrastive loss.
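A minimal NumPy sketch of the E-step/M-step loop described in this abstract: the E-step finds prototypes by clustering embeddings (a toy k-means here), and the M-step objective pulls each embedding toward its assigned prototype with a ProtoNCE-style contrastive loss. The clustering choice, temperature, and all names are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def e_step(embeddings, k, iters=10, seed=0):
    """E-step: find k prototypes by a toy k-means over the embeddings."""
    rng = np.random.default_rng(seed)
    protos = embeddings[rng.choice(len(embeddings), k, replace=False)]
    for _ in range(iters):
        # assign each embedding to its nearest prototype
        assign = np.argmin(((embeddings[:, None] - protos[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                protos[j] = embeddings[assign == j].mean(axis=0)
    return protos, assign

def proto_nce(embeddings, protos, assign, temperature=0.1):
    """M-step objective: contrastive loss against all prototypes, with the
    assigned prototype as the positive (ProtoNCE-style sketch)."""
    logits = embeddings @ protos.T / temperature
    logits -= logits.max(axis=1, keepdims=True)                 # stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(embeddings)), assign].mean()
```

In the EM framing, the loss returned by `proto_nce` would drive the network update, after which `e_step` is rerun on fresh embeddings.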
-
Publication Number: US20210295091A1
Publication Date: 2021-09-23
Application Number: US16870621
Application Date: 2020-05-08
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework, iteratively performing an E-step, which finds prototypes through clustering, and an M-step, which optimizes the network on a contrastive loss.
-
Publication Number: US12210976B2
Publication Date: 2025-01-28
Application Number: US17219339
Application Date: 2021-03-31
Applicant: Salesforce.com, Inc.
Inventor: Hualin Liu , Chu Hong Hoi , Junnan Li
IPC: G06N3/084 , G06F18/214 , G06F18/22 , G06N3/088 , G06V10/75
Abstract: Embodiments described herein provide systems and methods for learning representation from unlabeled videos. Specifically, a method may comprise generating a set of strongly-augmented samples and a set of weakly-augmented samples from the unlabeled video samples; generating a set of predictive logits by inputting the set of strongly-augmented samples into a student model and a first teacher model; generating a set of artificial labels by inputting the set of weakly-augmented samples to a second teacher model that operates in parallel to the first teacher model, wherein the second teacher model shares one or more model parameters with the first teacher model; computing a loss objective based on the set of predictive logits and the set of artificial labels; updating student model parameters based on the loss objective via backpropagation; and updating the shared parameters for the first teacher model and the second teacher model based on the updated student model parameters.
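The final step above, updating the shared teacher parameters from the student, is commonly implemented as an exponential moving average; the sketch below assumes that choice (the abstract states only that the shared parameters are updated based on the student parameters), with hypothetical names.

```python
import numpy as np

def update_shared_teacher(student_params, teacher_params, momentum=0.99):
    """Update the parameters shared by both teacher models as an exponential
    moving average of the student parameters (assumed EMA update)."""
    return {
        name: momentum * teacher_params[name] + (1 - momentum) * student_params[name]
        for name in student_params
    }
```

Because both teacher models share these parameters, a single EMA update keeps them in sync while letting the student learn via backpropagation on the loss objective.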
-
Publication Number: US11776236B2
Publication Date: 2023-10-03
Application Number: US17591121
Application Date: 2022-02-02
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
IPC: G06K9/62 , G06V10/44 , G06T7/73 , G06F18/23 , G06F18/214 , G06V10/762 , G06V10/774 , G06V10/776 , G06V10/82
CPC classification number: G06V10/454 , G06F18/2155 , G06F18/23 , G06T7/73 , G06V10/763 , G06V10/776 , G06V10/7753 , G06V10/82 , G06T2207/20084
Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework, iteratively performing an E-step, which finds prototypes through clustering, and an M-step, which optimizes the network on a contrastive loss.
-
Publication Number: US20230154188A1
Publication Date: 2023-05-18
Application Number: US17566173
Application Date: 2021-12-30
Applicant: salesforce.com, inc.
Inventor: Dongxu Li , Junnan Li , Chu Hong Hoi
IPC: G06V20/40 , G06V10/74 , G06V10/26 , G06V10/80 , G06F40/284
CPC classification number: G06V20/41 , G06V10/761 , G06V20/47 , G06V10/26 , G06V10/806 , G06F40/284
Abstract: Embodiments described herein provide a method of video-text pre-training that effectively learns cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides a video-and-language pre-training framework that encodes the frames and the text independently, using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
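The independent encoding of frames and text described above is typically followed by an alignment score between the two modalities before cross-modal fusion. The sketch below stands in for that alignment step with mean-pooled frame embeddings and cosine similarity; it is a minimal illustration under those assumptions, not the multi-modal encoder itself.

```python
import numpy as np

def video_text_alignment(frame_embs, text_emb):
    """Score a (video, text) pair: mean-pool the per-frame embeddings and
    take cosine similarity with the text embedding (illustrative stand-in)."""
    v = frame_embs.mean(axis=0)          # pool sparse frame embeddings
    v = v / np.linalg.norm(v)
    t = text_emb / np.linalg.norm(text_emb)
    return float(v @ t)
```

In the framework the abstract describes, such alignment scores would feed a contrastive objective before the multi-modal encoder models finer-grained interactions, including the region-entity alignment targeted by prompting entity modeling.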