Systems and methods for open vocabulary object detection

    Publication No.: US12198453B2

    Publication date: 2025-01-14

    Application No.: US17587161

    Filing date: 2022-01-28

    Abstract: Embodiments described herein provide methods and systems for open vocabulary object detection in images. Given a pre-trained vision-language model and an image-caption pair, an activation map may be computed in the image that corresponds to an object of interest mentioned in the caption. The activation map is then converted into a pseudo bounding-box label for the corresponding object category. The open vocabulary detector is then directly supervised by these pseudo box labels, which enables training object detectors with no human-provided bounding-box annotations.
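The abstract does not say how the activation map is converted into a box. As a minimal sketch, assuming a simple rule that keeps every cell at or above a fraction of the peak activation and takes the tightest rectangle around those cells, the conversion might look like the following. The function name `activation_map_to_box` and the `threshold` parameter are hypothetical and are not taken from the patent.

```python
def activation_map_to_box(activation, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around strongly activated cells.

    `activation` is a non-empty 2D list of floats (rows of equal length).
    The thresholding rule is an assumption for illustration only.
    Returns None when the map has no positive activation.
    """
    peak = max(max(row) for row in activation)
    if peak <= 0:
        return None
    cutoff = threshold * peak
    # Collect (x, y) coordinates of every cell at or above the cutoff.
    coords = [
        (x, y)
        for y, row in enumerate(activation)
        for x, value in enumerate(row)
        if value >= cutoff
    ]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


example_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
pseudo_box = activation_map_to_box(example_map)
```

In this sketch the weak activation in the bottom-right corner falls below the cutoff, so it does not stretch the pseudo box.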

    SYSTEMS AND METHODS FOR ENSEMBLING SOFT PROMPTS IN FEW-SHOT FINE-TUNING OF LANGUAGE MODELS

    Publication No.: US20240070394A1

    Publication date: 2024-02-29

    Application No.: US18160967

    Filing date: 2023-01-27

    CPC classification number: G06F40/284 G06F40/40

    Abstract: Embodiments described herein provide a mechanism that ensembles trainable soft prompts to transfer knowledge from source tasks under few-shot learning settings. Specifically, given a source task input from a source task training dataset, a set of soft prompts may be trained using a frozen pre-trained language model (PLM) on the large-scale source task training dataset. Each soft prompt in the set is then prepended to a target task input, and for each prompted input the frozen PLM generates a set of logits for predicting the classification of the target task input. An attention module is used to generate input-logit attention scores, which are used to compute a weighted linear combination of the logits. The weighted linear combination provides the final logits used to predict the final classification of the target task input.
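The weighted combination of per-prompt logits can be illustrated with a small sketch. It assumes the attention scores are normalized with a softmax before weighting, which the abstract does not state, and it takes the scores as given rather than modeling the attention module itself. The names `softmax` and `ensemble_logits` are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(per_prompt_logits, attention_scores):
    """Combine one logit vector per soft prompt into a single logit vector.

    per_prompt_logits: list of logit vectors, one per soft prompt.
    attention_scores: one raw score per soft prompt (assumed precomputed).
    """
    weights = softmax(attention_scores)
    num_classes = len(per_prompt_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, per_prompt_logits))
        for c in range(num_classes)
    ]


# Two soft prompts, two classes; equal scores give a plain average.
final_logits = ensemble_logits([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])
predicted_class = max(range(len(final_logits)), key=lambda c: final_logits[c])
```

A prompt with a higher attention score pulls the final logits toward its own prediction, which is the intended effect of the ensemble.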

    DATABASE SYSTEMS AND METHODS OF NAMING RECORD GROUPS

    Publication No.: US20230092702A1

    Publication date: 2023-03-23

    Application No.: US17933396

    Filing date: 2022-09-19

    Abstract: Database systems and methods are provided for assigning structural metadata to records and creating automations using the structural metadata. One method of assigning structural metadata to a group of records involves: determining, based on one or more fields of metadata associated with the records, a plurality of candidate names, wherein each candidate name corresponds to semantic content of the one or more fields of a respective record of the group; for each candidate name, assigning a name relevance score based on the word relevance scores assigned, based on usage, to the words of that candidate name; selecting a candidate name in a manner that is influenced by the name relevance scores assigned to the candidate names; and automatically assigning a name to the group of records using the selected candidate name.
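The scoring and selection steps can be sketched as follows. The abstract says only that word relevance is "based on usage", so this sketch assumes a word's relevance is the number of candidate names that use it, and that a name's relevance is the average over its words. Both rules, the function name `select_group_name`, and the sample names are hypothetical.

```python
from collections import Counter


def select_group_name(candidate_names):
    """Pick a group name from a non-empty list of candidate names.

    Returns the selected name and the list of name relevance scores.
    """
    tokenized = [name.lower().split() for name in candidate_names]
    # Assumed usage measure: how many candidate names contain each word.
    usage = Counter(word for words in tokenized for word in set(words))

    def name_score(words):
        if not words:
            return 0.0
        return sum(usage[w] for w in words) / len(words)

    scores = [name_score(words) for words in tokenized]
    best_index = max(range(len(candidate_names)), key=lambda i: scores[i])
    return candidate_names[best_index], scores


candidates = ["Quarterly Sales Report", "Sales Report", "Team Offsite Notes"]
chosen_name, relevance_scores = select_group_name(candidates)
```

Averaging rather than summing keeps long candidate names from winning purely on length; that choice is part of the assumption, not of the source.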

    PARAMETER UTILIZATION FOR LANGUAGE PRE-TRAINING

    Publication No.: US20240330409A1

    Publication date: 2024-10-03

    Application No.: US18738628

    Filing date: 2024-06-10

    CPC classification number: G06F18/2148 G06F18/2163 G06F40/00

    Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines self-attention hidden states of the held-out model and the backward pass determines a loss of the held-out model. A forward pass is performed on the main model to determine self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine a loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
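The concatenation step alone can be sketched on plain nested lists. The abstract does not say along which axis the hidden states are joined, so this sketch exposes the choice as a hypothetical `axis` parameter: "feature" widens each token's vector, while "sequence" appends the held-out states as extra positions. The function name `concat_hidden_states` is also hypothetical, and the forward and backward passes themselves are not modeled here.

```python
def concat_hidden_states(main_states, held_out_states, axis="feature"):
    """Concatenate two [tokens][features] nested lists along the chosen axis.

    Which axis the patented method uses is not stated in the abstract;
    both options here are illustrative assumptions.
    """
    if axis == "feature":
        if len(main_states) != len(held_out_states):
            raise ValueError("feature concatenation needs the same number of tokens")
        # Each token keeps its position; its vector grows wider.
        return [m + h for m, h in zip(main_states, held_out_states)]
    if axis == "sequence":
        # Held-out states are appended as additional positions.
        return main_states + held_out_states
    raise ValueError(f"unknown axis: {axis}")


main = [[1.0, 2.0], [3.0, 4.0]]
held_out = [[5.0], [6.0]]
joined = concat_hidden_states(main, held_out)
```

Neither branch modifies its inputs, so the held-out states can be reused after the main model's forward pass.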
