CROSS-MODAL TRANSFER WITH CONTINUOUSLY WEIGHTED CONTRASTIVE LOSS

    Publication No.: US20240394592A1

    Publication date: 2024-11-28

    Application No.: US18434691

    Filing date: 2024-02-06

    Abstract: A method includes accessing a training dataset having multiple samples, where each sample includes a data point for each of multiple modalities. The method also includes generating, using a first encoder associated with a first modality of the multiple modalities, first modality embeddings for data points of the first modality in the training dataset. The method further includes, for each first modality embedding, determining a similarity metric to other first modality embeddings. The method also includes generating, using a second encoder associated with a second modality of the multiple modalities, second modality embeddings for data points of the second modality in the training dataset. In addition, the method includes training the second encoder based on a contrastive loss function to align the first modality embeddings and the second modality embeddings from different samples of the training dataset, where the contrastive loss function is weighted using the similarity metrics.
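
One way to picture a contrastive loss weighted by intra-modal similarity is the sketch below. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the cosine similarity, the softmax over similarities, the `temperature` value, and the function name `weighted_contrastive_loss` are all hypothetical. The first-modality embeddings are treated as fixed targets, and only the second-modality embeddings would receive gradients in a real training loop.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def softmax(row):
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_contrastive_loss(first_emb, second_emb, temperature=0.1):
    """Cross-entropy between soft targets and cross-modal predictions.

    Assumed form (not from the patent): the soft targets for sample i are a
    softmax over the similarity of first-modality embedding i to every
    first-modality embedding, so near-duplicates act as partial positives.
    """
    first = [normalize(v) for v in first_emb]
    second = [normalize(v) for v in second_emb]
    n = len(first)
    total = 0.0
    for i in range(n):
        # Continuous weights from first-modality (intra-modal) similarity.
        weights = softmax([dot(first[i], first[j]) / temperature for j in range(n)])
        # Predicted alignment of second-modality sample i to each first-modality sample.
        probs = softmax([dot(second[i], first[j]) / temperature for j in range(n)])
        total -= sum(w * math.log(p) for w, p in zip(weights, probs))
    return total / n

# Toy data: aligned pairs should score a lower loss than swapped pairs.
first = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = weighted_contrastive_loss(first, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = weighted_contrastive_loss(first, [[0.0, 1.0], [1.0, 0.0]])
```

With binary targets (weight 1 for the matching sample, 0 elsewhere) this reduces to a standard one-directional contrastive cross-entropy; the continuous weights are what let similar but non-matching samples contribute.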

    Method and system for detecting unsupported utterances in natural language understanding

    Publication No.: US11854528B2

    Publication date: 2023-12-26

    Application No.: US17402045

    Filing date: 2021-08-13

    CPC classification number: G10L15/02 G10L15/18

    Abstract: An apparatus for detecting unsupported utterances in natural language understanding includes a memory storing instructions, and at least one processor configured to execute the instructions to classify a feature that is extracted from an input utterance of a user, as one of in-domain and out-of-domain (OOD) for a response to the input utterance, obtain an OOD score of the extracted feature, and identify whether the feature is classified as OOD. The at least one processor is further configured to execute the instructions to, based on the feature being identified to be classified as in-domain, identify whether the obtained OOD score is greater than a predefined threshold, and based on the OOD score being identified to be greater than the predefined threshold, re-classify the feature as OOD.
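
The two-stage decision described in this abstract can be sketched as follows. The function name `final_domain`, the label strings, and the threshold value are hypothetical; the abstract does not say how the classifier or the OOD score is computed, so both are passed in as plain inputs.

```python
def final_domain(classified_label, ood_score, threshold):
    """Return the final domain label for one extracted feature.

    classified_label: "in-domain" or "OOD", from the first-stage classifier.
    ood_score: score obtained for the same feature (higher means more OOD-like).
    threshold: predefined cutoff; the value used below is illustrative only.
    """
    if classified_label == "OOD":
        return "OOD"  # already rejected by the classifier
    if ood_score > threshold:
        return "OOD"  # classified in-domain, but re-classified by the score check
    return "in-domain"

result_kept = final_domain("in-domain", ood_score=0.2, threshold=0.5)
result_flipped = final_domain("in-domain", ood_score=0.9, threshold=0.5)
```

The score check only ever moves a feature from in-domain to OOD, never the other way, which matches the order of steps in the abstract.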

    SMALL AND FAST TRANSFORMER MODEL FOR MULTI-MODAL OR OTHER TASKS

    Publication No.: US20230177338A1

    Publication date: 2023-06-08

    Application No.: US18073383

    Filing date: 2022-12-01

    CPC classification number: G06N3/082 G06V10/82 G06V10/772

    Abstract: A method includes obtaining, using a first electronic device, a weight matrix associated with a trained transformer model. The method also includes factorizing the weight matrix into a dictionary weight matrix and an intermediate matrix. The method further includes pruning the intermediate matrix to generate a sparse intermediate matrix. The method also includes fine-tuning the sparse intermediate matrix based on a training dataset to generate a fine-tuned sparse intermediate matrix. The method further includes determining an index matrix and a coefficient matrix based on the fine-tuned sparse intermediate matrix. In addition, the method includes deploying the dictionary weight matrix, the index matrix, and the coefficient matrix to a second electronic device without deploying the weight matrix to the second electronic device. A number of parameters in the dictionary weight matrix, the index matrix, and the coefficient matrix is smaller than a number of parameters in the weight matrix.
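
The storage format implied by the last steps (an index matrix plus a coefficient matrix derived from a sparse intermediate matrix) can be illustrated with a small sketch. It covers only pruning and the index/coefficient split; the factorization and fine-tuning steps are omitted. Magnitude-based top-k pruning per column, the helper names, and the list-of-lists layout are assumptions for illustration and are not taken from the patent.

```python
def prune_top_k(column, k):
    """Keep the k largest-magnitude entries of one column (assumed pruning rule)."""
    keep = sorted(range(len(column)), key=lambda r: abs(column[r]), reverse=True)[:k]
    rows = sorted(keep)
    return rows, [column[r] for r in rows]

def to_index_and_coeff(intermediate, k):
    """Split an m x n intermediate matrix into per-column row indices and coefficients."""
    m, n = len(intermediate), len(intermediate[0])
    index, coeff = [], []
    for c in range(n):
        rows, values = prune_top_k([intermediate[r][c] for r in range(m)], k)
        index.append(rows)
        coeff.append(values)
    return index, coeff

def reconstruct(dictionary, index, coeff):
    """Approximate the d x n weight matrix as dictionary (d x m) times the sparse matrix."""
    d, n = len(dictionary), len(index)
    out = [[0.0] * n for _ in range(d)]
    for c in range(n):
        for r, value in zip(index[c], coeff[c]):
            for row in range(d):
                out[row][c] += dictionary[row][r] * value
    return out

dictionary = [[1, 0, 2], [0, 1, 3]]                       # d=2, m=3
intermediate = [[0.1, 5.0], [4.0, 0.2], [0.3, -0.1]]      # m=3, n=2
index, coeff = to_index_and_coeff(intermediate, k=1)
approx_weight = reconstruct(dictionary, index, coeff)
```

Only `dictionary`, `index`, and `coeff` would need to be shipped to the second device in this picture; the dense weight matrix is rebuilt, or applied implicitly, from those three pieces.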

    Structured Pruning of Vision Transformer

    Publication No.: US20230073835A1

    Publication date: 2023-03-09

    Application No.: US17900126

    Filing date: 2022-08-31

    Abstract: In one embodiment, a method includes accessing a batch B of a plurality of images, wherein each image in the batch is part of a training set of images used to train a vision transformer comprising a plurality of attention heads. The method further includes determining, for each attention head A, a similarity between (1) the output of the attention head evaluated using each image in the batch and (2) the output of each attention head evaluated using each image in the batch. The method further includes determining, based on the determined similarities, an importance score for each attention head; and pruning, based on the importance scores, one or more attention heads from the vision transformer.
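
A minimal sketch of similarity-based head scoring follows. The abstract does not state how similarities are turned into importance scores, so this example assumes that a head whose output closely resembles the other heads' outputs is redundant and therefore less important. That rule, the cosine similarity, and the function names are all assumptions. Each head's output over the batch is represented as one flattened, nonzero vector.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def head_importance(head_outputs):
    """Score each head as 1 minus its mean cosine similarity to the other heads.

    Assumed scoring rule, not taken from the patent: high similarity to
    other heads is read as redundancy, hence low importance.
    """
    n = len(head_outputs)
    scores = []
    for a in range(n):
        sims = [cosine(head_outputs[a], head_outputs[b]) for b in range(n) if b != a]
        scores.append(1.0 - sum(sims) / len(sims))
    return scores

def heads_to_prune(head_outputs, num_prune):
    """Return the indices of the num_prune lowest-scoring heads."""
    scores = head_importance(head_outputs)
    return sorted(range(len(scores)), key=lambda h: scores[h])[:num_prune]

# Heads 0 and 1 produce nearly identical outputs; head 2 is distinct.
outputs = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
scores = head_importance(outputs)
pruned = heads_to_prune(outputs, num_prune=1)
```

In this toy case one of the two near-duplicate heads is selected for pruning, while the distinct head receives the highest score.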

    System and method for automating natural language understanding (NLU) in skill development

    Publication No.: US11501753B2

    Publication date: 2022-11-15

    Application No.: US16728672

    Filing date: 2019-12-27

    Abstract: A method includes receiving, from an electronic device, information defining a user utterance associated with a skill to be performed, where the skill is not recognized by a natural language understanding (NLU) engine. The method also includes receiving, from the electronic device, information defining one or more actions for performing the skill. The method further includes identifying, using at least one processor, one or more known skills having one or more slots that map to at least one word or phrase in the user utterance. The method also includes creating, using the at least one processor, a plurality of additional utterances based on the one or more mapped slots. In addition, the method includes training, using the at least one processor, the NLU engine using the plurality of additional utterances.
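
The step of creating additional utterances from mapped slots could look roughly like the sketch below. The slot dictionary, the substring matching, and the function name `augment_utterance` are hypothetical simplifications; the abstract does not describe how slots are matched or how replacement values are chosen.

```python
def augment_utterance(utterance, known_slots):
    """Generate extra training utterances by swapping mapped slot values.

    known_slots: hypothetical mapping of slot name -> list of known values.
    A slot "maps" here when one of its values appears as a substring of the
    utterance, which is a deliberate simplification.
    """
    extra = []
    for values in known_slots.values():
        for value in values:
            if value in utterance:
                for other in values:
                    if other != value:
                        extra.append(utterance.replace(value, other))
    return extra

slots = {"food": ["pizza", "burger", "salad"]}
generated = augment_utterance("order a pizza from Dominos", slots)
```

The generated utterances, together with the original, would then form the training set for the new skill.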

    Visual object instance segmentation using foreground-specialized model imitation

    Publication No.: US11430124B2

    Publication date: 2022-08-30

    Application No.: US16946504

    Filing date: 2020-06-24

    Abstract: A method includes training, using at least one processor, a specialized teacher model to perform visual object instance segmentation in order to segment and classify objects in first training images. The first training images contain foreground objects without backgrounds. The method also includes training, using the at least one processor, a student model to perform visual object instance segmentation in order to segment and classify objects in second training images. The second training images contain the foreground objects and the backgrounds. Training the student model includes using selected outputs of the specialized teacher model. The method further includes deploying the trained student model to perform visual object instance segmentation in an external device.

    On-device lightweight natural language understanding (NLU) continual learning

    Publication No.: US11423225B2

    Publication date: 2022-08-23

    Application No.: US16946746

    Filing date: 2020-07-02

    Abstract: A method includes obtaining, using at least one processor of an electronic device, a base model trained to perform natural language understanding. The method also includes generating, using the at least one processor, a first model expansion based on knowledge from the base model. The method further includes training, using the at least one processor, the first model expansion based on first utterances without modifying parameters of the base model. The method also includes receiving, using the at least one processor, an additional utterance from a user. In addition, the method includes determining, using the at least one processor, a meaning of the additional utterance using the base model and the first model expansion.
