Systems and methods for factual extraction from language model

    Publication number: US12112131B2

    Publication date: 2024-10-08

    Application number: US17588043

    Filing date: 2022-01-28

    CPC classification number: G06F40/279 G06F40/126 G06N3/044

    Abstract: Embodiments described herein provide a system and method for extracting factual information. The system transforms a query into a natural language prompt in a format of a query subject and a queried relation. The system encodes, via an embedding layer of a pre-trained language model, the natural language prompt into a first embedding. The system encodes, via an adapter model, the first embedding into a second embedding based on a probability that the second embedding returns the factual information when the second embedding is fed to the first attention layer of the pre-trained language model. The system decodes, by the first attention layer of the pre-trained language model, the second embedding into a response to the query. The system extracts the factual information from the decoded response to the query.
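The claimed pipeline (prompt construction, embedding, adapter re-encoding, decoding) can be sketched in miniature. This is an illustrative assumption of the flow, not the patent's implementation: the function names, the hash-based "embedding," and the linear "adapter" are all toy stand-ins.

```python
# Minimal sketch of the query -> prompt -> adapter -> decode flow.
# All components here are illustrative placeholders, not the patented models.

def build_prompt(subject: str, relation: str) -> str:
    """Render the query subject and queried relation as a natural language prompt."""
    return f"{subject} {relation} [MASK]."

def embed(prompt: str, dim: int = 4) -> list[float]:
    """Toy stand-in for the pre-trained model's embedding layer:
    hash each token into a fixed-size bag-of-words vector."""
    vec = [0.0] * dim
    for tok in prompt.split():
        vec[hash(tok) % dim] += 1.0
    return vec

def adapter(first_embedding: list[float], weight: float = 0.5) -> list[float]:
    """Toy adapter: re-encode the first embedding into a second embedding.
    In the patent, this mapping is trained so the frozen model's first
    attention layer is likely to decode the queried fact."""
    return [weight * x for x in first_embedding]

prompt = build_prompt("Paris", "is the capital of")
second_embedding = adapter(embed(prompt))
```

The key design point in the abstract is that only the adapter is trained; the embedding layer and attention layers of the pre-trained model stay frozen.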

    Systems and methods for text summarization

    Publication number: US12204847B2

    Publication date: 2025-01-21

    Application number: US17938572

    Filing date: 2022-10-06

    Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
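The perturb-then-edit scheme above can be sketched with toy stand-ins: a `perturber` that inserts extra entities into a compressed text, and an `editor` that removes content unsupported by the source document. Both function names and the word-overlap heuristic are assumptions for illustration; the patent trains neural models for these roles.

```python
# Toy sketch of the perturber/editor pair described in the abstract.
# Real versions are trained models; these are illustrative heuristics.

def perturber(compressed: str, entities: list[str]) -> str:
    """Insert the given information entities into the compressed text."""
    return compressed.rstrip(".") + " " + " ".join(entities) + "."

def editor(perturbed: str, source: str) -> str:
    """Remove tokens from the perturbed summary that are not supported
    by (i.e., do not appear in) the source document."""
    supported = set(source.split())
    kept = [t for t in perturbed.replace(".", "").split() if t in supported]
    return " ".join(kept) + "."

source = "The rover landed on Mars in 2021."
perturbed = perturber("The rover landed.", ["Jupiter"])
predicted = editor(perturbed, source)  # unsupported entity removed
```

Training then pushes the perturber to add plausible-but-unsupported content (first objective) and the editor to strip it back out conditioned on the source (second objective), so the editor learns to produce faithful summaries.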

    SYSTEMS AND METHODS FOR ENSEMBLING SOFT PROMPTS IN FEW-SHOT FINE-TUNING OF LANGUAGE MODELS

    Publication number: US20240070394A1

    Publication date: 2024-02-29

    Application number: US18160967

    Filing date: 2023-01-27

    CPC classification number: G06F40/284 G06F40/40

    Abstract: Embodiments described herein provide a mechanism that ensembles trainable soft prompts to transfer knowledge from source tasks under few-shot learning settings. Specifically, a set of soft prompts is trained using a frozen pre-trained language model (PLM) on a large-scale source task training dataset. Each soft prompt in the set is then prepended to a target task input, based on which the frozen PLM generates a respective set of logits for predicting the classification of the target task input. An attention module generates input-logit attention scores, which are used to compute a weighted linear combination of the logits. The weighted linear combination serves as the final logits for predicting the final classification of the target task input.
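The final ensembling step reduces to a softmax-weighted sum of per-prompt logit vectors. A minimal sketch, assuming three source-task prompts and a two-class target task; the attention scores are taken as given inputs here, whereas the patent computes them with a trained attention module:

```python
import math

def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_logits(per_prompt_logits: list[list[float]],
                    attention_scores: list[float]) -> list[float]:
    """Weighted linear combination of per-prompt logits, with weights
    given by softmax-normalized input-logit attention scores."""
    weights = softmax(attention_scores)
    n_classes = len(per_prompt_logits[0])
    return [sum(w * logits[c] for w, logits in zip(weights, per_prompt_logits))
            for c in range(n_classes)]

# Three soft prompts, two target classes; equal attention scores
# reduce the ensemble to a plain average of the logit vectors.
final = ensemble_logits([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
                        attention_scores=[0.0, 0.0, 0.0])
```

With equal scores each prompt gets weight 1/3, so `final` is the element-wise mean of the three logit vectors; unequal scores let the attention module favor source tasks most relevant to the target input.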

    SYSTEMS AND METHODS FOR TEXT SUMMARIZATION
    Invention publication

    Publication number: US20230419017A1

    Publication date: 2023-12-28

    Application number: US17938572

    Filing date: 2022-10-06

    CPC classification number: G06F40/166 G06F40/284 G06N20/00

    Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
