Generating commonsense context for text using knowledge graphs

    Publication No.: US12265792B2

    Publication date: 2025-04-01

    Application No.: US17526824

    Filing date: 2021-11-15

    Applicant: ADOBE INC.

    Abstract: Methods and systems are provided for facilitating generation and utilization of a commonsense contextualizing machine learning (ML) model. In embodiments, a commonsense contextualizing ML model is trained by fine-tuning a pre-trained language model using a set of training path-sentence pairs. Each training path-sentence pair includes a commonsense path, identified via a commonsense knowledge graph, and a natural language sentence identified as contextually related to the commonsense path. The trained commonsense contextualizing ML model can then be used to generate a commonsense inference path for a text input. Such a commonsense inference path can include a sequence of entities and relations that provide commonsense context to the text input. Thereafter, the commonsense inference path can be provided to a natural language processing system for use in performing a natural language processing task.
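
The path-sentence pairing described in this abstract can be illustrated with a small sketch. The abstract does not say how a sentence is identified as contextually related to a path, so the naive word-overlap matching below is an assumption, as are the toy triples, the relation names, and the helper names `find_paths`, `linearize_path`, and `make_pairs`.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Toy knowledge graph; the triples are illustrative, not taken from the patent.
TRIPLES: List[Triple] = [
    ("rain", "Causes", "wet_ground"),
    ("wet_ground", "Causes", "slipping"),
    ("umbrella", "UsedFor", "staying_dry"),
]

def find_paths(triples: List[Triple], max_hops: int = 2) -> List[List[Triple]]:
    """Enumerate chains of up to max_hops triples where each tail feeds the next head."""
    paths = [[t] for t in triples]
    frontier = list(paths)
    for _ in range(max_hops - 1):
        extended = []
        for path in frontier:
            for t in triples:
                if t[0] == path[-1][2] and t not in path:
                    extended.append(path + [t])
        paths += extended
        frontier = extended
    return paths

def linearize_path(path: List[Triple]) -> str:
    """Serialize a path as 'entity relation entity relation entity ...'."""
    tokens = [path[0][0]]
    for _head, relation, tail in path:
        tokens += [relation, tail]
    return " ".join(tokens)

def make_pairs(sentences: List[str], triples: List[Triple]) -> List[Tuple[str, str]]:
    """Pair each sentence with every path whose entities all occur in it (assumed heuristic)."""
    pairs = []
    for sentence in sentences:
        words = set(sentence.lower().replace(".", "").split())
        for path in find_paths(triples):
            entities = {path[0][0]} | {t[2] for t in path}
            # An entity such as "wet_ground" matches if both "wet" and "ground" appear.
            if all(set(e.split("_")) <= words for e in entities):
                pairs.append((linearize_path(path), sentence))
    return pairs

pairs = make_pairs(["The rain left the wet ground ready for slipping."], TRIPLES)
```

Pairs of this shape (linearized path, sentence) are the kind of input on which a pre-trained language model could be fine-tuned; the fine-tuning step itself is not shown here.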

    PERSONALIZED FORM ERROR CORRECTION PROPAGATION

    Publication No.: US20240362941A1

    Publication date: 2024-10-31

    Application No.: US18140143

    Filing date: 2023-04-27

    Applicant: Adobe Inc.

    CPC classification number: G06V30/274 G06V30/1444 G06V30/19147 G06V30/414

    Abstract: A corrective noise system receives an electronic version of a fillable form generated by a segmentation network and receives a correction to a segmentation error in the electronic version of the fillable form. The corrective noise system is trained to generate noise that represents the correction and superimpose the noise on the fillable form. The corrective noise system is further trained to identify regions in a corpus of forms that are semantically similar to a region that was subject to the correction. The generated noise is propagated to the semantically similar regions in the corpus of forms and the noisy corpus of forms is provided as input to the segmentation network. The noise causes the segmentation network to accurately identify fillable regions in the corpus of forms and output a segmented version of the corpus of forms having improved fidelity without retraining or otherwise modifying the segmentation network.
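
The propagation step in this abstract can be sketched in a few lines. In the patent both the noise and the similarity judgment come from a trained system; the version below substitutes a fixed noise vector and plain cosine similarity over hypothetical region embeddings, so it only illustrates the data flow. The names `propagate_noise`, `embedding`, `pixels`, and the `threshold` value are assumptions.

```python
import math
from typing import Dict, List

def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def propagate_noise(corrected_embedding: List[float],
                    noise: List[float],
                    regions: List[Dict],
                    threshold: float = 0.9) -> List[Dict]:
    """Superimpose the noise on every region that resembles the corrected region.

    Regions below the similarity threshold are copied unchanged. The segmentation
    network itself is never touched; only its input is perturbed.
    """
    result = []
    for region in regions:
        if cosine(corrected_embedding, region["embedding"]) >= threshold:
            pixels = [p + n for p, n in zip(region["pixels"], noise)]
        else:
            pixels = list(region["pixels"])
        result.append({**region, "pixels": pixels})
    return result

regions = [
    {"id": "form_a/field_3", "embedding": [1.0, 0.0], "pixels": [1.0, 1.0]},
    {"id": "form_b/logo", "embedding": [0.0, 1.0], "pixels": [1.0, 1.0]},
]
noisy = propagate_noise([1.0, 0.1], [0.5, -0.5], regions)
```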

    SYSTEMS AND METHODS FOR GENERATING SYNTHETIC TABULAR DATA FOR MACHINE LEARNING AND OTHER APPLICATIONS

    Publication No.: US20240330682A1

    Publication date: 2024-10-03

    Application No.: US18295094

    Filing date: 2023-04-03

    Applicant: Adobe Inc.

    CPC classification number: G06N3/08 G06N3/0455

    Abstract: Systems and methods for generating synthetic tabular data for machine learning and other applications are provided. In some embodiments, a variational autoencoder is trained to learn inter-feature correlations found in tabular data collected from real data sources. The trained variational autoencoder is used to train a generator model of a Generative Adversarial Network (GAN) to generate synthetic tabular data that exhibits the inter-feature correlation distribution found in the tabular data collected from real data sources. In some embodiments, processing devices perform operations comprising: receiving a set of tabular data records, each record comprising a plurality of features; training a first machine learning model using the tabular data records to learn correlations between the plurality of features; and training a second machine learning model, using the first machine learning model, to generate synthetic tabular data records based at least on the correlations between the plurality of features.
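
The abstract's goal, synthetic records that preserve inter-feature correlations, suggests a simple check that is sketched below. This is not the VAE or GAN training described in the patent; it is an assumed evaluation helper that compares Pearson correlation matrices of real and synthetic records. The function names and the mean-absolute-difference score are illustrative choices.

```python
import math
from typing import List, Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def correlation_matrix(records: List[Sequence[float]]) -> List[List[float]]:
    """Feature-by-feature correlation matrix for row-oriented records."""
    columns = list(zip(*records))
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

def correlation_gap(real: List[Sequence[float]], synthetic: List[Sequence[float]]) -> float:
    """Mean absolute difference between the two correlation matrices (0 = identical)."""
    cr, cs = correlation_matrix(real), correlation_matrix(synthetic)
    k = len(cr)
    return sum(abs(cr[i][j] - cs[i][j]) for i in range(k) for j in range(k)) / (k * k)

real = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]          # features move together
faithful = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0)]  # same correlation structure
inverted = [(1.0, 6.0), (2.0, 4.0), (3.0, 2.0)]        # correlation flipped
gap_faithful = correlation_gap(real, faithful)
gap_inverted = correlation_gap(real, inverted)
```

A generator that reproduces the real correlation structure would score near zero on such a measure, while one that ignores or reverses the correlations would score higher.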

    FORM STRUCTURE SIMILARITY DETECTION
    Document type: Published application

    Publication No.: US20240330351A1

    Publication date: 2024-10-03

    Application No.: US18190686

    Filing date: 2023-03-27

    Applicant: Adobe Inc.

    CPC classification number: G06F16/383 G06F16/332 G06V30/19147 G06V30/412

    Abstract: Form structure similarity detection techniques are described. A content processing system, for instance, receives a query snippet that depicts a query form structure. The content processing system generates a query layout string that includes semantic indicators to represent the query form structure and generates candidate layout strings that represent form structures from a target document. The content processing system calculates similarity scores between the query layout string and the candidate layout strings. Based on the similarity scores, the content processing system generates a target snippet for display that depicts a form structure that is structurally similar to the query form structure. The content processing system is further operable to generate a training dataset that includes image pairs of snippets depicting form structures that are structurally similar. The content processing system utilizes the training dataset to train a machine learning model to perform form structure similarity matching.
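
The layout-string comparison in this abstract can be sketched as follows. The abstract does not specify the semantic indicators or the similarity measure, so the single-character codes and the use of `difflib.SequenceMatcher` from the Python standard library are assumptions made for illustration, as are the names `layout_string` and `rank_candidates`.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical semantic indicators: one character per form element type.
CODES = {"label": "L", "text_field": "T", "checkbox": "C", "row_break": "|"}

def layout_string(elements: List[str]) -> str:
    """Encode a form structure (reading order of element types) as a layout string."""
    return "".join(CODES[e] for e in elements)

def rank_candidates(query: List[str],
                    candidates: List[List[str]]) -> List[Tuple[float, int]]:
    """Return (similarity score, candidate index) pairs, most similar first."""
    q = layout_string(query)
    scored = [
        (SequenceMatcher(None, q, layout_string(c)).ratio(), i)
        for i, c in enumerate(candidates)
    ]
    return sorted(scored, reverse=True)

query = ["label", "text_field", "row_break", "label", "text_field"]
candidates = [
    ["label", "checkbox", "row_break", "label", "checkbox"],
    ["label", "text_field", "row_break", "label", "text_field"],
]
ranking = rank_candidates(query, candidates)
best_score, best_index = ranking[0]
```

The top-ranked candidate would then be the one rendered as the target snippet; a learned similarity model, as the abstract mentions, could replace the string ratio.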

    Systems and methods of training neural networks against adversarial attacks

    Publication No.: US11468314B1

    Publication date: 2022-10-11

    Application No.: US16129553

    Filing date: 2018-09-12

    Applicant: ADOBE INC.

    Abstract: Embodiments disclosed herein describe systems, methods, and products that generate trained neural networks that are robust against adversarial attacks. During a training phase, an illustrative computer may iteratively optimize a loss function that may include a penalty for ill-conditioned weight matrices in addition to a penalty for classification errors. Therefore, after the training phase, the trained neural network may include one or more well-conditioned weight matrices. The one or more well-conditioned weight matrices may minimize the effect of perturbations within an adversarial input thereby increasing the accuracy of classification of the adversarial input. By contrast, conventional training approaches may merely reduce the classification errors using backpropagation, and, as a result, any perturbation in an input is prone to generate a large effect on the output.
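
The combined loss in this abstract, a classification term plus a penalty for ill-conditioned weight matrices, can be sketched numerically. The abstract does not give the penalty's form, so the log of the condition number and the weight `lam` are assumptions; the example is also restricted to 2x2 matrices so that the singular values have a closed form and no external library is needed.

```python
import math
from typing import List

Matrix2 = List[List[float]]

def condition_number_2x2(w: Matrix2) -> float:
    """Ratio of largest to smallest singular value of a 2x2 matrix.

    The squared singular values are the eigenvalues of W^T W, whose trace is the
    squared Frobenius norm of W and whose determinant is det(W)^2.
    """
    (a, b), (c, d) = w
    fro2 = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(fro2 * fro2 - 4.0 * det * det, 0.0))
    s_max = math.sqrt((fro2 + disc) / 2.0)
    s_min = math.sqrt(max((fro2 - disc) / 2.0, 0.0))
    return float("inf") if s_min == 0.0 else s_max / s_min

def total_loss(classification_loss: float, weights: List[Matrix2], lam: float = 0.01) -> float:
    """Classification loss plus lam * sum of log condition numbers (assumed penalty form)."""
    penalty = sum(math.log(condition_number_2x2(w)) for w in weights)
    return classification_loss + lam * penalty

identity = [[1.0, 0.0], [0.0, 1.0]]    # perfectly conditioned, penalty 0
stretched = [[4.0, 0.0], [0.0, 1.0]]   # condition number 4
loss_identity = total_loss(0.5, [identity])
loss_stretched = total_loss(0.5, [stretched])
```

A well-conditioned matrix adds nothing to the loss, while a matrix that stretches one direction far more than another is penalized, which matches the abstract's intuition that well-conditioned weights limit how much an input perturbation is amplified.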
