RESPONDING TO HALLUCINATIONS IN GENERATIVE LARGE LANGUAGE MODELS

    Publication number: US20250094866A1

    Publication date: 2025-03-20

    Application number: US18678914

    Application date: 2024-05-30

    Abstract: Techniques for correcting hallucinations produced by generative large language models (LLMs). In one technique, a computing system accesses first output generated by an LLM. The computing system identifies, within the first output, a plurality of assertions. The computing system determines that a first assertion in the plurality of assertions is false. The computing system generates a prompt that indicates that the first assertion is false. The computing system submits the prompt as input to the LLM. The computing system accesses second output that is generated by the LLM, where the second output includes a second assertion that is different from the first assertion and corresponds to the first assertion.
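The check-and-reprompt loop the abstract describes could be sketched as follows. This is a minimal illustration, not the patented implementation: `llm` and `fact_checker` are hypothetical callables standing in for a generative LLM and a fact-checking component, and the sentence-splitting heuristic is an assumption.

```python
# Hypothetical sketch of the correction loop described in the abstract.
# `llm` and `fact_checker` are stand-in callables, not real APIs.

def split_assertions(output):
    """Split LLM output into individual assertions (naively, by sentence)."""
    return [s.strip() for s in output.split(".") if s.strip()]

def correct_hallucinations(llm, fact_checker, prompt, max_rounds=3):
    """Re-prompt the LLM, flagging false assertions, until none remain."""
    output = llm(prompt)
    for _ in range(max_rounds):
        false_claims = [a for a in split_assertions(output) if not fact_checker(a)]
        if not false_claims:
            break
        # Build a follow-up prompt that indicates which assertions are false.
        feedback = "The following statements are false: " + "; ".join(false_claims)
        output = llm(prompt + "\n" + feedback)
    return output
```

The second output then contains replacement assertions corresponding to the ones flagged as false.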

    FINE-TUNING A LARGE LANGUAGE MODEL (LLM) TO REDUCE THE INSTABILITY OF LLM OUTPUTS TO VARIATIONS IN PROMPTS

    Publication number: US20250094814A1

    Publication date: 2025-03-20

    Application number: US18824570

    Application date: 2024-09-04

    Abstract: Techniques are provided for fine-tuning large language models (LLMs) to reduce the instability of LLM outputs to variations in prompts. In one technique, a plurality of prompts is stored. For each prompt of the plurality of prompts, a plurality of variants of that prompt is generated. A prompt-generating LLM is fine-tuned based on that prompt and the plurality of variants. Each variant-prompt association (where the variant is generated based on the prompt and has an identical or similar meaning) is a training sample that is used to train or fine-tune the prompt-generating LLM. The prompt-generating LLM is configured to generate standardized prompts based on input prompts. In another technique, a response-generating LLM is fine-tuned based on sets of training samples, each training sample in a set comprising a different variant of a prompt and a response that the response-generating LLM generated based on the prompt.
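The variant-prompt pairing step might look like the sketch below. The `toy_variants` generator is a placeholder assumption; the abstract implies a real system would generate paraphrases (e.g., with an LLM) rather than these trivial rewrites.

```python
def build_variant_samples(prompts, make_variants):
    """Pair each generated variant with its source prompt; each pair is a
    training sample teaching a prompt-generating LLM to map free-form
    variants back to a standardized prompt."""
    samples = []
    for prompt in prompts:
        for variant in make_variants(prompt):
            samples.append({"input": variant, "target": prompt})
    return samples

# Toy variant generator (hypothetical; a real system would paraphrase).
def toy_variants(prompt):
    return [prompt.lower(), prompt + " please"]
```

Fine-tuning on such pairs is what lets the model emit the same standardized prompt for differently worded inputs.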

    Techniques for dynamic time-based custom model generation

    Publication number: US12230020B2

    Publication date: 2025-02-18

    Application number: US17586583

    Application date: 2022-01-27

    Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental interval time of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.
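A mutating genetic search under a time budget, as the abstract mentions, could be sketched roughly like this. All details here (population size, single-gene mutation, the toy search space) are illustrative assumptions, not the patented algorithm.

```python
import random
import time

def tune_hyperparams(evaluate, space, time_budget_s, pop_size=8, seed=0):
    """Mutating genetic search: evolve hyperparameter settings until the
    time budget is spent, keeping the best-scoring configuration."""
    rng = random.Random(seed)
    sample = lambda: {k: rng.choice(v) for k, v in space.items()}
    pop = [sample() for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        scored = sorted(pop, key=evaluate, reverse=True)
        top_score = evaluate(scored[0])
        if top_score > best_score:
            best, best_score = scored[0], top_score
        # Keep the top half; mutate one "gene" of each survivor to refill.
        survivors = scored[: pop_size // 2]
        mutants = []
        for parent in survivors:
            child = dict(parent)
            key = rng.choice(list(space))
            child[key] = rng.choice(space[key])
            mutants.append(child)
        pop = survivors + mutants
    return best
```

In the described service, `evaluate` would train briefly on the subsampled data, so the whole search fits inside the time-based constraint.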

    TECHNIQUES FOR DYNAMIC TIME-BASED CUSTOM MODEL GENERATION

    Publication number: US20230237787A1

    Publication date: 2023-07-27

    Application number: US17586583

    Application date: 2022-01-27

    CPC classification number: G06V10/82 G06N3/086 G06V10/774 G06V10/7788

    Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental interval time of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.

    TECHNIQUES FOR DYNAMIC TIME-BASED CUSTOM MODEL GENERATION

    Publication number: US20250157210A1

    Publication date: 2025-05-15

    Application number: US19022830

    Application date: 2025-01-15

    Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental interval time of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.

    Vision-based document language identification by joint supervision

    Publication number: US12249170B2

    Publication date: 2025-03-11

    Application number: US17897055

    Application date: 2022-08-26

    Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
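The joint supervision signal, where a shared CNN/RNN backbone is trained against both a language label and per-character text labels, can be reduced to a combined loss. This is a minimal pure-Python sketch under the assumption of simple per-position cross-entropy; the actual model would use a deep-learning framework and likely a sequence loss such as CTC.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, label):
    """Negative log-likelihood of the true class under the softmax."""
    return -math.log(softmax(logits)[label])

def joint_loss(lang_logits, lang_label, char_logits_seq, char_labels, w=1.0):
    """Joint supervision: sum a language-ID loss and a per-character
    text-recognition loss, so shared features serve both predictions."""
    lang_loss = cross_entropy(lang_logits, lang_label)
    text_loss = sum(cross_entropy(l, y)
                    for l, y in zip(char_logits_seq, char_labels))
    return lang_loss + w * text_loss
```

Backpropagating this single scalar through both heads is what trains the CNN and RNN jointly.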

    Application performance monitoring for monolithic applications and distributed systems

    Publication number: US12099436B2

    Publication date: 2024-09-24

    Application number: US18157750

    Application date: 2023-01-20

    CPC classification number: G06F11/3664 G06F11/3612

    Abstract: A computing device may access target code for implementing an application. The device may identify addresses for one or more functions or one or more variables associated with the target code. The device may generate an interval tree comprising a root node and one or more function nodes. In response to the target code invoking a function or variable, the device may generate an intercept function configured to intercept communication between the target code and a call address for the invoked function or variable. The device may intercept data communicated between the target code and the call address. The device may store the intercepted data as a function node in the interval tree. The device may transmit the interval tree to a user device.
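The intercept-and-record pattern might be sketched as below. This is a simplified illustration: real monitoring would hook call addresses in compiled code, whereas here a Python wrapper stands in for the intercept function, and children are kept as a flat list rather than a balanced interval tree.

```python
import time

class Node:
    """A tree node keyed by a time interval, carrying intercepted data."""
    def __init__(self, name, start, end, data):
        self.name, self.start, self.end, self.data = name, start, end, data
        self.children = []

class InterceptTree:
    """Root node plus one function node per intercepted call (simplified)."""
    def __init__(self):
        self.root = Node("root", 0.0, float("inf"), None)

    def intercept(self, fn):
        """Wrap `fn` so each call's arguments, result, and time interval
        are recorded as a function node under the root."""
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = fn(*args, **kwargs)
            end = time.monotonic()
            self.root.children.append(
                Node(fn.__name__, start, end,
                     {"args": args, "result": result}))
            return result
        return wrapper
```

The populated tree is then what gets transmitted to the user device for inspection.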

    MULTI-STAGE MACHINE LEARNING MODEL TRAINING FOR KEY-VALUE EXTRACTION

    Publication number: US20240221407A1

    Publication date: 2024-07-04

    Application number: US18149795

    Application date: 2023-01-04

    Abstract: Techniques for multi-stage training of a machine learning model to extract key-value pairs from documents are disclosed. A system trains a machine learning model using a set of training data including unlabeled documents of various document categories. The initial stage identifies relationships among tokens, or words, numbers, and punctuation, in documents. The system re-trains the machine learning model using a set of training data which includes a particular category of documents while excluding other categories of documents. The second training stage is a supervised machine learning stage in which the training data is labeled to identify key-value pairs in the documents. In the initial training stage, the system sets parameters of the machine learning model to an initial state. In the second stage, the system modifies the parameters of the machine learning model based on the characteristics of the training data set including the documents of the particular category.
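The two-stage flow can be sketched as follows. `ToyModel` is a hypothetical stand-in that merely records which documents each stage saw; the real stages would be self-supervised pre-training over all categories followed by supervised fine-tuning on labeled key-value pairs of one category.

```python
class ToyModel:
    """Stand-in model that records which documents each stage consumed."""
    def __init__(self):
        self.pretrain_docs, self.finetune_docs = [], []
    def init_params(self, docs):      # stage 1: set parameters to initial state
        self.pretrain_docs = list(docs)
    def fine_tune(self, docs):        # stage 2: modify parameters per category
        self.finetune_docs = list(docs)

def multi_stage_train(model, all_docs, labeled_docs, category):
    """Stage 1: initialize from unlabeled documents of every category.
    Stage 2: fine-tune on labeled documents of one category only."""
    model.init_params(all_docs)
    in_category = [d for d in labeled_docs if d["category"] == category]
    model.fine_tune(in_category)
    return model
```

Filtering the second stage to a single document category is what specializes the extractor after the broad first stage.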

    Techniques for graph data structure augmentation

    Publication number: US11989964B2

    Publication date: 2024-05-21

    Application number: US17524157

    Application date: 2021-11-11

    CPC classification number: G06V30/41 G06N20/00 G06V30/18181

    Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, a set of evaluation metrics may be generated for the model and compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
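The augment-until-deployable loop could be sketched like this. The callables `train`, `evaluate`, and `augment` are hypothetical placeholders for model training, metric computation, and graph augmentation respectively.

```python
def train_until_deployable(train, evaluate, augment, graphs, thresholds,
                           max_rounds=5):
    """Retrain on a growing graph set until every evaluation metric meets
    its deployment threshold (or the round budget runs out)."""
    model = train(graphs)
    for _ in range(max_rounds):
        metrics = evaluate(model)
        if all(metrics[k] >= thresholds[k] for k in thresholds):
            break
        # Derive new graphs from the initial ones and retrain on both sets.
        graphs = graphs + augment(graphs)
        model = train(graphs)
    return model
```

For example, with `train=len` (the "model" is just the graph count), `evaluate` mapping that count to an F1 score, and `augment` doubling the graph set, the loop stops once the threshold is met.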

    AUTOMATED GENERATION OF TRAINING DATA COMPRISING DOCUMENT IMAGES AND ASSOCIATED LABEL DATA

    Publication number: US20230316792A1

    Publication date: 2023-10-05

    Application number: US17692844

    Application date: 2022-03-11

    CPC classification number: G06V30/19147 G06N20/00 G06V30/1916

    Abstract: Techniques are described for automatically, and substantially without human intervention, generating training data where the training data includes a set of training images containing text content and associated label data. Both the training images and the associated label data are automatically generated. The label data that is automatically generated for a training image includes one or more labels identifying locations of one or more text portions within the training image, and for each text portion, a label indicative of the text content in the text portion. By automating both the generation of training images and the generation of associated label data, the techniques described herein are very scalable and repeatable and can be used to generate large amounts of training data, which in turn enables building more reliable and accurate language models.
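Because the generator places the text itself, the label data comes for free. The sketch below illustrates that idea under simplifying assumptions: a fixed-width glyph size and one phrase per line stand in for real image rendering, and the emitted labels pair each text portion's bounding box with its ground-truth content.

```python
import random

CHAR_W, CHAR_H = 8, 16   # assumed fixed glyph size, in pixels

def make_training_sample(phrases, canvas_w=400, line_gap=4, seed=0):
    """Generate label data for one synthetic page: place each phrase on
    its own line at a random x offset, and emit, per text portion, its
    bounding box and the text it contains."""
    rng = random.Random(seed)
    labels, y = [], line_gap
    for text in phrases:
        x = rng.randrange(0, max(1, canvas_w - len(text) * CHAR_W))
        labels.append({"bbox": (x, y, x + len(text) * CHAR_W, y + CHAR_H),
                       "text": text})
        y += CHAR_H + line_gap
    return labels
```

A renderer would draw the same phrases at the same coordinates to produce the matching training image, so image and labels stay consistent by construction, which is what makes the approach scalable and repeatable.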
