-
Publication number: US20250094866A1
Publication date: 2025-03-20
Application number: US18678914
Application date: 2024-05-30
Applicant: Oracle International Corporation
Inventor: Zheng Wang , Yazhe Hu , Mengqing Guo , Tao Sheng , Jun Qian , Vinod Murli Mamtani
IPC: G06N20/00
Abstract: Techniques for correcting hallucinations produced by generative large language models (LLMs). In one technique, a computing system accesses first output generated by an LLM. The computing system identifies, within the first output, a plurality of assertions. The computing system determines that a first assertion in the plurality of assertions is false. The computing system generates a prompt that indicates that the first assertion is false. The computing system submits the prompt as input to the LLM. The computing system accesses second output that is generated by the LLM, where the second output includes a second assertion that is different from the first assertion and corresponds to the first assertion.
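A minimal sketch of the check-and-revise loop the abstract describes. Everything here is a stand-in, not the patented implementation: `llm` represents the model endpoint, `is_true` represents the falsity check, and sentence splitting stands in for assertion extraction.

```python
# Hypothetical sketch of the hallucination-correction loop; a real system
# would call an LLM API and a retrieval-based fact checker.

def split_assertions(output: str) -> list[str]:
    """Naively treat each sentence as one assertion."""
    return [s.strip() for s in output.split(".") if s.strip()]

def correct_hallucinations(llm, is_true, prompt: str, max_rounds: int = 3) -> str:
    """Iteratively ask the LLM to revise any assertion judged false."""
    output = llm(prompt)
    for _ in range(max_rounds):
        false_assertions = [a for a in split_assertions(output) if not is_true(a)]
        if not false_assertions:
            break  # every assertion verified; stop re-prompting
        # Build a prompt that flags the false assertion(s) and request a revision.
        feedback = "The following statements are false: " + "; ".join(false_assertions)
        output = llm(prompt + "\n" + feedback + "\nPlease correct them.")
    return output
```

The revised output replaces the false assertion with a corresponding corrected one, matching the second-output behavior in the abstract.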
-
Publication number: US20250094814A1
Publication date: 2025-03-20
Application number: US18824570
Application date: 2024-09-04
Applicant: Oracle International Corporation
Inventor: Zheng Wang , Yazhe Hu , Mengqing Guo , Tao Sheng , Jun Qian , Vinod M Mamtani
IPC: G06N3/0895
Abstract: Techniques are provided for fine-tuning large language models (LLMs) to reduce the instability of LLM outputs across prompt variations. In one technique, a plurality of prompts is stored. For each prompt of the plurality of prompts, a plurality of variants of that prompt is generated. A prompt generating LLM is fine-tuned based on that prompt and the plurality of variants. Each variant-prompt association (where the variant is generated based on the prompt and has an identical or similar meaning) is a training sample that is used to train or fine-tune the prompt generating LLM. The prompt generating LLM is configured to generate standardized prompts based on input prompts. In another technique, a response generating LLM is fine-tuned based on sets of training samples, each training sample in a set comprising a different variant of a prompt and a response that the response generating LLM generated based on the prompt.
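The variant-prompt training samples described above can be sketched as follows. This is an illustrative assumption, not Oracle's implementation: `make_variants` is a toy paraphraser standing in for an LLM-based variant generator.

```python
# Build (variant -> canonical prompt) pairs for fine-tuning a
# prompt-standardizing LLM, as the abstract describes.

def make_variants(prompt: str) -> list[str]:
    """Toy variant generator; a real system would paraphrase with an LLM."""
    return [prompt.lower(), prompt.upper(), "Please: " + prompt]

def build_training_samples(prompts: list[str]) -> list[tuple[str, str]]:
    """Each (variant, canonical prompt) pair is one fine-tuning sample."""
    samples = []
    for p in prompts:
        for v in make_variants(p):
            samples.append((v, p))  # input: variant, target: standardized prompt
    return samples
```

Fine-tuning on such pairs teaches the prompt generating LLM to map any variant back to a standardized form.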
-
Publication number: US12230020B2
Publication date: 2025-02-18
Application number: US17586583
Application date: 2022-01-27
Applicant: Oracle International Corporation
Inventor: Olaitan Olaleye , Arunjeyan T V Seshier Venkatachalapathy , Jinghou Zhang , Jun Qian
IPC: G06V10/82 , G06N3/086 , G06V10/774 , G06V10/778
Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental time interval of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.
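A hedged sketch of a mutating genetic algorithm for hyperparameter tuning, in the spirit of the abstract. The population size, mutation scheme, and fitness function are assumptions for illustration; the patented algorithm's modifications are not specified here.

```python
import random

def tune(fitness, space, pop_size=8, generations=10, seed=0):
    """Evolve hyperparameter dicts: keep the best half, mutate to refill."""
    rng = random.Random(seed)
    sample = lambda: {k: rng.choice(v) for k, v in space.items()}
    pop = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]       # elitist selection
        children = []
        for parent in survivors:
            child = dict(parent)
            k = rng.choice(list(space))        # mutate one hyperparameter
            child[k] = rng.choice(space[k])
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)
```

In the patented flow, the fitness evaluation would itself be time-bounded experimental training on the subsampled data, so the tuning fits inside the overall time constraint.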
-
Publication number: US20230237787A1
Publication date: 2023-07-27
Application number: US17586583
Application date: 2022-01-27
Applicant: Oracle International Corporation
Inventor: Olaitan Olaleye , Arunjeyan T V Seshier Venkatachalapathy , Jinghou Zhang , Jun Qian
IPC: G06V10/82 , G06N3/08 , G06V10/774 , G06V10/778
CPC classification number: G06V10/82 , G06N3/086 , G06V10/774 , G06V10/7788
Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental time interval of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.
-
Publication number: US20250157210A1
Publication date: 2025-05-15
Application number: US19022830
Application date: 2025-01-15
Applicant: Oracle International Corporation
Inventor: Olaitan Olaleye , Arunjeyan T V Seshier Venkatachalapathy , Jinghou Zhang , Jun Qian
IPC: G06V10/82 , G06N3/086 , G06V10/774 , G06V10/778
Abstract: Techniques are disclosed for dynamic time-based custom model generation as part of an infrastructure-as-a-service (IaaS) environment. A custom model generation service may receive a set of training data and a time-based constraint for training a machine learning model. The custom model generation service may subsample the training data and generate a set of optimized tuned hyperparameters for a machine learning model to be trained using the subsampled training data. An experimental time interval of training is determined, and the machine learning model is trained on the subsampled training data according to the optimized tuned hyperparameters over a set of training intervals similar to the experimental time interval. A customized machine learning model trained within the time-based constraint is output. The hyperparameter tuning may be performed using a modified mutating genetic algorithm for a set of hyperparameters to determine the optimized tuned hyperparameters prior to the training.
-
Publication number: US12249170B2
Publication date: 2025-03-11
Application number: US17897055
Application date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong , Yuying Wang , Zhonghai Deng , Iman Zadeh , Jun Qian
IPC: G06F40/263 , G06V10/82 , G06V30/246
Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
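The joint training objective described above can be illustrated with a minimal sketch. The loss form and weights are assumptions: the idea is simply that one objective sums a per-line language-classification loss and a per-line text-content loss, so the CNN and RNN are optimized together end to end.

```python
import math

def cross_entropy(probs, label_index):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label_index])

def joint_loss(lang_probs, lang_label, text_losses, alpha=1.0, beta=1.0):
    """Combine language loss and text-recognition loss for one text line."""
    return alpha * cross_entropy(lang_probs, lang_label) + beta * sum(text_losses)
```

Backpropagating this combined loss through both sub-networks is what makes the architecture trainable as a single unified model rather than two separately trained stages.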
-
Publication number: US12099436B2
Publication date: 2024-09-24
Application number: US18157750
Application date: 2023-01-20
Applicant: Oracle International Corporation
Inventor: Fuheng Wu , Ivan Dimitrov Davchev , Jun Qian
CPC classification number: G06F11/3664 , G06F11/3612
Abstract: A computing device may access a target code for implementing an application. The device may identify addresses for one or more functions or one or more variables associated with the target code. The device may generate an interval tree comprising a root node and one or more function nodes. In response to the target code invoking a function or variable, the device may generate an intercept function configured to intercept communication between the target code and a call address for the invoked function or variable. The device may intercept data communicated between the target code and the call address. The device may store the intercepted data as a function node in the interval tree. The device may transmit the interval tree to a user device.
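The intercept-and-record idea can be sketched as follows. This is an illustrative analogy, not the patented mechanism: a Python wrapper stands in for the intercept function at a call address, and a flat list of timestamped nodes stands in for the interval tree.

```python
import time

class Node:
    """One recorded call: name, arguments, result, and its time interval."""
    def __init__(self, name, args, result, start, end):
        self.name, self.args, self.result = name, args, result
        self.interval = (start, end)

class TraceTree:
    def __init__(self):
        # Root node spans all time; function nodes hang off it.
        self.root = Node("root", (), None, 0.0, float("inf"))
        self.children = []

    def intercept(self, func):
        """Return a wrapper that records each call as a function node."""
        def wrapper(*args):
            start = time.monotonic()
            result = func(*args)            # forward to the real call target
            end = time.monotonic()
            self.children.append(Node(func.__name__, args, result, start, end))
            return result
        return wrapper
```

The recorded intervals are what would let a real interval tree answer "which calls were active at time t" when the trace is inspected on the user device.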
-
Publication number: US20240221407A1
Publication date: 2024-07-04
Application number: US18149795
Application date: 2023-01-04
Applicant: Oracle International Corporation
Inventor: Yazhe Hu , Jeaff Wang , Mengqing Guo , Tao Sheng , Jun Qian
IPC: G06V30/19 , G06F40/284 , G06F40/30 , G06N3/08 , G06V30/14
CPC classification number: G06V30/19147 , G06F40/284 , G06F40/30 , G06N3/08 , G06V30/1448
Abstract: Techniques for multi-stage training of a machine learning model to extract key-value pairs from documents are disclosed. A system trains a machine learning model using a set of training data including unlabeled documents of various document categories. The initial stage identifies relationships among tokens (words, numbers, and punctuation) in documents. The system re-trains the machine learning model using a set of training data which includes a particular category of documents while excluding other categories of documents. The second training stage is a supervised machine learning stage in which the training data is labeled to identify key-value pairs in the documents. In the initial training stage, the system sets parameters of the machine learning model to an initial state. In the second stage, the system modifies the parameters of the machine learning model based on the characteristics of the training data set including the documents of the particular category.
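The two stages can be sketched as a small orchestration function. `pretrain` and `finetune` are hypothetical stand-ins for the actual training routines; the point is the data-selection logic of each stage.

```python
# Stage 1: unlabeled documents of all categories set the initial parameters.
# Stage 2: supervised fine-tuning restricted to one document category.

def two_stage_train(model, documents, target_category, pretrain, finetune):
    """Run the multi-stage training flow described in the abstract."""
    # Initial stage: learn token relationships from all unlabeled documents.
    model = pretrain(model, documents)
    # Second stage: labeled key-value data, excluding other categories.
    labeled = [d for d in documents if d["category"] == target_category]
    model = finetune(model, labeled)
    return model
```

The category filter is what specializes the general token-relationship model to a particular document type such as invoices.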
-
Publication number: US11989964B2
Publication date: 2024-05-21
Application number: US17524157
Application date: 2021-11-11
Applicant: Oracle International Corporation
Inventor: Amit Agarwal , Kulbhushan Pachauri , Iman Zadeh , Jun Qian
CPC classification number: G06V30/41 , G06N20/00 , G06V30/18181
Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, a set of evaluation metrics may be generated for the model. The set of evaluation metrics may be compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
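The train/evaluate/augment loop reads naturally as the following sketch. `train`, `evaluate`, and `augment` are hypothetical callables supplied by the caller; the loop structure is the part taken from the abstract.

```python
def train_until_deployable(model, graphs, train, evaluate, augment,
                           thresholds, max_rounds=10):
    """Augment graph data and retrain until every metric meets its threshold."""
    for _ in range(max_rounds):
        model = train(model, graphs)
        metrics = evaluate(model)
        if all(metrics[k] >= thresholds[k] for k in thresholds):
            return model                    # metrics exceed deployment thresholds
        graphs = graphs + augment(graphs)   # derive new graphs from existing ones
    return model
```

Each round trains on both the initial and the newly generated graphs, mirroring the first-and-second-graph-data-structure training in the abstract.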
-
Publication number: US20230316792A1
Publication date: 2023-10-05
Application number: US17692844
Application date: 2022-03-11
Applicant: Oracle International Corporation
Inventor: Yazhe Hu , Yuying Wang , Liyu Gong , Iman Zadeh , Jun Qian
CPC classification number: G06V30/19147 , G06N20/00 , G06V30/1916
Abstract: Techniques are described for automatically, and substantially without human intervention, generating training data where the training data includes a set of training images containing text content and associated label data. Both the training images and the associated label data are automatically generated. The label data that is automatically generated for a training image includes one or more labels identifying locations of one or more text portions within the training image, and for each text portion, a label indicative of the text content in the text portion. By automating both the generation of training images and the generation of associated label data, the techniques described herein are very scalable and repeatable and can be used to generate large amounts of training data, which in turn enables building more reliable and accurate language models.
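Why the labels come for free can be shown with a minimal sketch: when the generator itself places each text line, the location and content labels fall out of the placement, with no human annotation. The layout parameters here are illustrative assumptions, and an actual image renderer is omitted.

```python
def generate_sample(corpus, width=800, line_height=20):
    """Place each text line at a known y-offset and record its label data."""
    labels = []
    y = 0
    for text in corpus:
        box = (0, y, width, y + line_height)        # location label: known by construction
        labels.append({"box": box, "text": text})   # content label: the rendered text itself
        y += line_height
    return labels
```

Because both the image and its label data are produced by the same controlled process, the pipeline is repeatable at scale, which is the scalability claim the abstract makes.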
-