-
Publication number: US12106595B2
Publication date: 2024-10-01
Application number: US18379091
Application date: 2023-10-11
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Kulbhushan Pachauri
IPC: G06V30/414, G06V30/19
CPC classification number: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
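Illustrative sketch (not taken from the patent): the loop described in this abstract, reduced to plain Python. The cross-referencing rule used here (accept a pseudo-label only where both sources agree, otherwise keep the previous label) is an assumption, as are the names `cross_reference`, `label_change`, `self_train`, and the callbacks passed in. The abstract does not specify any of them.

```python
def cross_reference(model_labels, graph_labels, previous):
    """Assumed rule: take a pseudo-label only where both sources agree."""
    return {
        node: model_labels[node] if model_labels[node] == graph_labels[node] else previous[node]
        for node in previous
    }


def label_change(old, new):
    """Fraction of nodes whose label differs between two rounds."""
    return sum(old[node] != new[node] for node in old) / len(old)


def self_train(initial, model_labeler, graph_labeler, train, threshold=0.05, max_rounds=20):
    """Retrain on updated labels until the label change drops below the threshold."""
    labels = dict(initial)
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        updated = cross_reference(model_labeler(labels), graph_labeler(labels), labels)
        train(updated)  # hypothetical training step on the updated graphs
        change = label_change(labels, updated)
        labels = updated
        if change < threshold:
            break
    return labels, rounds
```

With fixed toy labelers the loop stops on the second round, because the first round changes two of three labels and the second changes none.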
-
Publication number: US20240289551A1
Publication date: 2024-08-29
Application number: US18240480
Application date: 2023-08-31
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Srikant Panda, Deepak Karmakar, Kulbhushan Pachauri
IPC: G06F40/284
CPC classification number: G06F40/284
Abstract: In some implementations, techniques described herein may include identifying text in a visually rich document and determining a sequence for the identified text. The techniques may include selecting a language model based at least in part on the identified text and the determined sequence. Moreover, the techniques may include assigning each word of the identified text to a respective token to generate textual features corresponding to the identified text. The techniques may include extracting visual features corresponding to the identified text. The techniques may include determining positional features for each word of the identified text. The techniques may include generating a graph representing the visually rich document, each node in the graph representing each of the visual features, textual features, and positional features of a respective word of the identified text. The techniques may include training a classifier on the graph to classify each respective word of the identified text.
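Illustrative sketch (not taken from the patent): one way to assemble a per-word node from textual, visual, and positional features. The input layout (`text`, `bbox`, `visual` keys), the vocabulary-based token ids, and the k-nearest-neighbour edges are all assumptions made for this example. The abstract does not say how edges are formed or which tokenizer is used.

```python
from math import dist


def build_document_graph(words, page_width, page_height, k=2):
    """Return (node_features, edges) for a list of OCR'd words."""
    vocab = {}
    nodes, centers = [], []
    for word in words:
        # Textual feature: a token id from a toy vocabulary built on the fly.
        token_id = vocab.setdefault(word["text"].lower(), len(vocab))
        x0, y0, x1, y1 = word["bbox"]
        # Positional features: bounding box normalised by the page size.
        positional = [x0 / page_width, y0 / page_height, x1 / page_width, y1 / page_height]
        # Node vector = textual + visual + positional features.
        nodes.append([float(token_id)] + list(word["visual"]) + positional)
        centers.append(((x0 + x1) / 2, (y0 + y1) / 2))

    # Assumed edge rule: connect each word to its k nearest neighbours.
    edges = set()
    for i, center in enumerate(centers):
        others = [j for j in range(len(centers)) if j != i]
        for j in sorted(others, key=lambda j: dist(center, centers[j]))[:k]:
            edges.add((min(i, j), max(i, j)))
    return nodes, sorted(edges)
```

A node classifier would then be trained on these node vectors and edges. That step is omitted here.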
-
Publication number: US20250036962A1
Publication date: 2025-01-30
Application number: US18406905
Application date: 2024-01-08
Applicant: Oracle International Corporation
Inventor: Edwin Thomas, Amit Agarwal, Sandeep Jana, Kulbhushan Pachauri
IPC: G06N3/098
Abstract: Techniques are described herein for generating a block extender model. An example method includes a system accessing a base model trained to identify a base class. The system can access an extender comprising block extenders, the extender being associated with an extender class distinct from the base class. The system can connect the extender with the base model to generate an augmented model. The system can input training data to the augmented model, where the training data is provided to the base model and the extender and comprises data points associated with the extender class. The system can train the extender to identify the extender class based at least in part on the training data and a signal received from the base model. The system can generate a trained extender based at least in part on the training, the extender trained to identify an object associated with the extender class.
-
Publication number: US11989964B2
Publication date: 2024-05-21
Application number: US17524157
Application date: 2021-11-11
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Kulbhushan Pachauri, Iman Zadeh, Jun Qian
CPC classification number: G06V30/41, G06N20/00, G06V30/18181
Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, the following steps may be repeated. A set of evaluation metrics may be generated for the model. The set of evaluation metrics may be compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
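Illustrative sketch (not taken from the patent): the train, evaluate, and augment loop from this abstract as a small Python function. The callbacks `train`, `evaluate`, and `augment`, the metric names, and the `max_rounds` safety cap are hypothetical and stand in for steps the abstract leaves unspecified.

```python
def train_until_deployable(initial_graphs, train, evaluate, augment, thresholds, max_rounds=10):
    """Retrain on a growing graph set until every metric exceeds its threshold."""
    graphs = list(initial_graphs)  # first graph data structure
    model, metrics, rounds = None, {}, 0
    for rounds in range(1, max_rounds + 1):
        model = train(graphs)
        metrics = evaluate(model)
        if all(metrics[name] > limit for name, limit in thresholds.items()):
            break  # ready to deploy
        # Below threshold: derive new graphs from the initial ones (second
        # graph data structure) and train on both sets in the next round.
        graphs = graphs + augment(initial_graphs)
    return model, metrics, rounds
```

The cap on rounds is a design choice for the sketch, so that a model that never reaches its thresholds does not loop forever.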
-
Publication number: US11823478B2
Publication date: 2023-11-21
Application number: US17714806
Application date: 2022-04-06
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Kulbhushan Pachauri
IPC: G06V30/414, G06V30/19
CPC classification number: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
-
Publication number: US20240144081A1
Publication date: 2024-05-02
Application number: US18051419
Application date: 2022-10-31
Applicant: Oracle International Corporation
Inventor: Sandeep Jana, Edwin Thomas, Kulbhushan Pachauri
IPC: G06N20/00, G06V10/774
CPC classification number: G06N20/00, G06V10/774
Abstract: Continual learning techniques are described for extending the capabilities of a base model, which is trained to predict a set of existing or base classes, to generate a target model that is capable of making predictions for both the existing or base classes and additionally for making predictions for new or custom classes. The techniques described herein enable the target model to be trained such that the model can make predictions involving both base classes and custom classes with high levels of accuracy.
-
Publication number: US20240005640A1
Publication date: 2024-01-04
Application number: US17994712
Application date: 2022-11-28
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Srikant Panda, Kulbhushan Pachauri
IPC: G06V10/774, G06V30/414, G06V30/413
CPC classification number: G06V10/774, G06V30/413, G06V30/414
Abstract: Embodiments described herein are directed towards a synthetic document generation pipeline for training artificial intelligence models. One embodiment includes a method in which a device receives an instruction to generate a document to be used as a training instance for a first machine learning model, the instruction including an element configuration, a document class configuration, a format configuration, an augmentation configuration, and data bias and fairness. The device can receive an element from an interface based at least in part on the element configuration; the element can simulate a real-world image, real-world text, or real-world machine-readable visual code. The device can generate metadata describing a layout for the element on the document based on the document class configuration. The device can generate the document by arranging the element on the document based on the metadata, wherein the document is generated in a format based on the format configuration.
-
Publication number: US20250078556A1
Publication date: 2025-03-06
Application number: US18240344
Application date: 2023-08-30
Applicant: Oracle International Corporation
Inventor: Srikant Panda, Amit Agarwal, Kulbhushan Pachauri
IPC: G06V30/412, G06F40/169
Abstract: A method may include detecting one or more selection boxes and one or more text lines in a primary document. The method may include determining respective vectors associated with a selection box and the text lines adjacent to the selection box in a plurality of directions. The method may include determining a set of respective vectors associated with a unique selection box. The method may include determining a variance between respective vectors in the set of respective vectors and identifying a particular direction corresponding to a minimal variance between the respective vectors in the set of respective vectors as compared to a variance of other sets of respective vectors. The method may include generating a key-value pair based on the set of respective vectors characterized by the minimal variance. The method may include generating a document model, including the key-value pair, and extracting data according to the document model.
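Illustrative sketch (not taken from the patent): choosing the direction in which selection boxes sit most consistently relative to their nearest text lines, then pairing each box with that line. The coordinate convention (centre points, y growing downward), the four-direction split, and the variance measure (sum of the population variances of dx and dy) are assumptions. The function names are hypothetical.

```python
from statistics import pvariance

DIRECTIONS = ("left", "right", "above", "below")


def in_direction(dx, dy, direction):
    """True when offset (dx, dy) points mainly in the named direction."""
    if direction == "right":
        return dx > 0 and abs(dx) >= abs(dy)
    if direction == "left":
        return dx < 0 and abs(dx) >= abs(dy)
    if direction == "below":
        return dy > 0 and abs(dy) > abs(dx)
    return dy < 0 and abs(dy) > abs(dx)  # "above"


def nearest_line(box, lines, direction):
    """Closest text line in one direction as (text, dx, dy), or None."""
    bx, by = box
    hits = [(text, x - bx, y - by) for text, x, y in lines
            if in_direction(x - bx, y - by, direction)]
    return min(hits, key=lambda h: h[1] ** 2 + h[2] ** 2) if hits else None


def pair_boxes_with_lines(boxes, lines):
    """Pick the direction with minimal offset variance and pair boxes with lines."""
    best = None
    for direction in DIRECTIONS:
        matches = [nearest_line(box, lines, direction) for box in boxes]
        if any(match is None for match in matches):
            continue  # some box has no text line in this direction
        spread = pvariance([m[1] for m in matches]) + pvariance([m[2] for m in matches])
        if best is None or spread < best[1]:
            best = (direction, spread, matches)
    if best is None:
        return None, []
    direction, _, matches = best
    return direction, [(match[0], box) for match, box in zip(matches, boxes)]
```

For a column of checkboxes with labels printed to their right, every box-to-label offset is identical, so "right" has zero variance and is selected.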
-
Publication number: US20250078555A1
Publication date: 2025-03-06
Application number: US18240343
Application date: 2023-08-30
Applicant: Oracle International Corporation
Inventor: Amit Agarwal, Srikant Panda, Kulbhushan Pachauri
IPC: G06V30/412, G06V30/19, G06V30/413
Abstract: A method may include receiving a primary document including one or more selection boxes, one or more text lines, and one or more annotations. The method may include determining a class based on the annotations. The method may include identifying the one or more selection boxes and one or more text lines of the primary document. The method may include generating a graph representing the one or more selection boxes and the one or more text lines. The method may include mapping each of the one or more selection boxes to a respective text line of the one or more text lines of the graph based at least in part on one or more characteristics associated with the selection boxes. The method may include generating a key-value pair associated with each of the one or more text lines and generating a document model of the primary document.
-
Publication number: US20250014374A1
Publication date: 2025-01-09
Application number: US18347983
Application date: 2023-07-06
Applicant: Oracle International Corporation
Inventor: Srikant Panda, Amit Agarwal, Gouttham Nambirajan, Kulbhushan Pachauri
IPC: G06V30/19, G06F40/169, G06F40/247, G06V30/413
Abstract: Techniques are described for extracting information from unstructured documents that enable an ML model to be trained such that the model can accurately distinguish in-distribution (“in-D”) elements and out-of-distribution (“OO-D”) elements within an unstructured document. Novel training techniques are used that train an ML model using a combination of a regular training dataset and an enhanced augmented training dataset. The regular training dataset is used to train an ML model to identify in-D elements, i.e., to classify an element extracted from a document as belonging to one of the in-D classes contained in the regular training dataset. The augmented training dataset, which is generated based upon the regular training dataset, may contain one or more augmented elements which are used to train the model to identify OO-D elements, i.e., to classify an augmented element extracted from a document as belonging to an OO-D class instead of to an in-D class.