Pseudo labelling for key-value extraction from documents

    Publication number: US12106595B2

    Publication date: 2024-10-01

    Application number: US18379091

    Application date: 2023-10-11

    CPC classification number: G06V30/414 G06V30/19147 G06V30/19173 G06V30/19187

    Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
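
    The cross-referencing loop in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the agreement rule in `cross_reference`, the change metric, and the `predict`, `match`, and `train` callables are all hypothetical stand-ins for details the abstract does not give.

```python
from typing import Callable, Dict, Iterable, List, Optional

Labels = Dict[str, Optional[str]]  # node id -> pseudo-label (None = unlabeled)


def cross_reference(model_labels: Labels, graph_labels: Labels) -> Labels:
    """Keep a pseudo-label only where both sources agree (an assumed rule)."""
    return {
        node: label if label is not None and graph_labels.get(node) == label else None
        for node, label in model_labels.items()
    }


def label_change(old: Labels, new: Labels) -> float:
    """Fraction of nodes whose label differs between two rounds."""
    if not new:
        return 0.0
    return sum(old.get(node) != label for node, label in new.items()) / len(new)


def pseudo_label_until_stable(
    nodes: Iterable[str],
    predict: Callable[[List[str]], Labels],  # hypothetical pretrained-model labeller
    match: Callable[[List[str]], Labels],    # hypothetical graph-matching labeller
    train: Callable[[Labels], None],         # hypothetical training pass
    threshold: float = 0.05,
    max_rounds: int = 10,
) -> Labels:
    """Relabel and retrain until the share of changed labels drops below threshold."""
    node_list = list(nodes)
    labels: Labels = {node: None for node in node_list}
    for _ in range(max_rounds):
        updated = cross_reference(predict(node_list), match(node_list))
        change = label_change(labels, updated)
        labels = updated
        if change < threshold:
            break
        train(labels)
    return labels
```

    Keeping a label only when both sources agree is a conservative choice; a real system might instead weight the two sources by confidence.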

    DOMAIN ADAPTING GRAPH NETWORKS FOR VISUALLY RICH DOCUMENTS

    Publication number: US20240289551A1

    Publication date: 2024-08-29

    Application number: US18240480

    Application date: 2023-08-31

    CPC classification number: G06F40/284

    Abstract: In some implementations, techniques described herein may include identifying text in a visually rich document and determining a sequence for the identified text. The techniques may include selecting a language model based at least in part on the identified text and the determined sequence. Moreover, the techniques may include assigning each word of the identified text to a respective token to generate textual features corresponding to the identified text. The techniques may include extracting visual features corresponding to the identified text. The techniques may include determining positional features for each word of the identified text. The techniques may include generating a graph representing the visually rich document, each node in the graph representing each of the visual features, textual features, and positional features of a respective word of the identified text. The techniques may include training a classifier on the graph to classify each respective word of the identified text.

    MODEL AUGMENTATION FRAMEWORK FOR DOMAIN ASSISTED CONTINUAL LEARNING IN DEEP LEARNING

    Publication number: US20250036962A1

    Publication date: 2025-01-30

    Application number: US18406905

    Application date: 2024-01-08

    Abstract: Techniques are described herein for generating a block extender model. An example method includes a system accessing a base model trained for identifying a base class. The system can access an extender comprising block extenders, the extender class distinct from the base class. The system can connect the extender with the base model to generate an augmented model. The system can input training data to the augmented model, where the training data is provided to the base model and the extender and comprises data points associated with the extender class. The system can train the extender model to identify the extender class based at least in part on the training data and the signal received from the base machine learning model. The system can generate a trained extender based at least in part on the training, the extender trained to identify an object associated with the extender class.

    Techniques for graph data structure augmentation

    Publication number: US11989964B2

    Publication date: 2024-05-21

    Application number: US17524157

    Application date: 2021-11-11

    CPC classification number: G06V30/41 G06N20/00 G06V30/18181

    Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, the following may be repeated. A set of evaluation metrics may be generated for the model. The set of evaluation metrics may be compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
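
    The evaluate-and-augment loop described above might look like the sketch below. The `train`, `evaluate`, and `augment` callables, the metric names, and the `max_rounds` safety cap are assumptions made for illustration; the abstract does not say how new graphs are derived from the initial ones.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


def train_until_deployable(
    initial_graphs: Sequence[Any],
    train: Callable[[List[Any]], Any],              # hypothetical: fits a model on graphs
    evaluate: Callable[[Any], Dict[str, float]],    # hypothetical: returns named metrics
    augment: Callable[[List[Any]], List[Any]],      # hypothetical: derives new graphs
    thresholds: Dict[str, float],
    max_rounds: int = 5,                            # assumed cap so the loop always ends
) -> Tuple[Any, List[Any]]:
    """Retrain on initial plus augmented graphs until every metric exceeds its threshold."""
    initial = list(initial_graphs)
    graphs = list(initial)
    model = train(graphs)
    for _ in range(max_rounds):
        metrics = evaluate(model)
        if all(metrics[name] > limit for name, limit in thresholds.items()):
            break
        # New graphs are generated from the initial graphs, then both sets are used.
        graphs = graphs + augment(initial)
        model = train(graphs)
    return model, graphs
```

    With stub callables, for example `train=len` and an `evaluate` that reports the model value as its score, the loop adds one batch of augmented graphs per round until the score passes the threshold.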

    Pseudo labelling for key-value extraction from documents

    Publication number: US11823478B2

    Publication date: 2023-11-21

    Application number: US17714806

    Application date: 2022-04-06

    CPC classification number: G06V30/414 G06V30/19147 G06V30/19173 G06V30/19187

    Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.

    CONTINUAL LEARNING TECHNIQUES FOR TRAINING MODELS

    Publication number: US20240144081A1

    Publication date: 2024-05-02

    Application number: US18051419

    Application date: 2022-10-31

    CPC classification number: G06N20/00 G06V10/774

    Abstract: Continual learning techniques are described for extending the capabilities of a base model, which is trained to predict a set of existing or base classes, to generate a target model that is capable of making predictions for both the existing or base classes and additionally for making predictions for new or custom classes. The techniques described herein enable the target model to be trained such that the model can make predictions involving both base classes and custom classes with high levels of accuracy.

    SYNTHETIC DOCUMENT GENERATION PIPELINE FOR TRAINING ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240005640A1

    Publication date: 2024-01-04

    Application number: US17994712

    Application date: 2022-11-28

    CPC classification number: G06V10/774 G06V30/413 G06V30/414

    Abstract: Embodiments described herein are directed towards a synthetic document generation pipeline for training artificial intelligence models. One embodiment includes a method including a device that receives an instruction to generate a document to be used as a training instance for a first machine learning model, the instruction including an element configuration, a document class configuration, a format configuration, an augmentation configuration, and data bias and fairness. The device can receive an element from an interface based at least in part on the element configuration; the element can simulate a real-world image, real-world text, or real-world machine-readable visual code. The device can generate metadata describing a layout for the element on the document based on the document class configuration. The device can generate the document by arranging the element on the document based on the metadata, wherein the document is generated in a format based on the format configuration.

    TECHNIQUES OF INFORMATION EXTRACTION FOR SELECTION MARKS

    Publication number: US20250078556A1

    Publication date: 2025-03-06

    Application number: US18240344

    Application date: 2023-08-30

    Abstract: A method may include detecting one or more selection boxes and one or more text lines in a primary document. The method may include determining respective vectors associated with the selection box and adjacent text lines to the selection box in a plurality of directions. The method may include determining a set of respective vectors associated with a unique selection box. The method may include determining a variance between respective vectors in the set of respective vectors and identifying a particular direction corresponding to a minimal variance between the respective vectors in the set of respective vectors as compared to a variance of other sets of respective vectors. The method may include generating a key-value pair based on the set of respective vectors characterized by the minimal variance. The method may include generating a document model, including the key-value pair, and extracting data according to the document model.
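
    One possible reading of the minimal-variance step is sketched below. It assumes that boxes and text lines are reduced to center points, that the candidate directions are the four axis directions, and that the vectors are grouped per direction across all boxes; none of these details is stated in the abstract, and all function names are hypothetical.

```python
from statistics import pvariance
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
DIRECTIONS: Dict[str, Point] = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def nearest_offset(box: Point, lines: Dict[str, Point], direction: Point) -> Optional[Tuple[str, Point]]:
    """Return (text, offset vector) of the closest text line lying in `direction`."""
    dx, dy = direction
    best = None
    for text, (x, y) in lines.items():
        vx, vy = x - box[0], y - box[1]
        along = vx * dx + vy * dy          # component along the direction
        across = abs(vx * dy - vy * dx)    # component perpendicular to it
        if along <= across:                # not clearly on this side of the box
            continue
        dist = vx * vx + vy * vy
        if best is None or dist < best[0]:
            best = (dist, text, (vx, vy))
    return None if best is None else (best[1], best[2])


def pair_boxes_with_labels(boxes: List[Point], lines: Dict[str, Point]):
    """Pick the direction whose box-to-text vectors vary least, then pair along it."""
    best_name, best_score, best_hits = None, None, None
    for name, direction in DIRECTIONS.items():
        hits = [nearest_offset(box, lines, direction) for box in boxes]
        if not hits or any(hit is None for hit in hits):
            continue  # some box has no text line in this direction
        score = pvariance([h[1][0] for h in hits]) + pvariance([h[1][1] for h in hits])
        if best_score is None or score < best_score:
            best_name, best_score, best_hits = name, score, hits
    if best_name is None:
        return None, {}
    # Key: label text; value: the selection box it belongs to.
    return best_name, {hit[0]: box for box, hit in zip(boxes, best_hits)}
```

    The intuition is that labels placed consistently to one side of their checkboxes produce nearly identical offset vectors, so that side shows the lowest variance.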

    TECHNIQUES OF INFORMATION EXTRACTION FOR SELECTION MARKS

    Publication number: US20250078555A1

    Publication date: 2025-03-06

    Application number: US18240343

    Application date: 2023-08-30

    Abstract: A method may include receiving a primary document including one or more selection boxes, one or more text lines, and one or more annotations. The method may include determining a class based on the annotations. The method may include identifying the one or more selection boxes and one or more text lines of the primary document. The method may include generating a graph representing the one or more selection boxes and the one or more text lines. The method may include mapping each of the one or more selection boxes to a respective text line of the one or more text lines of the graph based at least in part on one or more characteristics associated with the selection boxes. The method may include generating a key-value pair associated with each of the one or more text lines and generating a document model of the primary document.

    OUT OF DISTRIBUTION ELEMENT DETECTION FOR INFORMATION EXTRACTION

    Publication number: US20250014374A1

    Publication date: 2025-01-09

    Application number: US18347983

    Application date: 2023-07-06

    Abstract: Techniques are described for extracting information from unstructured documents that enable an ML model to be trained such that the model can accurately distinguish in-distribution (“in-D”) elements and out-of-distribution (“OO-D”) elements within an unstructured document. Novel training techniques are used that train an ML model using a combination of a regular training dataset and an enhanced augmented training dataset. The regular training dataset is used to train an ML model to identify in-D elements, i.e., to classify an element extracted from a document as belonging to one of the in-D classes contained in the regular training dataset. The augmented training dataset, which is generated based upon the regular training dataset, may contain one or more augmented elements which are used to train the model to identify OO-D elements, i.e., to classify an augmented element extracted from a document as belonging to an OO-D class instead of to an in-D class.
