AUTOMATED TRANSFORMATION OF INFORMATION FROM IMAGES TO TEXTUAL REPRESENTATIONS, AND APPLICATIONS THEREFOR

    Publication No.: US20240362197A1

    Publication Date: 2024-10-31

    Application No.: US18763909

    Filing Date: 2024-07-03

    Abstract: Recent developments in machine learning (commonly coined “artificial intelligence” or “AI”) have vastly expanded applications for this technology, such as myriad “chat” agents adept at understanding natural human language. While state of the art generative models can parse text queries from a user and provide comprehensive, accurate responses (including generating images depicting desired content), current implementations struggle with understanding all information present in images of documents, especially images of business documents. In particular, generative models fail to understand structured and semi-structured information, e.g., as indicated by graphical information such as lines, geometric relationships (e.g., indicated by tables, graphs, figures, etc.), formatting, and other contextual information that human readers easily and implicitly understand. The disclosed inventive concepts transform structured and semi-structured information along with textual content into a textual representation that allows generative models to better understand textual content and non-textual structured information present in document images.
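
The abstract above does not say how the textual representation is built. As a rough illustration only, the sketch below assumes OCR has already produced text boxes with top-left pixel coordinates, groups them into rows by vertical position, and emits pipe-delimited lines so that tabular structure survives as plain text. The function name `boxes_to_text`, the `(text, x, y)` input shape, and the `row_tolerance` parameter are all hypothetical and do not come from the patent.

```python
def boxes_to_text(boxes, row_tolerance=5):
    """Serialize OCR boxes into pipe-delimited rows (hypothetical sketch).

    boxes: list of (text, x, y) tuples, with (x, y) the top-left pixel position.
    Boxes whose y lies within row_tolerance of a row's first box join that row.
    """
    rows = []  # each entry: (y of the row's first box, [(x, text), ...])
    for text, x, y in sorted(boxes, key=lambda b: (b[2], b[1])):
        if rows and abs(rows[-1][0] - y) <= row_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Order the cells of each row left to right and join them with a delimiter.
    return "\n".join(
        " | ".join(text for _, text in sorted(cells)) for _, cells in rows
    )
```

A real system would also need to handle ruled lines, merged cells, and reading order across columns, none of which this sketch attempts.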

    Form structure similarity detection

    Publication No.: US12124497B1

    Publication Date: 2024-10-22

    Application No.: US18190686

    Filing Date: 2023-03-27

    Applicant: Adobe Inc.

    Abstract: Form structure similarity detection techniques are described. A content processing system, for instance, receives a query snippet that depicts a query form structure. The content processing system generates a query layout string that includes semantic indicators to represent the query form structure and generates candidate layout strings that represent form structures from a target document. The content processing system calculates similarity scores between the query layout string and the candidate layout strings. Based on the similarity scores, the content processing system generates a target snippet for display that depicts a form structure that is structurally similar to the query form structure. The content processing system is further operable to generate a training dataset that includes image pairs of snippets depicting form structures that are structurally similar. The content processing system utilizes the training dataset to train a machine learning model to perform form structure similarity matching.
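
The layout-string comparison described in this abstract can be pictured with a small sketch. The abstract names neither the semantic indicators nor the scoring function, so both are assumptions here: one character stands for each element type, and Python's `difflib.SequenceMatcher` stands in for whatever similarity score the patent actually uses. The names `INDICATORS`, `layout_string`, `similarity`, and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical semantic indicators: one character per form element type.
INDICATORS = {"label": "L", "text_field": "T", "checkbox": "C", "row_break": "|"}


def layout_string(elements):
    """Encode a sequence of element type names as a compact layout string."""
    return "".join(INDICATORS[e] for e in elements)


def similarity(query, candidate):
    """Score two layout strings in [0, 1] (assumed metric, not the patent's)."""
    return SequenceMatcher(None, query, candidate).ratio()


def best_match(query_elements, candidates):
    """Return (name, score) for the candidate most similar to the query."""
    q = layout_string(query_elements)
    scores = {
        name: similarity(q, layout_string(els)) for name, els in candidates.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]
```

Calling `best_match` with a query such as `["label", "text_field", "row_break", "label", "checkbox"]` and a dictionary of candidate element sequences returns the name of the closest candidate along with its score.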

    DATA EXTRACTION FROM FORM IMAGES
    Published patent application

    Publication No.: US20240296690A1

    Publication Date: 2024-09-05

    Application No.: US18664807

    Filing Date: 2024-05-15

    Applicant: ZenPayroll, Inc.

    Abstract: An image processing system accesses an image of a completed form document. The image of the form document includes one or more features, such as form text, at particular locations within the image. The image processing system accesses a template of the form document and computes a rotation and zoom of the image of the form document relative to the template of the form document based on the locations of the features within the image of the form document relative to the locations of the corresponding features within the template of the form document. The image processing system performs a rotation operation and a zoom operation on the image of the form document, and extracts data entered into fields of the modified image of the form document. The extracted data can be then accessed or stored for subsequent use.
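
One way to compute a rotation and zoom from matched feature locations is a least-squares similarity fit. The sketch below is an illustration under that assumption, not the method claimed in the patent: it treats each point as a complex number and fits image ≈ a · template + b, so that the angle of a gives the rotation and its magnitude gives the zoom. The function name and the input format (lists of matched `(x, y)` pairs) are hypothetical.

```python
import cmath
import math


def estimate_rotation_and_zoom(template_pts, image_pts):
    """Fit image ~= a * template + b over matched (x, y) points.

    Returns (rotation_degrees, zoom), where a = zoom * exp(i * rotation).
    The rotation sign follows the coordinate system of the input points.
    """
    if len(template_pts) != len(image_pts) or len(template_pts) < 2:
        raise ValueError("need at least two matched point pairs")
    p = [complex(x, y) for x, y in template_pts]
    q = [complex(x, y) for x, y in image_pts]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    # Closed-form least-squares solution for the complex scale factor a.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("template points must not all coincide")
    a = num / den
    return math.degrees(cmath.phase(a)), abs(a)
```

To align the image with the template, one would then rotate it by the negative of the estimated angle and scale it by the reciprocal of the estimated zoom before reading the field regions.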

    CUSTOMIZABLE DATA EXTRACTION SERVICE
    Published patent application

    Publication No.: US20240249543A1

    Publication Date: 2024-07-25

    Application No.: US18100334

    Filing Date: 2023-01-23

    Applicant: SAP SE

    Abstract: Example methods and systems are directed to data extraction from input data objects. A data extraction schema may define at least a first feature and a second feature. The first feature may be associated with a first machine learning model and the second feature may be associated with a second machine learning model. A first input data object is accessed. The first machine learning model may be used to extract a first output data object associated with the first feature from the first input data object, based on the data extraction schema. The second machine learning model may be used to extract a second output data object associated with the second feature from the first input data object, based on the data extraction schema. The first output data object and the second output data object may be presented on a user interface.
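
The schema-driven dispatch in this abstract, where each feature is tied to its own model, can be sketched as a list of feature specifications that each carry a callable. Everything named below is hypothetical: `FeatureSpec`, `extract`, and the two regex-based functions, which merely stand in for machine learning models so the example runs without external dependencies.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Stand-in type for a model: takes the input text, returns a value or None.
Extractor = Callable[[str], Optional[str]]


@dataclass
class FeatureSpec:
    name: str         # feature defined by the extraction schema
    model: Extractor  # model associated with that feature


def extract(schema: List[FeatureSpec], document: str) -> Dict[str, Optional[str]]:
    """Run each feature's model on the same input and collect the outputs."""
    return {spec.name: spec.model(document) for spec in schema}


def invoice_number_model(text: str) -> Optional[str]:
    """Placeholder for a first model (a regex, not a trained model)."""
    m = re.search(r"Invoice\s*#?\s*(\w+)", text)
    return m.group(1) if m else None


def total_model(text: str) -> Optional[str]:
    """Placeholder for a second model (a regex, not a trained model)."""
    m = re.search(r"Total:?\s*\$?([\d.,]+)", text)
    return m.group(1) if m else None


SCHEMA = [
    FeatureSpec("invoice_number", invoice_number_model),
    FeatureSpec("total", total_model),
]
```

Keeping the feature-to-model mapping in data rather than in code is what makes such a service customizable: adding a feature means adding one schema entry, with no change to `extract`.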