-
1.
Publication No.: US20240362197A1
Publication date: 2024-10-31
Application No.: US18763909
Filing date: 2024-07-03
Inventors: Steve Thompson, Veronika Levdik, Iurii Vymenets, Donghan Lee
IPC classification: G06F16/22, G06V10/70, G06V30/412, G06V30/413, G06V30/414
CPC classification: G06F16/2282, G06V10/70, G06V30/412, G06V30/413, G06V30/414
Abstract: Recent developments in machine learning (commonly coined “artificial intelligence” or “AI”) have vastly expanded applications for this technology, such as myriad “chat” agents adept at understanding natural human language. While state-of-the-art generative models can parse text queries from a user and provide comprehensive, accurate responses (including generating images depicting desired content), current implementations struggle with understanding all information present in images of documents, especially images of business documents. In particular, generative models fail to understand structured and semi-structured information, e.g., as indicated by graphical information such as lines, geometric relationships (e.g., indicated by tables, graphs, figures, etc.), formatting, and other contextual information that human readers easily and implicitly understand. The disclosed inventive concepts transform structured and semi-structured information along with textual content into a textual representation that allows generative models to better understand textual content and non-textual structured information present in document images.
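The abstract does not say what the textual representation looks like. As a minimal sketch of the general idea only, the following assumes table cells have already been recognized and assigned row and column indices, and serializes them as pipe-delimited rows that a text-only model can read. The function name `table_to_text` and the `(row, col, text)` cell format are hypothetical and not taken from the patent.

```python
def table_to_text(cells):
    """Serialize recognized table cells into pipe-delimited text rows.

    cells: list of (row, col, text) tuples with zero-based indices
    (a hypothetical input format, assumed for illustration).
    """
    if not cells:
        return ""
    n_rows = max(r for r, _, _ in cells) + 1
    n_cols = max(c for _, c, _ in cells) + 1
    # Start from an empty grid so missing cells become empty strings.
    grid = [[""] * n_cols for _ in range(n_rows)]
    for r, c, text in cells:
        grid[r][c] = text.strip()
    return "\n".join("| " + " | ".join(row) + " |" for row in grid)


if __name__ == "__main__":
    cells = [(0, 0, "Item"), (0, 1, "Qty"), (1, 0, "Widget"), (1, 1, "3")]
    print(table_to_text(cells))
```

Keeping the row and column structure explicit in the text is one way to preserve the geometric relationships that would otherwise be lost when only the raw characters are extracted.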
-
2.
Publication No.: US12124497B1
Publication date: 2024-10-22
Application No.: US18190686
Filing date: 2023-03-27
Applicant: Adobe Inc.
Inventors: Abhinav Java, Surgan Jandial, Shripad Vilasrao Deshmukh, Milan Aggarwal, Mausoom Sarkar, Balaji Krishnamurthy, Arneh Jain
IPC classification: G06F16/383, G06F16/332, G06V30/19, G06V30/412
CPC classification: G06F16/383, G06F16/332, G06V30/19147, G06V30/412
Abstract: Form structure similarity detection techniques are described. A content processing system, for instance, receives a query snippet that depicts a query form structure. The content processing system generates a query layout string that includes semantic indicators to represent the query form structure and generates candidate layout strings that represent form structures from a target document. The content processing system calculates similarity scores between the query layout string and the candidate layout strings. Based on the similarity scores, the content processing system generates a target snippet for display that depicts a form structure that is structurally similar to the query form structure. The content processing system is further operable to generate a training dataset that includes image pairs of snippets depicting form structures that are structurally similar. The content processing system utilizes the training dataset to train a machine learning model to perform form structure similarity matching.
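The abstract does not specify how layout strings are encoded or scored. The sketch below is an illustration under stated assumptions: each form structure is encoded as a string over a hypothetical alphabet (`L` for a label, `F` for a fill-in field, `C` for a checkbox, `|` for a row break), and the similarity score is the ratio returned by Python's standard `difflib.SequenceMatcher`, which stands in for whatever metric the patent actually uses.

```python
from difflib import SequenceMatcher


def layout_similarity(query, candidate):
    """Return a similarity score in [0, 1] between two layout strings."""
    # autojunk=False disables the popularity heuristic, which could
    # otherwise distort scores on long strings with repeated symbols.
    return SequenceMatcher(None, query, candidate, autojunk=False).ratio()


def best_match(query, candidates):
    """Return the candidate layout string most similar to the query."""
    return max(candidates, key=lambda c: layout_similarity(query, c))


if __name__ == "__main__":
    query = "LF|LF|CC"  # two label/field rows, then a row of two checkboxes
    candidates = ["LLLL", "LF|CC", "LF|LF|CC"]
    print(best_match(query, candidates))
```

Comparing strings rather than pixels makes the score depend on the arrangement of form elements and not on the text they contain, which matches the structural notion of similarity the abstract describes.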
-
3.
Publication No.: US12100231B2
Publication date: 2024-09-24
Application No.: US17178705
Filing date: 2021-02-18
IPC classification: G06V30/26, G06F40/174, G06T7/00, G06T11/40, G06V30/10, G06V30/412
CPC classification: G06V30/26, G06F40/174, G06T7/0002, G06V30/10, G06V30/412, G06T11/40, G06T2207/10008, G06T2207/30176
Abstract: An information processing apparatus includes a processor configured to: acquire recognized items obtained by recognizing characters in a first image showing a reading target including one or more first fields in which characters are written and a second field in which a name or a signature is written; if any one of the first fields is inadequate, create a second image having blank fields for the inadequate first field and the second field; if the second field is inadequate, create a second image having a blank field for the second field; and output the created second image.
-
4.
Publication No.: US12087070B2
Publication date: 2024-09-10
Application No.: US17454729
Filing date: 2021-11-12
Inventors: Jenna Hong, Apurva Sandeep Gandhi, Gilbert Antonius, Tra My Nguyen, Ryan Serrao, Biyi Fang, Sheng Yi
IPC classification: G06F40/284, G06V10/22, G06V30/32, G06V30/412
CPC classification: G06V30/36, G06F40/284, G06V10/22, G06V30/333, G06V30/412
Abstract: A computer system is provided that includes one or more processors configured to receive user input for inked content to a digital canvas, and process the inked content to determine one or more writing regions. Each writing region includes recognized text and one or more document layout features associated with that writing region. The one or more processors are further configured to tokenize a target writing region of the one or more writing regions into a sequence of tokens, process the sequence of tokens of the target writing region using a task extraction subsystem that operates on tokens representing both the recognized text and the one or more document layout features of the target writing region, segment the target writing region into one or more sentence segments, and classify each of the one or more sentence segments as a task sentence or a non-task sentence.
-
5.
Publication No.: US20240296690A1
Publication date: 2024-09-05
Application No.: US18664807
Filing date: 2024-05-15
Applicant: ZenPayroll, Inc.
IPC classification: G06V30/412, G06V10/24, G06V10/48, G06V30/414, G06V30/416
CPC classification: G06V30/412, G06V10/242, G06V10/48, G06V30/414, G06V30/416
Abstract: An image processing system accesses an image of a completed form document. The image of the form document includes one or more features, such as form text, at particular locations within the image. The image processing system accesses a template of the form document and computes a rotation and zoom of the image of the form document relative to the template of the form document based on the locations of the features within the image of the form document relative to the locations of the corresponding features within the template of the form document. The image processing system performs a rotation operation and a zoom operation on the image of the form document, and extracts data entered into fields of the modified image of the form document. The extracted data can then be accessed or stored for subsequent use.
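The abstract does not state how the rotation and zoom are computed from the feature locations. One standard way to do it, shown here as a sketch and not as the patented method, is a least-squares similarity fit: treat each matched point as a complex number and solve `image ≈ a * template + b`, where the angle of `a` is the rotation and its magnitude is the zoom. The function name and the point-list inputs are assumptions made for illustration.

```python
import cmath
import math


def estimate_rotation_and_zoom(template_pts, image_pts):
    """Estimate (rotation in degrees, zoom factor) of the image relative
    to the template from matched (x, y) feature locations.

    Least-squares fit of image ~ a * template + b with complex a and b.
    """
    if len(template_pts) != len(image_pts) or len(template_pts) < 2:
        raise ValueError("need at least two matched point pairs")
    p = [complex(x, y) for x, y in template_pts]
    q = [complex(x, y) for x, y in image_pts]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    # Centering both point sets removes the translation term b.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("template points must not all coincide")
    a = num / den
    return math.degrees(cmath.phase(a)), abs(a)


if __name__ == "__main__":
    template = [(0, 0), (1, 0), (0, 1)]
    # Same points rotated by 90 degrees, scaled by 2, shifted by (5, 5).
    image = [(5, 5), (5, 7), (3, 5)]
    angle, zoom = estimate_rotation_and_zoom(template, image)
    print(round(angle, 6), round(zoom, 6))
```

Applying the inverse rotation and a scale of `1 / zoom` to the image would then align it with the template so that field coordinates from the template can be reused. In pixel coordinates the y axis usually points down, so the sign of the reported angle is flipped relative to the usual counterclockwise convention.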
-
6.
Publication No.: US12080091B2
Publication date: 2024-09-03
Application No.: US18331990
Filing date: 2023-06-09
Applicant: Open Text SA ULC
IPC classification: G06F17/00, G06F16/22, G06F16/25, G06F16/93, G06F18/21, G06F40/174, G06F40/177, G06F40/186, G06F40/216, G06F40/274, G06N20/00, G06V30/19, G06V30/412, G06V30/414, G06V30/416
CPC classification: G06V30/416, G06F16/2282, G06F16/258, G06F16/93, G06F18/217, G06F40/174, G06F40/177, G06F40/186, G06F40/216, G06F40/274, G06N20/00, G06V30/1916, G06V30/412, G06V30/414
Abstract: A bipartite application implements a table auto-completion (TAC) algorithm on the client side and the server side. A client module runs a local model of the TAC algorithm on a user device and a server module runs a global model of the TAC algorithm on a server machine. The local model is continuously adapted through on-the-fly training, with as few as one negative example, to perform TAC on the client side, one document at a time. Knowledge thus learned by the local model is used to improve the global model on the server side. The global model can be utilized to automatically and intelligently extract table information from a large number of documents with significantly improved accuracy, requiring minimal human intervention even on complex tables.
-
7.
Publication No.: US12067039B1
Publication date: 2024-08-20
Application No.: US18327636
Filing date: 2023-06-01
Applicant: Instabase, Inc.
Inventors: Jessica Andersen Campos, Eric Han, Hariharan Thirugnanam, Subash Chandran Thirumaran, Timothy Serkes, Alagu Chockalingam, Varun Jain
IPC classification: G06F16/00, G06F16/332, G06F16/35, G06V30/412
CPC classification: G06F16/3328, G06F16/358, G06V30/412
Abstract: Systems and methods for providing user interfaces for configuration of a flow for extracting information from documents via a large language model are disclosed. Exemplary implementations may: present a user interface configured to obtain entry of user input from a user to select a set of exemplary documents; select one or more document classifications for the set of exemplary documents; select one or more extraction fields that correspond to individual queries; navigate between different portions of the user interface; present the set of document classifications; present a particular individual document in the user interface; present a set of extraction fields in the user interface, wherein the individual extraction fields present individual replies obtained from the large language model in reply to the individual queries; and/or perform other steps.
-
8.
Publication No.: US20240265721A1
Publication date: 2024-08-08
Application No.: US18369744
Filing date: 2023-09-18
IPC classification: G06V30/40, G06F18/24, G06V10/22, G06V10/98, G06V30/412
CPC classification: G06V30/40, G06F18/24, G06V10/225, G06V10/993, G06V30/412
Abstract: Methods for detecting digital or physical tampering of an imaged physical credential include the actions of: receiving a digital image representing a physical credential having one or more high value regions, the digital image including an array of pixels; processing the digital image with a tamper detector to generate an output corresponding to an intrinsic characteristic of the digital image, the tamper detector configured to perform a pixel-level analysis of the high value regions of the digital image with respect to a predetermined tampering signature; and determining, based on the output from the tamper detector, whether the digital image has been digitally tampered with.
-
9.
Publication No.: US20240257550A1
Publication date: 2024-08-01
Application No.: US18686233
Filing date: 2022-08-25
Applicant: Google LLC
IPC classification: G06V30/416, G06V10/44, G06V10/82, G06V30/412
CPC classification: G06V30/416, G06V10/44, G06V10/82, G06V30/412
Abstract: A method including receiving an image representing a document including a plurality of layout components, identifying textual information associated with the plurality of layout components, identifying visual information associated with the plurality of layout components, combining the textual information with the visual information, and predicting a reading order of the plurality of layout components based on the combined textual information and visual information using a self-attention encoder/decoder.
-
10.
Publication No.: US20240249543A1
Publication date: 2024-07-25
Application No.: US18100334
Filing date: 2023-01-23
Applicant: SAP SE
IPC classification: G06V30/412, G06V10/82, G06V30/19, G06V30/413
CPC classification: G06V30/412, G06V10/82, G06V30/19147, G06V30/413
Abstract: Example methods and systems are directed to data extraction from input data objects. A data extraction schema may define at least a first feature and a second feature. The first feature may be associated with a first machine learning model and the second feature may be associated with a second machine learning model. A first input data object is accessed. The first machine learning model may be used to extract a first output data object associated with the first feature from the first input data object, based on the data extraction schema. The second machine learning model may be used to extract a second output data object associated with the second feature from the first input data object, based on the data extraction schema. The first output data object and the second output data object may be presented on a user interface.