-
Publication No.: US11657631B2
Publication date: 2023-05-23
Application No.: US17243467
Filing date: 2021-04-28
Applicant: ONFIDO LTD
Inventor: Christos Sagonas , Karolina Dabkowska , Zhiyuan Shi , Edward Fieri Soler , Mohan Mahadevan , Iona Grace Vincent , Luca Peric , Alessandro Lenzi , Alvaro Fernando Lara , James Stonehill
IPC: G06T7/30 , G06T7/136 , G06T7/11 , G06T7/70 , G06K9/62 , G06T3/40 , G06V10/22 , G06V30/414 , G06V30/10
CPC classification number: G06T7/30 , G06K9/6256 , G06K9/6267 , G06T3/40 , G06T7/11 , G06T7/136 , G06T7/70 , G06V10/22 , G06V30/414 , G06T2207/20132 , G06T2207/30176 , G06V30/10
Abstract: A computer-implemented method for extracting information from a document, for example an official document, is disclosed. The method comprises acquiring an input image comprising a document portion; performing image segmentation on the input image to form a binary input image that distinguishes the document portion from the remaining portion of the input image; estimating a first image transform to align the binary input image to a binary template image; using the first image transform on the input image to form an intermediate image; estimating a second image transform to align the intermediate image to a template image; using the second image transform on the intermediate image to form an output image; and extracting a field from the output image using a predetermined field of the template image.
-
Publication No.: US20240362397A1
Publication date: 2024-10-31
Application No.: US18655781
Filing date: 2024-05-06
Applicant: iCIMS, Inc.
Inventor: Eoin O'GORMAN , Adrian MIHAI
IPC: G06F40/103 , G06F18/21 , G06F18/214 , G06F40/169 , G06F40/197 , G06N20/00 , G06Q10/1053 , G06V30/414
CPC classification number: G06F40/103 , G06F18/214 , G06F18/217 , G06F40/169 , G06F40/197 , G06N20/00 , G06Q10/1053 , G06V30/414 , G06V2201/13
Abstract: In some embodiments, a method can include generating a resume document image having a standardized format, based on a resume document having a set of paragraphs. The method can further include executing a statistical model to generate an annotated resume document image from the resume document image. The annotated resume document image can indicate a bounding box and a paragraph type, for a paragraph from a set of paragraphs of the annotated resume document image. The method can further include identifying a block of text in the resume document corresponding to the paragraph of the annotated resume document image. The method can further include extracting the block of text from the resume document and associating the paragraph type to the block of text.
-
Publication No.: US20240355134A1
Publication date: 2024-10-24
Application No.: US18761274
Filing date: 2024-07-01
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yongshuai Huang , Ning Lu , Lin Du
IPC: G06V30/412 , G06F40/143 , G06V10/98 , G06V30/414
CPC classification number: G06V30/412 , G06F40/143 , G06V10/98 , G06V30/414
Abstract: In a data processing method, a processing device obtains a to-be-processed table image, and determines a table recognition result based on the table image and a generative table recognition policy. The generative table recognition policy indicates that the table recognition result of the table image is to be determined using a markup language and a non-overlapping attribute of a bounding box. The bounding box indicates a position of a text included in a cell in a table associated with the table image, and the table recognition result indicates a global structure and content that are included in the table. The processing device then outputs the table recognition result.
-
Publication No.: US20240354517A1
Publication date: 2024-10-24
Application No.: US18640717
Filing date: 2024-04-19
Applicant: The MITRE Corporation
Inventor: Luther Karl Branting , Bradford Clement Brown , Kenneth Jeffrey Harrold , Sarah Maureen Howell , Christopher Mario Giannella , James Antony Van Guilder
IPC: G06F40/40 , G06F16/93 , G06F40/205 , G06F40/279 , G06V30/414
CPC classification number: G06F40/40 , G06F40/205 , G06F40/279 , G06V30/414 , G06F16/93
Abstract: A method for providing suggested text redactions for a document includes receiving, from a user, the document comprising text; extracting the text from the document; parsing the extracted text into a plurality of identified text sentences; inputting the plurality of identified text sentences into one or more trained artificial intelligence models that have been trained on labeled text sentences to generate a set of suggested text redactions for the plurality of identified text sentences; and providing the set of suggested text redactions for the plurality of identified text sentences to the user.
-
Publication No.: US20240320995A1
Publication date: 2024-09-26
Application No.: US18124392
Filing date: 2023-03-21
Applicant: JPMorgan Chase Bank, N.A.
Inventor: Nancy THOMAS , Daniel BORRAJO
IPC: G06V30/414 , G06F40/194 , G06F40/205
CPC classification number: G06V30/414 , G06F40/194 , G06F40/205
Abstract: A method for facilitating electronic textual representation and comparison is disclosed. The method includes receiving, via a graphical user interface, a comparison request that includes a first electronic document and a second electronic document; parsing the first electronic document and the second electronic document to classify textual data; generating, by using the classified textual data, a first tree structure for the first electronic document and a second tree structure for the second electronic document; constructing a first hierarchy dictionary for the first tree structure and a second hierarchy dictionary for the second tree structure; determining differences between the first electronic document and the second electronic document by using the first tree structure, the first hierarchy dictionary, the second tree structure, and the second hierarchy dictionary; and generating graphical representations that depict the differences and textual representations that summarize the differences.
-
Publication No.: US20240311555A1
Publication date: 2024-09-19
Application No.: US18671083
Filing date: 2024-05-22
Inventor: Alexander Gataric
IPC: G06F40/186 , G06F16/35 , G06V30/10 , G06V30/414
CPC classification number: G06F40/186 , G06F16/355 , G06V30/10 , G06V30/414
Abstract: A template generation system for generating document templates from a mixed set of document types including a template generation server programmed to receive a batch of documents, identify a plurality of text blocks, generate a plurality of clusters, generate a plurality of document arrays corresponding to the plurality of clusters, and compare each document array to each other document array to determine a percentage match. When the percentage match between two or more frameworks exceeds a threshold, the template generation system defines a subset of documents, and for each subset of documents, the template generation system generates a template for the subset of documents. The template is a collection of the text blocks that are commonly included in each of the documents of the subset.
-
Publication No.: US12081718B2
Publication date: 2024-09-03
Application No.: US17225111
Filing date: 2021-04-08
Applicant: FUJIFILM Business Innovation Corp.
Inventor: Yasunari Kishimoto
IPC: H04N1/44 , G06K9/00 , G06V30/412 , G06V30/414 , H04N1/00
CPC classification number: H04N1/4493 , G06V30/412 , G06V30/414 , H04N1/00331 , H04N1/4433
Abstract: An information processing apparatus includes a processor configured to receive document data, use a first design element forming the document data as a search key to search for a second design element in accordance with a condition of use for a design element, and perform processing on the second design element in accordance with the first design element to generate a third design element that has characteristics of the first design element.
-
Publication No.: US12067797B2
Publication date: 2024-08-20
Application No.: US17408181
Filing date: 2021-08-20
Applicant: PEPSICO, INC.
Inventor: Jingting Hui
IPC: G06V30/10 , G06F16/35 , G06F16/38 , G06F40/12 , G06F40/183 , G06V30/413 , G06V30/414
CPC classification number: G06V30/414 , G06F16/35 , G06F16/381 , G06F40/12 , G06F40/183 , G06V30/413 , G06V30/10
Abstract: A label processing engine receives, as inputs, raw data representative of a label and baseline data, detects a raw data object within the raw data, classifies the raw data object, and localizes the raw data object within the raw data, detects a baseline data object within the baseline data, classifies the baseline data object, and localizes the baseline data object within the baseline data. The engine recognizes corresponding text within the raw data object and the baseline data object and extracts the corresponding text within the raw data object and the baseline data object, reassembles the corresponding text of the raw data object and the baseline data object into respective lines of text, compares the respective lines of text with one another, and issues a notification based on the comparison.
-
Publication No.: US12056948B2
Publication date: 2024-08-06
Application No.: US17379154
Filing date: 2021-07-19
Applicant: International Business Machines Corporation
Inventor: Ang Yi , Nazrul Islam , Rajesh M. Desai , Jing Zhang , Dong Rui Li , Xue Mei Deng , Ye Chen , Hai Cheng Wang
IPC: G06K9/00 , G06F18/23 , G06K9/34 , G06K9/62 , G06V30/148 , G06V30/413 , G06V30/414 , G06V30/10
CPC classification number: G06V30/414 , G06F18/23 , G06V30/158 , G06V30/413 , G06V30/10
Abstract: In an approach, a processor identifies a plurality of text separators in a borderless table, a text separator of the plurality of text separators defining a non-text region between two consecutive text lines in the borderless table. A processor classifies the plurality of text separators into a number of target clusters comprised in a target group based on property information related to the plurality of text separators, the number of target clusters corresponding to a number of separator types. A processor provides indication information to indicate respective separator types of the plurality of text separators based on a result of the classifying.
-
Publication No.: US20240241880A1
Publication date: 2024-07-18
Application No.: US18619299
Filing date: 2024-03-28
Applicant: Pryon Incorporated
Inventor: Ellen Eide Kislal , David Nahamoo , Vaibhava Goel , Etienne Marcheret , Steven John Rennie , Chul Sung , Marie Wenzel Meteer
IPC: G06F16/2452 , G06F16/248 , G06V30/414
CPC classification number: G06F16/24522 , G06F16/248 , G06V30/414
Abstract: A question-answering system that receives a natural-language question includes a database to provide a basis for that answer and a structured-query generator that constructs a structured query from the question and uses it to obtain an answer to the question from the database.
-