-
Publication No.: US20200311413A1
Publication Date: 2020-10-01
Application No.: US16368304
Filing Date: 2019-03-28
Applicant: Konica Minolta Laboratory U.S.A., Inc.
Inventor: Yongmian ZHANG, Shubham AGARWAL
Abstract: Image processing is performed on an input image generated from scanning a filled-in document form. The input image is evaluated against blank versions of various document forms in order to identify the form type of the filled-in document form. The evaluation results in identifying one of the blank document forms as a match to the filled-in document form. Each document form has a set of keywords. The evaluation uses a vector of keyword matches in the filled-in document form. Once a blank document form is identified as a match, the filled-in document form may be categorized according to that document form, and/or data extracted from the filled-in document form may be stored in association with keywords of that document form.
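Illustrative sketch (not taken from the application): the keyword-vector evaluation described in the abstract could look roughly like the Python below. The abstract does not say how the vector is scored, so the coverage ratio used here is an assumption, as are the names `keyword_vector`, `identify_form`, and `FORMS` and the sample keyword sets. The input is assumed to be text already recovered from the scanned image by OCR.

```python
def keyword_vector(text, keywords):
    """Return a 0/1 vector marking which of the form's keywords occur in the text."""
    lowered = text.lower()
    return [1 if kw.lower() in lowered else 0 for kw in keywords]


def identify_form(filled_text, form_keywords):
    """Return (form_name, score) for the blank form whose keywords are best covered.

    Scoring by the fraction of matched keywords is an assumption made for
    this sketch; the abstract only states that a vector of keyword matches is used.
    """
    best_name, best_score = None, -1.0
    for name, keywords in form_keywords.items():
        vec = keyword_vector(filled_text, keywords)
        score = sum(vec) / len(keywords) if keywords else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical blank forms and their keyword sets.
FORMS = {
    "invoice": ["invoice number", "bill to", "amount due"],
    "job_application": ["position applied for", "employment history", "references"],
}

# Hypothetical OCR output from a filled-in form.
ocr_text = "Invoice Number: 1042\nBill To: Acme Corp\nAmount Due: $310.00"

form_name, score = identify_form(ocr_text, FORMS)
```

Once `form_name` is known, the extracted field values could be stored under that form's keywords, as the abstract suggests.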
-
Publication No.: US20200311411A1
Publication Date: 2020-10-01
Application No.: US16368312
Filing Date: 2019-03-28
Applicant: Konica Minolta Laboratory U.S.A., Inc.
Inventor: Shubham AGARWAL, Yongmian ZHANG
Abstract: A text recognition method and system involves computing a text matching score between an input text and an output candidate text. The text matching score is computed by evaluating respective N-grams of the input text and the output candidate text. The N-grams are compared in pairs for visual similarity by determining N-gram pair scores, which are used to compute the text matching score. The N-gram pair scores are determined using a set of probabilities of confusion between characters contained in the N-grams. The described approach can address inconsistent results that arise from conventional text similarity quantifiers.
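Illustrative sketch (not taken from the application): one way to read the N-gram scoring in the abstract is shown below in Python. The confusion probabilities in `CONFUSION`, the product rule inside `ngram_pair_score`, and the best-match averaging in `text_matching_score` are all assumptions chosen for this example; the abstract states only that N-gram pair scores are derived from character confusion probabilities and combined into a text matching score.

```python
# Hypothetical probabilities that an OCR engine confuses two characters.
CONFUSION = {("0", "O"): 0.8, ("1", "l"): 0.7, ("5", "S"): 0.6}


def char_similarity(a, b):
    """Visual similarity of two characters: 1.0 if equal, else the confusion probability."""
    if a == b:
        return 1.0
    return CONFUSION.get((a, b), CONFUSION.get((b, a), 0.0))


def ngrams(text, n):
    """Return all contiguous substrings of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_pair_score(g1, g2):
    """Score a pair of equal-length N-grams as the product of per-character similarities (assumed rule)."""
    score = 1.0
    for a, b in zip(g1, g2):
        score *= char_similarity(a, b)
    return score


def text_matching_score(input_text, candidate, n=2):
    """Average, over the input N-grams, of the best pair score against any candidate N-gram (assumed rule)."""
    in_grams = ngrams(input_text, n)
    cand_grams = ngrams(candidate, n)
    if not in_grams or not cand_grams:
        return 0.0
    total = sum(max(ngram_pair_score(g, c) for c in cand_grams) for g in in_grams)
    return total / len(in_grams)
```

Under these assumptions, `text_matching_score("R0OM", "ROOM")` is higher than `text_matching_score("RXOM", "ROOM")`, because "0" and "O" are listed as visually confusable while "X" and "O" are not. A plain edit distance would rate both inputs the same.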
-