FLOATING FORM PROCESSING BASED ON TOPOLOGICAL STRUCTURES OF DOCUMENTS
Abstract:
A method processes scanned floating forms by comparing topological structures of an empty form and a corresponding filled form. The topological structure of each form includes vertical and horizontal ordered sequences of text phrases in the form, which describe directional relationships among text phrases but not their distances. A partial structure alignment method is used in the comparison, where insertions of text in the topological structure of the filled form relative to that of the empty form are not penalized, but deletions and substitutions are penalized. Based on this comparison, some text phrases in the filled form are matched to those in the empty form, and unmatched text in the filled form is deemed user-filled data. The method further associates user-filled data with fields of the form based on a settings file of the empty form.
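The asymmetric penalty scheme described above can be illustrated with a small dynamic-programming sketch. This is not the claimed method: it is a generic edit-distance-style alignment over a single ordered sequence of phrases, whereas the abstract describes both vertical and horizontal sequences. The function name `align_partial`, the unit costs for deletion and substitution, and the exact-string comparison of phrases are all assumptions made for illustration.

```python
def align_partial(empty_seq, filled_seq, del_cost=1, sub_cost=1):
    """Align an empty-form phrase sequence with a filled-form sequence.

    Insertions (extra phrases in the filled form) cost nothing; deletions
    and substitutions are penalized. The cost values are hypothetical.
    Returns (total_cost, matched_index_pairs, unmatched_filled_phrases).
    """
    n, m = len(empty_seq), len(filled_seq)
    # cost[i][j]: minimum cost of aligning empty_seq[:i] with filled_seq[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * del_cost  # every empty-form phrase is missing
    # cost[0][j] stays 0: leading insertions are free

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = empty_seq[i - 1] == filled_seq[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else sub_cost),  # match / substitution
                cost[i - 1][j] + del_cost,                       # deletion
                cost[i][j - 1],                                  # free insertion
            )

    # Trace back to recover which filled-form phrases were matched.
    matched = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and empty_seq[i - 1] == filled_seq[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            matched.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif j > 0 and cost[i][j] == cost[i][j - 1]:
            j -= 1  # insertion: phrase appears only in the filled form
        elif (i > 0 and j > 0 and empty_seq[i - 1] != filled_seq[j - 1]
                and cost[i][j] == cost[i - 1][j - 1] + sub_cost):
            i, j = i - 1, j - 1  # substitution
        else:
            i -= 1  # deletion: empty-form phrase not found in the filled form
    matched.reverse()

    matched_filled = {fj for _, fj in matched}
    unmatched = [p for k, p in enumerate(filled_seq) if k not in matched_filled]
    return cost[n][m], matched, unmatched


# Hypothetical example: printed labels from the empty form, and the same
# labels interleaved with user entries in the filled form.
empty_form = ["Name", "Date of birth", "Address"]
filled_form = ["Name", "John Smith", "Date of birth", "1990-01-01",
               "Address", "12 Main St"]
total, pairs, user_data = align_partial(empty_form, filled_form)
```

In this example every label of the empty form is found in order in the filled form, so the alignment cost is 0 and `user_data` holds the three phrases that have no counterpart in the empty form. A real scanned form would likely need tolerant phrase comparison (for OCR errors) instead of exact string equality.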