Invention Application
- Patent Title: Index extraction from documents
- Patent Title (Chinese): Index extraction from documents
- Application No.: US10916877
- Application Date: 2004-08-12
- Publication No.: US20060036614A1
- Publication Date: 2006-02-16
- Inventor: Steven Simske, David Wright
- Applicant: Steven Simske, David Wright
- Main IPC: G06F7/00
- IPC: G06F7/00

Abstract:
Systems, methods, and programs embodied in a computer readable medium are provided for index extraction. Stored in a database are ground truth documents that are organized according to a plurality of classifications, each classification having a group of predefined indices. A document to be indexed is classified by drawing an association between the document and one of the classifications. An attempt is made to extract from the document at least a subset of the group of predefined indices associated with the one of the classifications. Upon a failure to extract the subset of the group of predefined indices, attempts are made to find and correct at least one text recognition error in the document based upon a salient dictionary associated with the one of the classifications.
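The flow described in the abstract (classify, attempt extraction, then correct recognition errors and retry) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the `CLASSES` table stands in for the database of ground truth documents, predefined indices are modeled as regular expressions, classification is a simple word-overlap score, "failure" means any predefined index is missing, and correction uses fuzzy matching from Python's standard `difflib` module.

```python
import difflib
import re

# Hypothetical classifications: each has a salient dictionary and a group of
# predefined indices, modeled here as regular expressions (an assumption).
CLASSES = {
    "invoice": {
        "salient": ["invoice", "number", "total", "date"],
        "indices": {
            "invoice_number": r"invoice number:\s*(\S+)",
            "total": r"total:\s*(\S+)",
        },
    },
    "letter": {
        "salient": ["dear", "sincerely", "regards"],
        "indices": {"recipient": r"dear\s+(\w+)"},
    },
}


def classify(text):
    """Pick the class whose salient words overlap most with the document."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return max(CLASSES, key=lambda name: len(words & set(CLASSES[name]["salient"])))


def extract(text, indices):
    """Return the indices found in the text; missing ones are omitted."""
    found = {}
    for name, pattern in indices.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found[name] = match.group(1)
    return found


def correct(text, salient):
    """Replace alphabetic tokens that closely match a salient word."""
    def fix(match):
        token = match.group(0)
        if token.lower() in salient:
            return token  # already a salient word, leave it untouched
        close = difflib.get_close_matches(token.lower(), salient, n=1, cutoff=0.8)
        return close[0] if close else token
    return re.sub(r"[A-Za-z]+", fix, text)


def index_document(text):
    """Classify, try to extract, and on failure correct the text and retry."""
    label = classify(text)
    spec = CLASSES[label]
    found = extract(text, spec["indices"])
    if len(found) < len(spec["indices"]):
        # At least one predefined index is missing: correct and retry.
        found = extract(correct(text, spec["salient"]), spec["indices"])
    return label, found


# "lnvoice" simulates a text recognition error (lowercase L read for "i").
label, found = index_document("lnvoice number: A123\nTotal: 45.00")
print(label, found)
```

In this sketch the first extraction pass finds only the total, because the misrecognized word breaks the invoice-number pattern. The retry succeeds once the salient dictionary for the chosen class maps `lnvoice` back to `invoice`.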
Public/Granted literature
- US08805803B2 Index extraction from documents (Public/Granted date: 2014-08-12)