Invention Application
US20060036649A1 Index extraction from documents (Under examination - Published)

Index extraction from documents
Abstract:
Systems, methods, and programs embodied in a computer readable medium are provided for index extraction. A plurality of ground truth documents are stored in a database, the ground truth documents being organized in a plurality of classifications. Attempts are made to automatically extract indices from a document based upon a classification associated with the document. The document is reclassified from a first one of the classifications to a second one of the classifications during the course of the automated extraction of the indices by drawing an association between the document and at least one of the ground truth documents. The indices are manually extracted from the document upon a failure to automatically extract the indices. The document is stored in the database as one of the ground truth documents if the indices are manually extracted.
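The workflow in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the method claimed in the application. The regex patterns, the word-overlap similarity measure, and every name (`GroundTruthStore`, `auto_extract`, `extract_indices`, `manual_extract`) are assumptions made for the example. The abstract does not say how documents are associated with ground truth documents or how indices are located.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class GroundTruthDoc:
    text: str
    classification: str
    indices: dict


@dataclass
class GroundTruthStore:
    """Stand-in for the database of ground truth documents (hypothetical)."""
    docs: list = field(default_factory=list)

    def nearest(self, text: str) -> Optional[GroundTruthDoc]:
        # Assumption: "association" is modelled as simple word overlap.
        words = set(text.lower().split())
        best, best_score = None, 0
        for doc in self.docs:
            score = len(words & set(doc.text.lower().split()))
            if score > best_score:
                best, best_score = doc, score
        return best


# Hypothetical per-classification extraction rules.
PATTERNS = {
    "invoice": {"number": r"Invoice\s+#(\d+)", "total": r"Total:\s*\$([\d.]+)"},
    "receipt": {"number": r"Receipt\s+#(\d+)", "total": r"Paid:\s*\$([\d.]+)"},
}


def auto_extract(text: str, classification: str) -> Optional[dict]:
    """Return every index for the classification, or None if any is missing."""
    patterns = PATTERNS.get(classification)
    if patterns is None:
        return None
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            return None
        found[name] = match.group(1)
    return found


def extract_indices(text: str, classification: str, store: GroundTruthStore,
                    manual_extract: Callable[[str], dict]):
    # 1. Attempt automatic extraction under the initial classification.
    indices = auto_extract(text, classification)
    if indices is not None:
        return classification, indices, "automatic"

    # 2. Reclassify by association with the closest ground truth document.
    neighbour = store.nearest(text)
    if neighbour is not None and neighbour.classification != classification:
        classification = neighbour.classification
        indices = auto_extract(text, classification)
        if indices is not None:
            return classification, indices, "automatic after reclassification"

    # 3. Fall back to manual extraction and keep the result as ground truth.
    #    Assumption: the document keeps whatever classification it has here.
    indices = manual_extract(text)
    store.docs.append(GroundTruthDoc(text, classification, indices))
    return classification, indices, "manual"
```

In this sketch, a document wrongly labelled "invoice" that resembles a stored "receipt" would be reclassified and extracted automatically. A document that matches nothing would go to the `manual_extract` callback and then join the store, so later similar documents can be associated with it.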