Granted invention patent
US08037403B2 Apparatus, method, and computer program product for extracting structured document (status: in force)

Apparatus, method, and computer program product for extracting structured document
Abstract:
An apparatus for retrieving a structured document includes: a first specifying unit that specifies a plurality of object documents from among a plurality of structured documents accessible via a network, each object document being a structured document that satisfies a retrieval condition; a first extracting unit that extracts text included in each object document; a second extracting unit that extracts metadata appended to each object document, the metadata being first data concerning the text of the object document and second data indicating a link relation between the object document and the structured documents; and a first calculating unit that calculates an importance of each of the object documents based on the text and the metadata of that object document.
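The four units described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the structured documents are treated as HTML, the "first data" is read from a `<meta name="keywords">` tag, the "second data" is the set of `<a href>` links, the retrieval condition is a simple substring match, and the importance formula is a hypothetical weighted sum (the abstract does not state how importance is calculated). The names `Extracted`, `extract`, `rank`, and the weights `w_text`, `w_meta`, `w_link` are likewise hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Extracted:
    text: str = ""                                     # text included in the document
    keywords: list[str] = field(default_factory=list)  # assumed "first data" (about the text)
    links: list[str] = field(default_factory=list)     # assumed "second data" (link relations)


class _Extractor(HTMLParser):
    """Collects body text, keyword metadata, and outgoing links in one pass."""

    def __init__(self):
        super().__init__()
        self.result = Extracted()
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "keywords":
            content = attrs.get("content") or ""
            self.result.keywords = [
                k.strip().lower() for k in content.split(",") if k.strip()
            ]
        elif tag == "a" and attrs.get("href"):
            self.result.links.append(attrs["href"])

    def handle_data(self, data):
        self._chunks.append(data)


def extract(html: str) -> Extracted:
    """First and second extracting units: text plus appended metadata."""
    parser = _Extractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the text is one normalized string.
    parser.result.text = " ".join(" ".join(parser._chunks).split())
    return parser.result


def rank(documents, query, w_text=1.0, w_meta=2.0, w_link=0.5):
    """Specify object documents, then score each one (hypothetical formula)."""
    q = query.lower()
    extracted = {url: extract(html) for url, html in documents.items()}

    # Specifying unit: keep documents that satisfy the retrieval condition.
    objects = [
        url for url, e in extracted.items()
        if q in e.text.lower() or q in e.keywords
    ]

    # Link relation: count inbound links from the other structured documents.
    inbound = {url: 0 for url in documents}
    for src, e in extracted.items():
        for target in set(e.links):
            if target in inbound and target != src:
                inbound[target] += 1

    # Calculating unit: combine text evidence and metadata evidence.
    scores = {}
    for url in objects:
        e = extracted[url]
        text_hits = e.text.lower().count(q)
        meta_hit = 1 if q in e.keywords else 0
        scores[url] = w_text * text_hits + w_meta * meta_hit + w_link * inbound[url]

    # Highest importance first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `rank(documents, "xml")` with a dictionary that maps URLs to HTML strings returns `(url, importance)` pairs for the matching documents only. The weighted sum is just one plausible way to combine the text with the two kinds of metadata; the weights would need tuning for any real collection.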