Abstract:
A computer-based method of generating a summary of one or more documents comprises identifying, from a predetermined location, content including text having a measurable quality; evaluating the content, using a computer processor, to determine whether the content represents a document of interest; and preparing a summary of the content if the content represents a document of interest. A computer-based method of generating a summary of one or more documents, each including two or more sentences, is also provided.
Abstract:
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided. The system includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group them into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided, each of which generates summaries for a specific class of multiple document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and to select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and which relate to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
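The pre-processing step that groups documents into clusters could be sketched as follows. This is a minimal illustration only: the abstract does not say how clusters are formed, so the word-overlap (Jaccard) measure, the greedy single-pass strategy, the `threshold` value, and all function names are assumptions introduced here.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two word sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(documents, threshold=0.3):
    """Greedy single-pass clustering (assumed strategy, not from the abstract).

    Each document joins the first cluster whose seed document is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []
    for i, text in enumerate(documents):
        words = tokens(text)
        for members in clusters:
            seed_words = tokens(documents[members[0]])
            if jaccard(words, seed_words) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters
```

Each resulting cluster would then be handed to the summarizer router, which is outside the scope of this sketch.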
Abstract:
A summary for a collection of related documents can be generated by extracting phrases from the documents which include common focus elements. Phrase intersection analysis is then performed on the extracted phrases to generate a phrase intersection table, where identical or equivalent phrases are identified. Temporal processing on the phrases in the phrase intersection table is performed to remove ambiguous time references and to sort the phrases in a temporal sequence. Sentence generation is then used to combine the phrases in the phrase intersection table into a coherent summary.
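The phrase intersection and temporal sorting steps might look like the sketch below. It assumes phrases have already been extracted and dated; the `normalize` function is a crude stand-in for the "identical or equivalent" test, and the final joining step is a placeholder for sentence generation. All names and the input layout are hypothetical.

```python
import string
from collections import defaultdict


def normalize(phrase):
    """Crude equivalence key: lowercase, drop punctuation, collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(phrase.lower().translate(table).split())


def phrase_intersection(docs):
    """Build a phrase intersection table.

    docs: {doc_id: [(phrase_text, date), ...]} where date is a datetime.date.
    Returns rows (key, earliest_date, sorted_doc_ids) for phrases that occur
    in at least two documents, sorted in temporal order.
    """
    table = defaultdict(lambda: {"docs": set(), "when": None})
    for doc_id, phrases in docs.items():
        for text, when in phrases:
            entry = table[normalize(text)]
            entry["docs"].add(doc_id)
            if entry["when"] is None or when < entry["when"]:
                entry["when"] = when
    shared = [
        (key, entry["when"], sorted(entry["docs"]))
        for key, entry in table.items()
        if len(entry["docs"]) >= 2
    ]
    return sorted(shared, key=lambda row: (row[1], row[0]))


def summarize(docs):
    """Placeholder for sentence generation: one dated clause per shared phrase."""
    rows = phrase_intersection(docs)
    return " ".join(f"On {when.isoformat()}, {key}." for key, when, _ in rows)
```

Attaching an explicit date to every phrase is one simple way to model the removal of ambiguous time references; a fuller treatment would resolve expressions such as "yesterday" against each document's publication date.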
Abstract:
A system for generating a summary of a plurality of documents is provided. The system includes a computer readable document collection containing a plurality of related documents stored in electronic form therein, a plurality of forms of multiple document summarization engines, and a router for determining a temporal relationship of at least a subset of the documents in the collection and selecting one of the plurality of forms of multiple document summarization engines for generating a summary of the subset of documents based on the temporal relationship.
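A router of this kind could be sketched as below. The abstract does not define the temporal relationship or the engines, so this sketch assumes the relationship is the span in days between the earliest and latest document, and uses two hypothetical placeholder engines and an arbitrary `max_span_days` threshold.

```python
def single_event_engine(docs):
    """Hypothetical engine for documents published close together in time."""
    return f"single-event summary of {len(docs)} documents"


def extended_period_engine(docs):
    """Hypothetical engine for documents spread over a longer period."""
    return f"extended-period summary of {len(docs)} documents"


def select_engine(docs, max_span_days=1):
    """Pick an engine from the temporal spread of the documents.

    docs: non-empty list of dicts, each with a "date" key holding a
    datetime.date. The span threshold is an assumption, not a value
    taken from the abstract.
    """
    dates = [doc["date"] for doc in docs]
    span_days = (max(dates) - min(dates)).days
    if span_days <= max_span_days:
        return single_event_engine
    return extended_period_engine


def summarize_subset(docs, max_span_days=1):
    """Route the subset to the selected engine and return its summary."""
    engine = select_engine(docs, max_span_days)
    return engine(docs)
```

Returning the engine function itself, rather than a name, keeps the routing decision separate from the summarization call and makes the router easy to test in isolation.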
Abstract:
A “domain-general” method for topical segmentation of a document input includes the steps of: extracting one or more selected terms from a document; linking occurrences of the extracted terms based upon the proximity of similar terms; and assigning weighted scores to paragraphs of the document input corresponding to the linked occurrences. In accordance with the present invention, the values of the assigned scores depend upon the type of the selected terms, e.g., common noun, proper noun, or pronominal, and the position of the linked occurrences with respect to the paragraphs, e.g., front, during, or rear. Upon zero-sum normalization, the assigned scores represent the boundaries of the topical segments of the document input.
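The scoring and normalization steps can be illustrated with a small sketch. It assumes term extraction and linking have already produced, for each link, a paragraph index, a term type, and a position. The weight values in `WEIGHTS` are placeholders invented for illustration, and treating positive normalized scores as candidate boundaries is an assumption rather than something the abstract states.

```python
# Hypothetical weights keyed by (term type, position of the link in the paragraph).
WEIGHTS = {
    ("proper_noun", "front"): 4.0,
    ("proper_noun", "during"): -2.0,
    ("proper_noun", "rear"): 1.0,
    ("common_noun", "front"): 2.0,
    ("common_noun", "during"): -1.0,
    ("common_noun", "rear"): 1.0,
    ("pronominal", "front"): 1.0,
    ("pronominal", "during"): -1.0,
    ("pronominal", "rear"): 0.0,
}


def paragraph_scores(links, n_paragraphs, weights=WEIGHTS):
    """Sum the weight of every link that touches each paragraph.

    links: iterable of (paragraph_index, term_type, position) tuples.
    Unknown (type, position) pairs contribute nothing.
    """
    scores = [0.0] * n_paragraphs
    for paragraph, term_type, position in links:
        scores[paragraph] += weights.get((term_type, position), 0.0)
    return scores


def zero_sum(scores):
    """Shift the scores by their mean so that they sum to zero."""
    mean = sum(scores) / len(scores)
    return [score - mean for score in scores]


def boundaries(scores):
    """Assumed reading: paragraphs with a positive normalized score start a segment."""
    return [i for i, score in enumerate(zero_sum(scores)) if score > 0]
```

Subtracting the mean is the simplest normalization that makes the scores sum to zero; it leaves the ranking of paragraphs unchanged while giving a natural threshold at zero.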