Abstract:
A system for generating a summary of a plurality of documents is provided. The system includes a computer-readable document collection containing a plurality of related documents stored in electronic form, a plurality of forms of multiple-document summarization engines, and a router that determines a temporal relationship of at least a subset of the documents in the collection and, based on that temporal relationship, selects one of the summarization engines to generate a summary of the subset.
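The routing step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Document` class, the engine names `short_span_engine` and `long_span_engine`, and the three-day window are all hypothetical, and the "temporal relationship" is assumed here to be simply the spread between the earliest and latest publication dates in the subset.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Document:
    """Hypothetical document record; only the date matters for routing."""
    title: str
    published: date


def temporal_span(docs: List[Document]) -> timedelta:
    """Return the time between the earliest and latest document."""
    dates = [d.published for d in docs]
    return max(dates) - min(dates)


def select_engine(docs: List[Document],
                  window: timedelta = timedelta(days=3)) -> str:
    """Pick an engine name from the temporal spread of the documents.

    Assumption: documents published within `window` of each other go to
    one engine, and everything else goes to another. The names and the
    threshold are illustrative only.
    """
    if not docs:
        raise ValueError("cannot route an empty document subset")
    if temporal_span(docs) <= window:
        return "short_span_engine"
    return "long_span_engine"


close = [Document("a", date(2001, 1, 1)), Document("b", date(2001, 1, 2))]
spread = [Document("a", date(2001, 1, 1)), Document("c", date(2001, 3, 1))]
print(select_engine(close))
print(select_engine(spread))
```

A real router would likely weigh more than publication dates, but the sketch shows the basic shape: measure a relationship over the subset, then dispatch to one of several engines.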
Abstract:
A computer-based method of generating a summary of one or more documents comprises identifying content including text having a measurable quality from a predetermined location, evaluating the content, using a computer processor, to determine whether the content represents a document of interest, and preparing a summary of the content if the content represents a document of interest. A computer-based method of generating a summary of one or more documents, each including two or more sentences, is also provided.
Abstract:
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided. The system includes a computer-readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group them into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple-document summarization engines are provided, each of which generates summaries for a specific class of multiple-document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and to select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents that are closely related temporally and relate to a specific event. A dissimilarity engine for multiple-document summary generation is provided, which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
Abstract:
A system for automatically generating a dictionary from full text articles extracts pairs from full text articles and stores the pairs as dictionary entries. The system includes a computer readable corpus having a plurality of documents therein. A pattern processing module (120) and a grammar processing module (125) are provided for extracting pairs from the corpus and storing the pairs in a dictionary database (145). A routing processing module selectively routes sentences in the corpus to at least one of the pattern processing module or grammar processing module. In one embodiment, the routing module is incorporated into the pattern processing module which then selectively routes a portion of the sentences to the grammar processing module. A bootstrapping processing module (150) can be used to apply entries against the corpus to identify and extract additional entries.
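The routing between the pattern processing module and the grammar processing module can be sketched as follows. This is an illustration under stated assumptions, not the system described in the abstract: the abstract does not say what the extracted pairs contain, so the sketch assumes (term, definition) pairs. The regular expression `SIMPLE_PATTERN`, the function names, and the stubbed grammar module are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # assumed to be (term, definition)

# Hypothetical surface pattern: "<term> is a/an/the <definition>."
SIMPLE_PATTERN = re.compile(
    r"^(?P<term>[A-Za-z][\w -]*?) is (?:a|an|the) (?P<definition>.+?)\.?$"
)


def pattern_module(sentence: str) -> Optional[Pair]:
    """Extract a pair using the surface pattern, or return None."""
    match = SIMPLE_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return match.group("term"), match.group("definition")


def grammar_module(sentence: str) -> Optional[Pair]:
    """Stand-in for a parser-based extractor; intentionally a stub."""
    return None


def route(sentence: str) -> str:
    """Send sentences that fit the surface pattern to the pattern module
    and everything else to the grammar module."""
    return "pattern" if SIMPLE_PATTERN.match(sentence.strip()) else "grammar"


def build_dictionary(sentences: List[str]) -> Dict[str, str]:
    """Route each sentence, extract a pair, and keep the first entry
    seen for each term."""
    entries: Dict[str, str] = {}
    for sentence in sentences:
        module = pattern_module if route(sentence) == "pattern" else grammar_module
        pair = module(sentence)
        if pair is not None:
            entries.setdefault(pair[0], pair[1])
    return entries


corpus = [
    "Anemia is a condition marked by a low red blood cell count.",
    "Patients were treated promptly.",
]
print(build_dictionary(corpus))
```

A bootstrapping pass of the kind the abstract mentions would take the terms already in `entries` and search the corpus again for further sentences containing them; that step is left out of the sketch.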