摘要:
System for generating a summary of a plurality of documents is provided. The system includes a computer readable document collection containing a plurality of related documents stored in electronic form therein, a plurality of forms of multiple document summarization engines, and a router for determining a temporal relationship of at least a subset of the documents in the collection and selecting one of the plurality of forms of multiple document summarization engines for generating a summary of the subset of documents based on the temporal relationship.
摘要:
Computer-based method of generating a summary of one or more documents comprises identifying content including text having a measurable quality from a predetermined location, evaluating the content, using a computer processor, to determine whether the content represents a document of interest, and preparing a summary of the content if the content represents document of interest. A computer-based method of generating a summary of one or more documents, each including two or more sentences, is also provided.
摘要:
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided which includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group documents into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided which generate summaries for specific classes of multiple documents clusters. A summarizer router is employed to determining a relationship of the documents in a cluster and select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, related images.
摘要:
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided which includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group documents into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided which generate summaries for specific classes of multiple documents clusters. A summarizer router is employed to determining a relationship of the documents in a cluster and select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, related images.
摘要:
Methods and apparatus are disclosed for classifying the relative position of one or more text messages (including transcribed audio messages) in a related thread of text messages. One or more classifiers are applied to the text messages; and a classification of the text messages is obtained that indicates the relative position of the text messages in the thread. For example, a thread can include a root message, a leaf message and one or more inner messages, and the classification can indicate whether each text message is a root message, a leaf message or an inner message. The classifiers are trained on a set of training messages that have been previously classified to indicate a relative position of each training message in a corresponding thread. The classifiers employ one or more features that help to distinguish between root and non-root messages.
摘要:
A method and apparatus are provided for summarizing a text message, such as an email message or a transcribed audio message. A portion of each text message, such as a sentence, is extracted as an indicative summary of the text message based on a degree of overlap of words in the sentence with a set of words, such as words in the message subject or words in a related root message. The extracted portion is based on a score for each portion of the text message, such as a sentence. An interface is also provided for presenting the indicative summaries of a set of related text messages to a user.