Abstract:
Some embodiments are directed to identifying semantic properties of documents using free-text annotations associated with the documents. Semantic properties of documents may be identified by using a model that is trained on a corpus of training documents, where one or more of the training documents may include free-text annotations. In some embodiments, the model may identify semantic topics expressed only in free-text annotations or only in the body of a document. The model may be applied to identify semantic topics associated with a work document or to summarize the semantic topics present in a plurality of work documents.
Abstract:
A system and method that enable a plurality of lay users to collaborate on automating computer tasks are disclosed. In one embodiment, the system automatically performs these tasks, rather than just documenting how to perform them. The system allows a database of solutions to be built for every important computer task. A key characteristic of this system is that users contribute to this database by simply performing the task. The system records the graphical user interface (GUI) actions as the user performs the task. It aggregates GUI traces from multiple users into a canonical sequence of GUI actions, parameterized by user environment, that will successfully accomplish the task on a variety of different configurations. A classifier predicts which steps are likely to be misinterpreted and requests human intervention to perform them properly. This process can be done iteratively until the translation is believed to be correct.
Abstract:
A system for generating a summary of a plurality of documents is provided. The system includes a computer-readable document collection containing a plurality of related documents stored in electronic form, a plurality of forms of multiple-document summarization engines, and a router that determines a temporal relationship of at least a subset of the documents in the collection and, based on that temporal relationship, selects one of the summarization engines to generate a summary of the subset.
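The routing step described above can be illustrated with a minimal sketch. The abstract does not say how the temporal relationship is measured or which engines exist, so everything here is an assumption made for illustration: the engine functions, the `route` helper, the `window_days` threshold, and the use of the publication-date span as the temporal relationship are all hypothetical.

```python
from datetime import date

# Hypothetical stand-ins for two forms of multiple-document summarization engine.
def single_event_engine(docs):
    return "single-event summary of %d documents" % len(docs)

def multi_event_engine(docs):
    return "multi-event summary of %d documents" % len(docs)

def route(docs, window_days=2):
    """Select an engine from the time span covered by the documents.

    Assumption: the temporal relationship is the number of days between
    the earliest and latest document dates; window_days is an arbitrary
    illustrative threshold, not a value taken from the source.
    """
    dates = [d["date"] for d in docs]
    span = (max(dates) - min(dates)).days
    return single_event_engine if span <= window_days else multi_event_engine

def summarize(docs):
    # The router selects the engine; the selected engine produces the summary.
    engine = route(docs)
    return engine(docs)

close = [{"date": date(2001, 5, 2)}, {"date": date(2001, 5, 3)}]
spread = [{"date": date(2001, 5, 2)}, {"date": date(2001, 6, 20)}]
print(summarize(close))
print(summarize(spread))
```

Keeping the selection logic in `route` separate from the engines means a different temporal criterion can be swapped in without touching any engine.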
Abstract:
A computer-based method of generating a summary of one or more documents comprises identifying content including text having a measurable quality from a predetermined location, evaluating the content, using a computer processor, to determine whether the content represents a document of interest, and preparing a summary of the content if the content represents a document of interest. A computer-based method of generating a summary of one or more documents, each including two or more sentences, is also provided.
Abstract:
A system is provided for generating a summary of a plurality of documents and presenting the summary information to a user. The system includes a computer-readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group them into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple-document summarization engines are provided, which generate summaries for specific classes of multiple-document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and to select one of the document summarization engines for use in generating a summary of the cluster. A single-event engine is provided to generate summaries of documents that are closely related in time and pertain to a specific event. A dissimilarity engine for multiple-document summary generation is provided, which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
Abstract:
A summary for a collection of related documents can be generated by extracting phrases from the documents which include common focus elements. Phrase intersection analysis is then performed on the extracted phrases to generate a phrase intersection table, where identical or equivalent phrases are identified. Temporal processing on the phrases in the phrase intersection table is performed to remove ambiguous time references and to sort the phrases in a temporal sequence. Sentence generation is then used to combine the phrases in the phrase intersection table into a coherent summary.
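The pipeline above (phrase extraction, phrase intersection, temporal ordering, sentence generation) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the method of the abstract: phrases are assumed to be already extracted and already tagged with an unambiguous date, equivalence is approximated by a hypothetical `normalize` helper that ignores case and whitespace, and sentence generation is reduced to a fixed template.

```python
from collections import defaultdict
from datetime import date

def normalize(phrase):
    # Hypothetical equivalence test: case- and whitespace-insensitive match.
    return " ".join(phrase.lower().split())

def phrase_intersection(docs):
    """Build a phrase intersection table and return shared phrases in date order.

    docs is a list of (doc_id, [(phrase, date), ...]) pairs. A phrase is kept
    only if an equivalent phrase appears in at least two documents.
    """
    table = defaultdict(lambda: {"docs": set(), "date": None, "text": None})
    for doc_id, phrases in docs:
        for text, when in phrases:
            row = table[normalize(text)]
            row["docs"].add(doc_id)
            if row["text"] is None:
                row["text"] = text  # keep the first surface form seen
            if row["date"] is None or when < row["date"]:
                row["date"] = when  # keep the earliest date seen
    shared = [row for row in table.values() if len(row["docs"]) >= 2]
    return sorted(shared, key=lambda row: row["date"])

def summarize(docs):
    # Template-based stand-in for the sentence generation step.
    rows = phrase_intersection(docs)
    return " ".join("On %s, %s." % (r["date"].isoformat(), r["text"]) for r in rows)

docs = [
    ("a", [("the dam failed", date(2001, 5, 2)),
           ("Rescue teams arrived", date(2001, 5, 3))]),
    ("b", [("Rescue  teams arrived", date(2001, 5, 3)),
           ("The dam failed", date(2001, 5, 2)),
           ("officials commented", date(2001, 5, 4))]),
]
print(summarize(docs))
```

The phrase "officials commented" is dropped because it appears in only one document, which is the effect the intersection step is meant to have.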