Abstract:
The method of the present invention combines concept searching, document ranking, high speed and efficiency, browsing capabilities, "intelligent" hypertext, document routing, and summarization (machine abstracting) in an easy-to-use implementation. It also offers Boolean and statistical query options. The method is based on "concept indexing": an index of "word senses" rather than just words. It builds its concept index from a "semantic network" of word relationships, with word definitions drawn from one or more standard human-language dictionaries. During query construction, users may select the meaning of a word from the dictionary, or may allow the method to disambiguate words based on semantic and statistical evidence of meaning. This results in a measurable improvement in precision and recall. Search results are retrieved and displayed in ranked order. The ranking process is more sophisticated than that of prior-art ranking systems because it takes linguistics and concepts, as well as statistics, into account.
Abstract:
A method and an apparatus for automatically producing a concise abstract of a document, with correct meaning and precisely indicative of the document's content. The method includes the steps of: listing hint words, which are preselected words indicating the presence of significant phrases that reflect the content of the document; searching the document for all the hint words; extracting the sentences of the document in which any of the listed hint words is found by the search; and producing an abstract of the document by juxtaposing the extracted sentences. An apparatus for performing this method is also disclosed.
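The four steps listed in this abstract (list hint words, search, extract matching sentences, juxtapose them) can be illustrated with a minimal sketch. The hint words, the naive sentence splitter, and the function names below are assumptions made for illustration; the abstract does not specify any of them.

```python
import re

# Hypothetical hint words; the abstract does not list the actual ones.
HINT_WORDS = ["in conclusion", "we propose", "results show"]

def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def make_abstract(text, hint_words=HINT_WORDS):
    # Keep each sentence that contains at least one hint word, in document order.
    picked = [
        s for s in split_sentences(text)
        if any(h in s.lower() for h in hint_words)
    ]
    # "Juxtapose" the extracted sentences to form the abstract.
    return " ".join(picked)

doc = "We study parsing. We propose a new method. It is fast. Results show a gain."
print(make_abstract(doc))
```

Matching here is a plain case-insensitive substring test, which is the simplest reading of "searching all the hint words in the document"; a real system might match on word boundaries instead.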
Abstract:
A method for automatically abstracting a document in machine-readable form, consisting of: storing in a dictionary memory (8) language terms commonly used in document preparation; comparing language terms from an input document received from an input register (16) with the stored language terms; selecting language terms from the input document which do not compare; selecting language terms from the input document which compare; coding the selected language terms with the identity of the input document; and storing the language terms in memory (12). When retrieving a document from storage, the processor (10), under the control of instruction memory (14), compares the words in an input query against the word index file in memory (12) and provides in register (18) the selected documents whose identification code corresponds to the highest retrieval value, calculated using each identification code of each language term that compares.
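A rough sketch of the indexing and retrieval flow described in this abstract follows. It rests on several assumptions not stated in the source: only terms absent from the common-term dictionary are indexed, the "retrieval value" is taken to be the number of query terms a document matches, and the dictionary contents and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical dictionary of commonly used terms (stands in for memory (8)).
COMMON_TERMS = {"the", "a", "of", "and", "is", "in", "to"}

def index_document(index, doc_id, text):
    # Index only terms that do not match the common-term dictionary,
    # tagging each term with the identity of the document.
    for term in text.lower().split():
        if term not in COMMON_TERMS:
            index[term].add(doc_id)

def retrieve(index, query):
    # Assumed retrieval value: number of distinct query terms found per document.
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    if not scores:
        return []
    best = max(scores.values())
    # Return every document that reaches the highest retrieval value.
    return sorted(d for d, s in scores.items() if s == best)

index = defaultdict(set)
index_document(index, "d1", "the indexing of patent abstracts")
index_document(index, "d2", "retrieval of patent claims")
print(retrieve(index, "patent abstracts"))
```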
Abstract:
Relevance-optimized representative content associated with a data storage system is disclosed. One example is a system including a data summarization module, a clustering module, and a representative content selection module. The data summarization module associates, via a processor, each data object in a storage system with a derived data object. The clustering module determines clusters of similar data objects based on a similarity between associated derived data objects, and selects a representative data object for each determined cluster. The representative content selection module selects representative content associated with the storage system, where the representative content is based on the data objects, the derived data objects, and the representative data objects, and optimizes the relevance of the selected representative content to an analytics application.
Abstract:
A system, a computer-readable storage medium storing instructions, and a computer-implemented method for determining sentiment expressed in documents are disclosed. A document is received from a plurality of documents. A sentence in the document that includes at least one sentiment signature within a predetermined distance of at least one keyword from a list of keywords is identified, wherein the list of keywords is extracted from the plurality of documents and is filtered using a phase transition formula, and wherein the at least one sentiment signature corresponds to an expression of at least one sentiment in the sentence. At least one category corresponding to the at least one keyword of the sentence is determined, wherein the at least one category is included in a list of categories that is generated using the list of keywords. At least one sentiment corresponding to the at least one category is determined based on the at least one sentiment signature.
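The proximity test at the core of this abstract (a sentiment signature within a predetermined distance of a keyword, then mapped to a category) can be sketched as below. This is an illustration under stated assumptions: distance is measured in tokens, the keyword-to-category and signature-to-sentiment tables are hypothetical, and the phase transition filtering of the keyword list is not modeled at all.

```python
def find_sentiment(sentence, keyword_categories, signatures, max_distance=2):
    # Tokenize naively on whitespace and strip simple punctuation (an assumption).
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    hits = []
    for i, tok in enumerate(tokens):
        if tok not in keyword_categories:
            continue
        for j, other in enumerate(tokens):
            # A signature counts only if it lies within max_distance tokens.
            if other in signatures and abs(i - j) <= max_distance:
                hits.append((keyword_categories[tok], signatures[other]))
    return hits

# Hypothetical lookup tables, for illustration only.
categories = {"battery": "power", "screen": "display"}
signatures = {"great": "positive", "dull": "negative"}

print(find_sentiment("The battery is great but the screen looks dull.",
                     categories, signatures))
```

With a distance of 2 tokens, "great" attaches to "battery" but not to "screen", which is what keeps each sentiment bound to its nearest category in this example.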
Abstract:
Methods and systems to summarize a source text as a function of contextual information, including to fit a summary within a context-based allotted time. The context-based allotted time may be apportioned amongst multiple portions of the source text, such as by relevance. The context-based allotted time and/or relevance may be user-specified and/or determined, such as by look-up, rule, computation, inference, and/or machine learning. During summary presentation, one or more portions of the source text may be re-summarized, such as to adjust a level of detail. A presentation rate may be user-controllable. Where new and/or changed contextual information affects an available time to review a remaining portion of the summary, the summary presentation may be automatically adjusted, and/or one or more portions of the source text may be re-summarized based on a revised context-based allotted time.
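One simple way to apportion a context-based allotted time amongst portions of a source text by relevance is a proportional split, sketched below. The proportional rule, the reading-rate conversion, and all names and numbers are assumptions for illustration; the abstract leaves the apportioning method open (look-up, rule, computation, inference, or machine learning).

```python
def apportion_time(total_seconds, relevance):
    # Split the allotted time in proportion to each portion's relevance weight.
    total = sum(relevance.values())
    if total <= 0:
        raise ValueError("relevance weights must sum to a positive value")
    return {name: total_seconds * w / total for name, w in relevance.items()}

def word_budget(seconds, words_per_minute=150):
    # Convert a time slice into a summary length at an assumed presentation rate.
    return int(seconds * words_per_minute / 60)

# Hypothetical portions and weights.
slices = apportion_time(60, {"intro": 1, "results": 3})
print(slices)
print({name: word_budget(sec) for name, sec in slices.items()})
```

If the available time changes mid-presentation, calling `apportion_time` again with the remaining seconds and the remaining portions gives a revised budget for re-summarizing.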
Abstract:
In the field of government engagement management, users of an employee desktop web client can now search and read, within the web client application, articles and/or knowledge content that has been authored to external locations. Owing to this integration with external, third-party applications, content and/or articles can be displayed to an agent on the graphical user interface of the employee desktop web client. Agents can enter free text into a specific search field, review the results in summary form, and then select an article in HTML format to progress the current interaction with the client. This functionality adds value to the agent experience and enables the agent to provide an improved service to the end client. Results may also be filtered by the search engine. Moreover, this system and method improves the operation of the computer, in that a computer running such a system in the past was not able to integrate in such a fashion in a web client format. This system and method also enables an agent to handle calls with the web client more efficiently, and allows agents on the web client to automatically classify.
Abstract:
An entity-based summary of an electronic book (e-book) is presented to a user of a client device. The e-book to be summarized is identified and multiple entities, e.g., characters, events and dates, referenced in the identified e-book are also identified. A computer server is adapted to determine a type of the e-book to be summarized and to identify one or more external data sources based on the determined type of the e-book, where an external data source provides information about entities in the identified e-book. Upon receiving a request for an entity-based summary of the e-book from the client device, the computer server is adapted to generate an entity-based summary of the e-book, which describes identified entities referenced in a range of the e-book specified in the request. The generated entity-based summary is presented to the client device responsive to the request.