Abstract:
Methods and computer-readable media for associating words or groups of words distilled from content, such as reported speech or an attitude report, of a document to form semantic relationships collectively used to generate a semantic representation of the content are provided. Semantic representations may include elements identified or parsed from a text portion of the content, the elements of which may be associated with other elements that share a semantic relationship, such as an agent, location, or topic relationship. Relationships may also be developed by associating one element that is in relation to, or is about, another element, thereby allowing for rapid and effective comparison of associations found in a semantic representation with associations derived from queries. The semantic relationships may be determined based on semantic information, such as potential meanings and grammatical functions of each element within the text portion of the content.
Abstract:
Technologies are described herein for generating a semantic translation rule to support natural language search. In one method, a first expression and a second expression are received. A first representation is generated based on the first expression, and a second representation is generated based on the second expression. Aligned pairs, each consisting of a first term in the first representation and a second term in the second representation, are determined. For each aligned pair, the first term and the second term are replaced with a variable associated with the aligned pair. Word facts that occur in both the first representation and the second representation are removed from both representations. The remaining word facts in the first representation are replaced with a broader representation of the word facts. A translation rule comprising the first representation, an operator, and the second representation is then generated.
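The steps in this abstract can be illustrated with a minimal Python sketch. The abstract does not specify a data format, so everything below is an assumption made for illustration: facts are plain tuples, word facts take the form ("word", term, lemma), role facts take the form ("role", relation, head, dependent), the operator is written "==>", and the `broader` lookup table stands in for whatever broader representation a real system would use.

```python
def substitute(facts, mapping):
    """Replace term identifiers with variables, leaving lemmas and relation names alone."""
    out = []
    for fact in facts:
        if fact[0] == "word":
            kind, term, lemma = fact
            out.append((kind, mapping.get(term, term), lemma))
        else:
            kind, rel, head, dep = fact
            out.append((kind, rel, mapping.get(head, head), mapping.get(dep, dep)))
    return out


def make_rule(first, second, aligned, broader):
    """Build (first, operator, second) from two representations and their aligned terms."""
    # Step 1: one shared variable per aligned pair.
    map1, map2 = {}, {}
    for i, (t1, t2) in enumerate(aligned):
        var = f"?x{i}"
        map1[t1] = var
        map2[t2] = var
    first = substitute(first, map1)
    second = substitute(second, map2)

    # Step 2: drop word facts that occur in both representations.
    words1 = {f for f in first if f[0] == "word"}
    words2 = {f for f in second if f[0] == "word"}
    shared = words1 & words2
    first = [f for f in first if f not in shared]
    second = [f for f in second if f not in shared]

    # Step 3: broaden the word facts that remain in the first representation.
    first = [
        (f[0], f[1], broader.get(f[2], f[2])) if f[0] == "word" else f
        for f in first
    ]
    return (first, "==>", second)


# Hypothetical example: "Alice purchased a car" paired with "Alice owns a car".
first = [
    ("role", "subj", "e1", "p1"),
    ("role", "obj", "e1", "c1"),
    ("word", "e1", "purchase"),
    ("word", "p1", "Alice"),
    ("word", "c1", "car"),
]
second = [
    ("role", "subj", "e2", "p2"),
    ("role", "obj", "e2", "c2"),
    ("word", "e2", "own"),
    ("word", "p2", "Alice"),
    ("word", "c2", "car"),
]
aligned = [("e1", "e2"), ("p1", "p2"), ("c1", "c2")]
rule = make_rule(first, second, aligned, broader={"purchase": "acquire"})
```

In this example the facts about "Alice" and "car" are identical after substitution and are removed, so the resulting rule relates only the variable-bound role structure and the two differing verbs.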
Abstract:
Techniques are provided to determine service data features from an archive of web service transactions. Data features for functionally identical classes of service are determined. Differentiating data feature patterns uniquely identifying each service within the class are learned using machine learning, clustering, statistical analysis and the like. A service map associating services with the differentiating patterns is determined. The service map contains data feature patterns that differentiate among otherwise functionally identical services. The data features are optionally associated with past usage, objective and subjective service quality measurements and the like. The data features of a received service request are compared to the differentiating patterns in the service map. The service associated with the differentiating patterns matching the data features of the service request is selected. The data features of the service request may include, but are not limited to, document language, document genre, number of words or characters, type of images, subject matter of images and the like.
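The final selection step can be sketched in Python under some loud assumptions. The sketch covers only the lookup against an existing service map, not the learning of differentiating patterns. The dictionary layout, the exact-match comparison, and the service and feature names are all hypothetical; the abstract does not say how patterns are stored or matched.

```python
def select_service(request_features, service_map):
    """Return the first service whose differentiating pattern is fully
    matched by the request's data features, or None if nothing matches."""
    for service, pattern in service_map.items():
        if all(request_features.get(key) == value for key, value in pattern.items()):
            return service
    return None


# Hypothetical service map: two functionally identical recognition services
# told apart by the document genre they were observed to handle.
service_map = {
    "recognizer_a": {"language": "en", "genre": "invoice"},
    "recognizer_b": {"language": "en", "genre": "letter"},
}

request = {"language": "en", "genre": "letter", "word_count": 420}
chosen = select_service(request, service_map)
```

Features in the request that no pattern mentions, such as `word_count` here, are simply ignored by this matching rule.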
Abstract:
Techniques are provided for determining collaborative notes and automatically recognizing speech, handwriting and other types of information. Domain and optional actor/speaker information associated with the support information is determined. An initial automatic speech recognition model is determined based on the domain and/or actor information. The domain and/or actor/speaker language model is used to recognize text in the speech information associated with the support information. Presentation support information such as slides, speaker notes and the like is determined. The semantic overlap between the support information and the salient non-function words in the recognized text, and collaborative user feedback information, are used to determine relevancy scores for the recognized text. Grammaticality, well-formedness, self-referential integrity and other features are used to determine correctness scores. Suggested collaborative notes are displayed in the user interface based on the salient non-function words. User actions in the user interface determine feedback signals. Recognition models, such as automatic speech recognition and handwriting recognition models, are determined based on the feedback signals and the correctness and relevance scores.
Abstract:
Computer-readable media and a computer system for implementing a natural language search using fact-based structures and for generating such fact-based structures are provided. A fact-based structure is generated using a semantic structure, which represents information, such as text, from a document, such as a web page. Typically, a natural language parser is used to create a semantic structure of the information, and the parser identifies terms, as well as the relationship between the terms. A fact-based structure of a semantic structure allows for a linear structure of these terms and their relationships to be created, while also maintaining identifiers of the terms to convey the dependency of one fact-based structure on another fact-based structure. Additionally, synonyms and hypernyms are identified while generating the fact-based structure to improve the accuracy of the overall search.
Abstract:
A technique is provided for compressing texts such that referential integrity, sentence coherency, punctuation and readability are preserved. Sentence constituents are compressed based on the type of content, the informativity of the sentence constituent and the grammatical readability of the resultant sentence or phrase. Information content portions are parsed to generate part-of-speech tags. The informativity of the constituents in a phrase or sentence is determined, and the parts of speech having lower information content and a low effect on the grammatical readability of the phrase or sentence are selectively compressed. Parts of speech having successively higher informativity and a low effect on grammatical readability are selected for compression until the desired level of compression is reached. Compressed portions are indicated in the summary with a selectable placeholder which expands to display the compressed text.
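The successive-compression loop can be sketched in Python as follows. This is a simplified illustration, not the patented method: it assumes the text is already tokenized and tagged, it uses a fixed, hypothetical ordering of part-of-speech classes (adverbs first, then adjectives) in place of a real informativity and readability model, and it marks compressed tokens with a "[+]" placeholder string while returning the hidden words so a user interface could expand them.

```python
def compress(tagged, target_ratio, order=("ADV", "ADJ")):
    """Hide tokens class by class, least informative first, until the share
    of visible tokens is at or below target_ratio or the classes run out.

    tagged: list of (word, pos_tag) pairs, assumed to be pre-tagged.
    Returns (tokens_with_placeholders, {index: hidden_word}).
    """
    if not tagged:
        return [], {}
    total = len(tagged)
    hidden = set()
    for pos in order:
        if (total - len(hidden)) / total <= target_ratio:
            break  # desired level of compression reached
        hidden |= {i for i, (_, tag) in enumerate(tagged) if tag == pos}
    tokens = ["[+]" if i in hidden else word for i, (word, _) in enumerate(tagged)]
    return tokens, {i: tagged[i][0] for i in sorted(hidden)}


# Hypothetical pre-tagged sentence.
sentence = [
    ("The", "DET"), ("very", "ADV"), ("old", "ADJ"),
    ("bridge", "NOUN"), ("quickly", "ADV"), ("collapsed", "VERB"),
]
tokens, expansions = compress(sentence, target_ratio=0.7)
```

With a target ratio of 0.7, hiding the two adverbs leaves four of six tokens visible, so the loop stops before touching the adjective. A lower target would go on to hide "old" as well.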
Abstract:
A technique is provided for teaching expository writing using a system that gives an objective, reader-centric microanalysis of the information a writer has conveyed to a virtual reader. The technique uses a theory of discourse analysis such as the Linguistic Discourse Model. Using the technique, a text is segmented into discrete units of meaning as defined by the selected theoretical model. Student analysis and understanding are facilitated by assigning types to the discrete units of meaning and by linking the units into a discourse tree under the constraints imposed by the selected theory. A virtual, or objective, reader-centric summary of the information actually conveyed by the text is then compared to the concepts the writer designated as important, and the results are conveyed as feedback to the writer.
Abstract:
Techniques for dynamic personalized reading instruction at the word and sentence level are provided by determining word recognition level and learning gradient information for a user. Comprehension aids are associated with words classified by word recognition level and stored. Word recognition errors are determined, comprehension aids are presented, and the word recognition level is adjusted based on the determined word recognition errors, the learning gradient and the current word recognition level. For sentence-level dynamic personalized reading instruction, personalization information, reading level and learning gradient are determined and a personalized grammatical tunable text summary is generated. Based on the personalized grammatical tunable text summary, comprehension questions are generated and displayed. Based on comprehension responses, learning gradient and personalization information, the reading level is adjusted. Personalized reading instruction is provided by selectively changing display attributes of more salient information to help a user identify the important information in the sentence and to maintain fluid reading.
Abstract:
A technique for teaching second-language writing skills is provided. A user text is analyzed and compared to a writing culture, and the differences between the user text and the writing culture are identified. The identified differences are compared to linguistic flaw information previously compiled from other second language texts written by first language writers in the writing culture. Identified differences that are found in the linguistic flaw information store are used to retrieve contextually relevant corrections and comments for addressing the identified flaws based on the first and second language and writing culture.