Abstract:
A speech system recognizes words from a spoken phrase that conform to checksum constraints. Grammar rules are applied to hypothesize words according to the checksum constraints. The checksum associated with the phrase is thus inherent in the grammar. Sentences which do not meet a predetermined checksum constraint are not valid under the grammar rules and are therefore inherently rejected. The checksum constraints result in increased recognition accuracy.
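One way to picture a checksum that is "inherent in the grammar" is a finite-state grammar whose states are running checksum values, so that only sequences ending in the accepting checksum can be derived at all. The sketch below is illustrative only: the checksum rule (digit sum equal to 0 modulo 10), the per-position score tables, and the function name `best_valid_sequence` are assumptions, not details taken from the abstract.

```python
from typing import Dict, List, Optional, Tuple

MODULUS = 10  # assumed checksum rule: the digit sum must be 0 modulo 10


def best_valid_sequence(
    scores: List[Dict[int, float]], modulus: int = MODULUS
) -> Optional[Tuple[List[int], float]]:
    """Return the best-scoring digit sequence that satisfies the checksum.

    scores[i][d] is a hypothetical log-score for hearing digit d at position i.
    Each grammar state is a running checksum, so a sequence that violates the
    checksum never reaches the accepting state 0 and is rejected implicitly.
    """
    # best[state] = (total score, digits so far) for that running checksum
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for position_scores in scores:
        following: Dict[int, Tuple[float, List[int]]] = {}
        for state, (total, digits) in best.items():
            for digit, score in position_scores.items():
                new_state = (state + digit) % modulus
                candidate = (total + score, digits + [digit])
                if new_state not in following or candidate[0] > following[new_state][0]:
                    following[new_state] = candidate
        best = following
    if 0 not in best:
        return None  # no hypothesis satisfies the checksum
    total, digits = best[0]
    return digits, total
```

In this toy setting, a locally best-scoring digit can be overridden by a lower-scoring one when that is the only way to satisfy the checksum, which is one plausible reading of how the constraint could improve accuracy.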
Abstract:
A method and system are provided for time aligning speech. Speech data is input representing speech signals from a speaker. An orthographic transcription is input including a plurality of words transcribed from the speech signals. A sentence model is generated indicating a selected order of the words in response to the orthographic transcription. In response to the orthographic transcription, word models are generated associated with respective ones of the words. The orthographic transcription is aligned with the speech data in response to the sentence model, to the word models and to the speech data.
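The alignment step can be sketched as a dynamic program over frames and words. This is a minimal illustration, not the patented method: it assumes the sentence model is simply the left-to-right word order of the transcription, and that the word models have already been reduced to a hypothetical table `costs[t][w]` giving the mismatch between speech frame `t` and word `w`.

```python
from typing import List, Tuple


def align(costs: List[List[float]]) -> List[Tuple[int, int]]:
    """Assign each frame to one word, in order, at minimum total cost.

    costs[t][w] is an assumed mismatch cost between frame t and word w.
    Returns one (first_frame, last_frame) pair per word, inclusive.
    """
    n_frames, n_words = len(costs), len(costs[0])
    if n_frames < n_words:
        raise ValueError("need at least one frame per word")
    inf = float("inf")
    # total[t][w]: cheapest cost of any alignment that puts frame t in word w
    total = [[inf] * n_words for _ in range(n_frames)]
    # stay[t][w]: True if frame t-1 belongs to the same word w on the best path
    stay = [[False] * n_words for _ in range(n_frames)]
    total[0][0] = costs[0][0]
    for t in range(1, n_frames):
        for w in range(n_words):
            same = total[t - 1][w]
            advance = total[t - 1][w - 1] if w > 0 else inf
            if same == inf and advance == inf:
                continue  # word w cannot be reached by frame t
            if same <= advance:
                total[t][w] = same + costs[t][w]
                stay[t][w] = True
            else:
                total[t][w] = advance + costs[t][w]
    # Trace back from the last frame of the last word to recover boundaries.
    starts, ends = [0] * n_words, [0] * n_words
    w = n_words - 1
    ends[w] = n_frames - 1
    for t in range(n_frames - 1, 0, -1):
        if not stay[t][w]:
            starts[w] = t
            w -= 1
            ends[w] = t - 1
    return list(zip(starts, ends))
```

For example, with five frames and two words, where the first two frames match word 0 and the rest match word 1, `align` returns `[(0, 1), (2, 4)]`.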
Abstract:
This is an automated speech recognition system, the system comprising: an input device for receiving voice signals; a means for computing the voice signals into stochastic RGDAGs and individual grammars; a search engine that directly processes the stochastic RGDAGs; a means for adding the individual grammars within the RGDAG; a means for replacing the individual grammars within the RGDAG; and a means for deleting the individual grammars within the RGDAG. The system can include a means for adding modified grammars within RGDAGs; a means for linking a terminal symbol from a first grammar to a start symbol in a second grammar; a means for finding a maximum depth of each symbol in the RGDAG; a means for determining if the system requires parse information for the grammar by looking for ancestor symbols; a means for specifying a plurality of RGs within the RGDAGs as starting points for the search engine; and a means for returning a parse through each of a plurality of RGs in the RGDAGs for a recognized utterance. Other devices, systems and methods are also disclosed.
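The "maximum depth of each symbol" step can be illustrated on a toy graph of grammars. The sketch below rests on assumptions not stated in the abstract: each grammar is a named node, an edge means that a terminal of one grammar links to the start symbol of another, and depth is the length of the longest chain of links from a chosen starting grammar. The dictionary layout and the name `max_depths` are hypothetical.

```python
from typing import Dict, FrozenSet, List


def max_depths(references: Dict[str, List[str]], roots: List[str]) -> Dict[str, int]:
    """Return the longest link distance from any root to each reachable grammar.

    references[name] lists the grammars that grammar `name` links to.
    Raises ValueError if the links contain a cycle, since the structure is
    assumed to be a directed acyclic graph.
    """
    depth: Dict[str, int] = {}

    def visit(name: str, current: int, path: FrozenSet[str]) -> None:
        if name in path:
            raise ValueError(f"cycle through {name!r}")
        if depth.get(name, -1) >= current:
            return  # already reached at this depth or deeper
        depth[name] = current
        for child in references.get(name, []):
            visit(child, current + 1, path | {name})

    for root in roots:
        visit(root, 0, frozenset())
    return depth
```

With `{"S": ["DATE", "NUMBER"], "DATE": ["NUMBER"], "NUMBER": []}` and root `"S"`, the grammar `NUMBER` gets depth 2 because the longest chain reaches it through `DATE`.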
Abstract:
A system and method are provided herein to support text and data entry for computer applications and the collection, processing, storage, and display of associated text, audio, image, video, and related data.
Abstract:
This is a voice activated hypermedia system using grammatical metadata, the system comprising: a speech user agent; a browsing module; and an information resource. The system may include: embedded intelligence in a hypermedia source; a means for processing the actions of a user based on the embedded intelligence; and a means for returning a result of the actions to the user. In addition, the hypermedia source may be an HTML page or an instructional module for communicating allowed actions by a user. The system may also include embedded intelligence as a grammar or reference to a grammar. The grammar may be dynamically added to a speech recognizer. In addition, the actions can come from a speech recognizer. Furthermore, the system may include voice activated hypermedia links and intelligent modules that process information from the information resources for allowing actions from the user. Other devices, systems and methods are also disclosed.
Abstract:
A grammar learning aid is implemented with a processor and read only memory programmed in a manner that minimizes memory space and maximizes effective teaching. The programming is based on a unification grammar, in which sentences are determined by their features. The programming is implemented with logic programming in which predicates represent data, and in which logic variables permit features to be unified. This feature-based grammar approach permits an erroneous answer to be associated with a feature and that feature used to generate a similar device.
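Feature unification, and the way a failed unification can point at the feature behind a wrong answer, can be sketched with flat feature sets. This is a simplified illustration in Python rather than the logic programming the abstract describes; the feature names and the helpers `unify` and `clashes` are hypothetical.

```python
from typing import Dict, List, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature sets, or return None if any shared feature disagrees."""
    merged = dict(a)
    for name, value in b.items():
        if merged.get(name, value) != value:
            return None  # the same feature carries two different values
        merged[name] = value
    return merged


def clashes(a: Features, b: Features) -> List[str]:
    """List the features on which two feature sets disagree."""
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])
```

For a singular subject `{"number": "sg"}` and a plural verb `{"number": "pl"}`, `unify` fails and `clashes` reports `["number"]`, which is the kind of feature a teaching aid could use to pick a follow-up exercise.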
Abstract:
A text-to-pronunciation system (11) includes a large training set of word pronunciations (19) and an extractor for extracting language specific information from the training set to produce pronunciations for words not in its training set. A learner (13) forms pronunciation guesses for words in the training set and finds a transformation rule that improves the guesses. A rule applier (15) applies the transformation rule found to the guesses. The learner (13) repeatedly finds another rule and the rule applier (15) applies the new rule, so as to find the rules that improve the guesses the most.
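The learn-and-apply loop can be sketched as greedy rule learning on toy data. The sketch makes several assumptions that the abstract does not state: letters and phones are aligned one to one, the initial guess is a fixed per-letter table, and a rule has the hypothetical form (letter, next letter, phone to replace, replacement phone), with `#` marking the end of a word.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str, str, str]  # (letter, next letter, from phone, to phone)


def context(word: str, i: int) -> str:
    """Return the letter after position i, or '#' at the end of the word."""
    return word[i + 1] if i + 1 < len(word) else "#"


def apply_rule(rule: Rule, word: str, guess: List[str]) -> List[str]:
    """Rewrite every phone in the guess that matches the rule's context."""
    letter, nxt, src, dst = rule
    return [
        dst if (ch == letter and context(word, i) == nxt and guess[i] == src) else guess[i]
        for i, ch in enumerate(word)
    ]


def count_errors(guesses: List[List[str]], truths: List[List[str]]) -> int:
    return sum(g != t for guess, truth in zip(guesses, truths) for g, t in zip(guess, truth))


def learn(words: List[str], truths: List[List[str]],
          initial: Dict[str, str], max_rules: int = 10) -> List[Rule]:
    """Greedily collect the rules that reduce training errors the most."""
    guesses = [[initial[ch] for ch in word] for word in words]
    learned: List[Rule] = []
    for _ in range(max_rules):
        # Propose one candidate rule per wrongly guessed phone.
        candidates = {
            (ch, context(w, i), g[i], t[i])
            for w, g, t in zip(words, guesses, truths)
            for i, ch in enumerate(w) if g[i] != t[i]
        }
        best_rule: Optional[Rule] = None
        best_errors = count_errors(guesses, truths)
        for rule in sorted(candidates):
            trial = [apply_rule(rule, w, g) for w, g in zip(words, guesses)]
            trial_errors = count_errors(trial, truths)
            if trial_errors < best_errors:
                best_rule, best_errors = rule, trial_errors
        if best_rule is None:
            break  # no remaining rule improves the guesses
        guesses = [apply_rule(best_rule, w, g) for w, g in zip(words, guesses)]
        learned.append(best_rule)
    return learned
```

The learned rules can then be applied, in order, to the initial guess for a word outside the training set.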
Abstract:
A method for parsing natural languages uses a grammar and a lexicon. A knowledge base may be used to define elements in the lexicon. A processor receives single words input by a user and adds them to a sentence under construction. Valid next words are predicted after each received input word. The preferred system has two major components: a parser and a predictor. The predictor accesses only the lexicon and the knowledge base, if one is used, to determine the valid next input words. The parser constructs sentences which are valid according to the grammar out of words accepted by the predictor.
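The split between parser and predictor can be sketched on a toy language. This is only one possible reading of the abstract: here the parser side consults the grammar to find which word categories may come next, and the predictor side consults only the lexicon to turn those categories into words. The grammar format, the lexicon entries, and both function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed toy grammar: every valid sentence is one of these category sequences.
GRAMMAR: List[Tuple[str, ...]] = [
    ("det", "noun", "verb"),
    ("det", "adj", "noun", "verb"),
]

# Assumed toy lexicon: each word belongs to exactly one category.
LEXICON: Dict[str, str] = {
    "the": "det", "a": "det", "big": "adj",
    "dog": "noun", "cat": "noun", "runs": "verb",
}


def next_categories(words: List[str], grammar: List[Tuple[str, ...]],
                    lexicon: Dict[str, str]) -> Set[str]:
    """Parser side: categories that may follow the sentence built so far."""
    prefix = tuple(lexicon[w] for w in words)
    n = len(prefix)
    return {p[n] for p in grammar if len(p) > n and p[:n] == prefix}


def predict_words(categories: Set[str], lexicon: Dict[str, str]) -> List[str]:
    """Predictor side: look up, in the lexicon only, words of those categories."""
    return sorted(word for word, cat in lexicon.items() if cat in categories)
```

After the user enters "the", the predicted next words are "big", "cat" and "dog"; after "the dog runs" the list is empty because the sentence is complete.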