Abstract:
An apparatus is provided for detecting the presence of speech within an input speech signal. Speech is detected by treating the average frame energy of an input speech signal as a sampled signal and looking for modulations within the sampled signal that are characteristic of speech.
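The idea of treating frame energy as a sampled signal can be sketched as follows. This is only an illustration, not the apparatus described in the abstract: the 2 to 8 Hz modulation band, the 0.5 decision threshold, and all function names are assumptions chosen for the example.

```python
import math

def frame_energies(samples, frame_len):
    """Average energy of each non-overlapping frame of frame_len samples."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def modulation_ratio(energies, frame_rate, lo_hz=2.0, hi_hz=8.0):
    """Fraction of the mean-removed energy contour's power lying in [lo_hz, hi_hz].

    The band limits are assumed values, not taken from the abstract.
    """
    n = len(energies)
    if n == 0:
        return 0.0
    mean = sum(energies) / n
    centered = [e - mean for e in energies]
    band = total = 0.0
    # Naive DFT of the energy contour (the contour is the "sampled signal").
    for k in range(1, n // 2 + 1):
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        power = re * re + im * im
        total += power
        if lo_hz <= k * frame_rate / n <= hi_hz:
            band += power
    # A flat (or silent) contour carries no meaningful modulation.
    if total <= 1e-9 * (n * mean) ** 2:
        return 0.0
    return band / total

def looks_like_speech(samples, sample_rate, frame_len, threshold=0.5):
    """Hypothetical decision rule: most contour power sits in the assumed band."""
    energies = frame_energies(samples, frame_len)
    return modulation_ratio(energies, sample_rate / frame_len) > threshold
```

With 8 kHz audio and 80-sample frames, the energy contour is sampled at 100 Hz, so a tone whose amplitude varies at about 4 Hz is flagged while a steady tone is not.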
Abstract:
A method and apparatus for enhancing the performance of speech recognition by adaptively changing a process of determining the final, recognized word depending on a user's selection in a list of alternative words represented by a result of speech recognition. A speech recognition method comprising: inputting speech uttered by a user; recognizing the input speech and creating a predetermined number of alternative words to be recognized in the order of similarity; and displaying a list of alternative words arranged in a predetermined order and determining an alternative word that a cursor currently indicates as the final, recognized word if a user's selection from the list of alternative words has not been changed within a predetermined standby time.
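The standby-time rule can be illustrated with a small simulation. This is a minimal sketch under assumptions: cursor movements are supplied as a time-sorted list of `(time, index)` pairs instead of live user input, the cursor starts on the first (most similar) alternative, and the function name is hypothetical.

```python
def finalize_selection(alternatives, cursor_moves, standby, start_time=0.0):
    """Return (final_word, time_finalized).

    alternatives: words in display order.
    cursor_moves: (time, index) pairs sorted by time, simulating the user.
    standby: how long the selection must stay unchanged before it is final.
    """
    cursor, last_change = 0, start_time
    for t, idx in cursor_moves:
        if t - last_change >= standby:
            break  # the standby time elapsed before this move; it comes too late
        cursor, last_change = idx, t
    return alternatives[cursor], last_change + standby
```

For example, with alternatives `["call", "cold", "coal"]`, a 2-second standby time, and a single move to index 1 at time 1.0, the word "cold" becomes final at time 3.0.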
Abstract:
A method of confirming the establishment of a voice connection, such as a VoIP connection, between first and second end stations coupled to a packet switched communications network. The voice connection is used to transfer an audible request from the first end station to the second end station, asking the user of the second end station to generate a predetermined vocal response. The first end station compares any response from the second end station to the predetermined vocal response. The connection is determined to be established in response to a successful comparison. The predetermined vocal response includes a predetermined speech sequence comprising characters, a word or words, and a speech recognition procedure is applied to the received response to determine the presence of any speech sequence for comparison with the predetermined sequence. If a fault is detected, an alternative connection is established to execute a process to correct the fault.
Abstract:
In this invention, the vocabulary size of a speech recognizer for a large task is reduced by providing a recognizer only for the most common vocabulary items. Uncommon items are handled by providing aliases from the common items. This allows accuracy to remain high while still allowing uncommon items to be recognized when necessary.
Abstract:
A system and method are provided for a speech recognition system application program interface (API). The system and method enable the application programmer to generate multiple grammars and voice channels, such that the audio data in any voice channel may be decoded utilizing any active grammar. The system and method also enable the dynamic updating of grammars without reloading or rebooting the system. Additionally, a grammar can be implemented to include multiple grammars having multiple concepts. Still further, each concept can be implemented to include multiple phrases, and the system and method are configured to decode flexible phrase formats.
Abstract:
The present invention provides an audio analysis intelligence tool that provides ad-hoc search capabilities using spoken words as an organized data form. The present invention provides an SQL-like interface to process and search audio data and combine it with other traditional data forms.
Abstract:
A method for storing acoustic information is described, characterized in that the information to be stored is combined into groups, and each group of information to be stored is assigned a group identifier characterizing that particular group. Also described is a method for selecting information stored by the method according to the present invention, characterized in that a particular group of information is selected after input of a group identifier, preferably by voice input via a microphone. The present invention permits particularly rapid retrieval and selection of voice information stored in a voice memory.
Abstract:
A multiple pass speech recognition method includes a first pass and a second pass. The first pass recognizes an input speech signal to generate a first pass result. The second pass generates a first grammar having a portion set to match a first part of the input speech signal, based upon the context of the first pass result, and generates a second pass result. The method may further include a third pass grammar limiting the second part of the input speech signal to the second pass result. The third pass grammar includes a model corresponding to the first part of the input speech signal and varying within the second pass result. The third pass compares the first part of the input speech signal to the model while limiting the second part of the input speech signal to the second pass result.
Abstract:
A voice command identifier for a voice recognition system is disclosed. In one aspect of the invention, the voice command identifier can selectively identify and recognize a user voice command received along with the background sound generated from the speaker of a device being controlled.
Abstract:
The invention provides a method of speech recognition comprising the steps of receiving a signal comprising one or more spoken words, extracting a spoken word from the signal using a Hidden Markov Model, passing the spoken word to a plurality of word models, one or more of the word models based on a Hidden Markov Model, determining the word model most likely to represent the spoken word, and outputting the word model representing the spoken word. The invention also provides a related speech recognition system and a speech recognition computer program.
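The step of determining the most likely word model can be sketched with the standard forward algorithm for discrete-observation Hidden Markov Models. This is an illustration only: it covers the model-selection step, not the word-extraction step, and the model layout `(start, trans, emit)` and the function names are assumptions made for this example. The observation sequence is assumed to be non-empty.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    start[s]: initial probability of state s.
    trans[p][s]: probability of moving from state p to state s.
    emit[s][o]: probability that state s emits symbol o.
    """
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in states) * emit[s][o]
            for s in states
        ]
    return sum(alpha)

def best_word(obs, models):
    """Return the key of the word model that gives obs the highest likelihood.

    models: dict mapping a word to its (start, trans, emit) tuple.
    """
    return max(models, key=lambda w: forward_likelihood(obs, *models[w]))
```

For long observation sequences the raw probabilities underflow, so a practical version would work with scaled or log-domain values.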