Abstract:
An advanced telecommunications system is provided for recognizing spoken commands over a cellular telephone, satellite telephone, or personal communications network. In the cellular application, for example, a Speech Recognition System connects to a cellular telecommunications switch either internally or as an external peripheral. The Speech Recognition System includes an administrative subsystem, a call processing subsystem, a speaker-dependent recognition subsystem, a speaker-independent recognition subsystem, and a data storage subsystem. The Speech Recognition System also increases the efficiency of the cellular telephone network by integrating with the switch or switches as a shared resource. The administrative subsystem keeps statistical logs of pertinent call information. Pre-recorded instructional messages are stored in the memory of the call processing subsystem to inform the user of his or her progress in using the system. The speaker-independent recognition subsystem lets the user interact with the system through non-user-specific functions. User-specific functions are controlled by the speaker-dependent recognition subsystem. User-specific attributes collected by the recognition subsystems are stored in the data storage subsystem.
Abstract:
Speech recognition is improved by using reference pattern templates that have an added noise signal (noise floor) to avoid LPC high-gain synthesizer instability at low signal levels. In addition, input signal frames are one-half the length of reference frames, which cuts the number of dynamic time warp computation steps almost in half.
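The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the helper names, the uniform noise model, the frame lengths, and the reading that the saving comes from a smaller grid of frame pairs are all assumptions made for illustration.

```python
import random


def add_noise_floor(samples, noise_amplitude=1e-3, seed=0):
    """Return a copy of `samples` with low-level uniform noise mixed in.

    Assumption: uniform noise stands in for whatever noise signal the
    reference templates actually use.
    """
    rng = random.Random(seed)
    return [s + rng.uniform(-noise_amplitude, noise_amplitude) for s in samples]


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def dtw_grid_cells(input_samples, reference_samples,
                   input_frame_len, reference_frame_len):
    """Count the (input frame, reference frame) pairs in a full DTW grid."""
    return (input_samples // input_frame_len) * (reference_samples // reference_frame_len)


silence = [0.0] * 160
floored = add_noise_floor(silence)

# A silent template has zero energy in every frame; with the noise floor
# added, every frame has a small non-zero energy.
silent_energies = frame_energies(silence, 80)
floored_energies = frame_energies(floored, 80)

# Hypothetical sizes: 1600 samples each, input frames of 80 samples.
equal_frames = dtw_grid_cells(1600, 1600, 80, 80)
half_length_input = dtw_grid_cells(1600, 1600, 80, 160)
```

Under these assumed sizes, doubling the reference frame length relative to the input frame length halves the number of frame pairs a full grid would visit.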
Abstract:
The present invention comprises a method for reducing the database requirements of speaker-independent recognition systems. The method involves digitally processing a plurality of recorded utterances from a first database of digitally recorded spoken utterances. The previously recorded utterances are processed to create a second database of modified utterances, and the first and second databases are then combined to form an expanded database from which recognition vocabulary tables may be generated.
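The expansion step described above can be sketched in Python as follows. The abstract does not say which digital processing is applied, so the two transforms here (a gain change and a linear-interpolation speed change) and all function names are hypothetical stand-ins.

```python
def scale_gain(samples, gain):
    """Hypothetical transform: multiply every sample by `gain`."""
    return [s * gain for s in samples]


def change_speed(samples, rate):
    """Hypothetical transform: resample by linear interpolation.

    rate > 1 shortens the utterance, rate < 1 lengthens it.
    """
    n_out = max(1, int(len(samples) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def expand_database(first_db, transforms):
    """Build a second database of modified utterances and append it to the first.

    `first_db` is a list of (label, samples) pairs.
    """
    second_db = [
        (label, transform(samples))
        for label, samples in first_db
        for transform in transforms
    ]
    return first_db + second_db


first_db = [("yes", [0.0, 1.0, 0.0, -1.0])]
transforms = [
    lambda s: scale_gain(s, 0.5),
    lambda s: change_speed(s, 2.0),
]
expanded = expand_database(first_db, transforms)
```

Each original utterance contributes one entry per transform to the second database, so the expanded database here holds three entries for the single recorded "yes".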
Abstract:
The present invention describes a system and method for enabling a caller to obtain access to services via a telephone network by entering a spoken first character string having a plurality of digits. Preferably, the method includes the steps of prompting the caller to speak the first character string beginning with a first digit and ending with a last digit thereof, recognizing each spoken digit of the first character string using a speaker-independent voice recognition algorithm, and then following entry of the last digit of the first string, initially verifying the caller's identity using a voice verification algorithm. After initial verification, the caller is again prompted to enter a second character string, which must also be recognized before access is effected.
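The order of the steps above can be sketched in Python as a control flow. The recognizer and verifier are passed in as callables because the abstract does not specify them. `recognize_digit`, `verify_voice`, and the string-based "utterances" in the usage lines are hypothetical placeholders, not real algorithms.

```python
def recognize_string(utterances, recognize_digit):
    """Recognize each spoken digit in order; return None if any digit fails."""
    digits = []
    for utterance in utterances:
        digit = recognize_digit(utterance)
        if digit is None:
            return None
        digits.append(digit)
    return "".join(digits)


def grant_access(first_utterances, second_utterances, recognize_digit, verify_voice):
    """Two-stage flow: recognize the first string, verify the voice, then
    require that the second string is also recognized."""
    first_string = recognize_string(first_utterances, recognize_digit)
    if first_string is None:
        return False
    # Initial verification happens only after the last digit has been entered.
    if not verify_voice(first_string, first_utterances):
        return False
    second_string = recognize_string(second_utterances, recognize_digit)
    return second_string is not None


# Stand-in recognizer and verifier, for illustration only.
def fake_recognize_digit(utterance):
    return utterance if utterance.isdigit() else None


def fake_verify_voice(claimed_string, utterances):
    return claimed_string == "1234"


accepted = grant_access(["1", "2", "3", "4"], ["9", "9"],
                        fake_recognize_digit, fake_verify_voice)
rejected_voice = grant_access(["4", "3", "2", "1"], ["9", "9"],
                              fake_recognize_digit, fake_verify_voice)
rejected_second = grant_access(["1", "2", "3", "4"], ["9", "mumble"],
                               fake_recognize_digit, fake_verify_voice)
```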
Abstract:
Speaker-independent word recognition is performed, based on a small acoustically distinct vocabulary, with minimal hardware requirements. After a simple preconditioning filter, the zero crossing intervals of the input speech are measured and sorted by duration, to provide a rough measure of the frequency distribution within each input frame. The distribution of zero crossing intervals is transformed into a binary feature vector, which is compared with each reference template using a modified Hamming distance measure. A dynamic time warping algorithm is used to permit recognition at various speaking rates and to economize on reference template storage. A mask vector accompanying each reference vector in a template is used to ignore insignificant (or speaker-dependent) features of the words detected.
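The per-frame feature pipeline described above can be sketched in Python. The abstract gives no numeric details, so the bin edges, the count threshold, and the masked form of the Hamming distance are assumptions. The preconditioning filter and the dynamic time warping stage are left out.

```python
def zero_crossing_intervals(frame):
    """Distances, in samples, between successive sign changes in `frame`."""
    crossings = [
        i for i in range(1, len(frame))
        if (frame[i - 1] < 0) != (frame[i] < 0)
    ]
    return [b - a for a, b in zip(crossings, crossings[1:])]


def binary_features(intervals, bin_edges, threshold=1):
    """Sort intervals into duration bins and mark each bin 1 if it holds
    at least `threshold` intervals, else 0. Bin edges are assumed values."""
    counts = [0] * (len(bin_edges) - 1)
    for interval in intervals:
        for k in range(len(counts)):
            if bin_edges[k] <= interval < bin_edges[k + 1]:
                counts[k] += 1
                break
    return [1 if c >= threshold else 0 for c in counts]


def masked_hamming(features, reference, mask):
    """Count differing bits, ignoring positions where the mask bit is 0."""
    return sum(
        1 for f, r, m in zip(features, reference, mask)
        if m and f != r
    )


# A square wave with a period of 4 samples crosses zero every 2 samples.
frame = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
intervals = zero_crossing_intervals(frame)
features = binary_features(intervals, bin_edges=[1, 3, 6, 12])

reference = [1, 1, 0]
distance_all_bits = masked_hamming(features, reference, mask=[1, 1, 1])
distance_masked = masked_hamming(features, reference, mask=[1, 0, 1])
```

Short intervals correspond to high-frequency content, so the binary vector acts as a coarse spectral signature. Masking the middle bit removes the only mismatch with the reference in this example.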
Abstract:
Voice activated dialing is described for use in a mobile telecommunications system. A voice input is received from a wireless network user. A telephone number to be dialed is determined by using speaker independent speech recognition to interpret a string of spoken digits in the voice input to determine the telephone number, or using speaker dependent speech recognition to interpret a spoken word in the voice input to determine the telephone number. A telephone call is then initiated by dialing the telephone number.
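The choice between the two recognition paths can be sketched in Python. The sketch works on already-recognized word tokens, so the recognizers themselves are not modeled. The digit vocabulary, the `personal_directory` mapping, and the function name are hypothetical.

```python
# Assumed digit vocabulary for the speaker-independent path.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def resolve_number(tokens, personal_directory):
    """Return the telephone number to dial, or None if nothing matches.

    A string of spoken digits is converted directly (standing in for the
    speaker-independent path); any other input is looked up in the user's
    own directory (standing in for the speaker-dependent path).
    """
    if tokens and all(t in DIGIT_WORDS for t in tokens):
        return "".join(DIGIT_WORDS[t] for t in tokens)
    return personal_directory.get(" ".join(tokens))


directory = {"home": "5551234"}
from_digits = resolve_number(["five", "five", "five", "oh", "one"], directory)
from_name = resolve_number(["home"], directory)
unknown = resolve_number(["office"], directory)
```

A caller of this helper would start the call only when the result is not `None`.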
Abstract:
A technique for improving the recognition accuracy of a speech recognizer includes deploying the speech recognizer, wherein live input data received by the recognizer serves as input to a given speaker-independent adaptation algorithm associated with the speech recognizer. The algorithm enhances the accuracy of the speech recognizer without human supervision. This technique is particularly suitable for adapting a large-vocabulary ASR engine.