Abstract:
A system for automatic subcharacter unit and lexicon generation for handwriting recognition comprises a processing unit, a handwriting input device, and a memory wherein a segmentation unit, a subcharacter generation unit, a lexicon unit, and a modeling unit reside. The segmentation unit generates feature vectors corresponding to sample characters. The subcharacter generation unit clusters feature vectors and assigns each feature vector associated with a given cluster an identical label. The lexicon unit constructs a lexical graph for each character in a character set. The modeling unit generates a Hidden Markov Model for each set of identically-labeled feature vectors. After a first set of lexical graphs and Hidden Markov Models has been created, the subcharacter generation unit determines for each feature vector which Hidden Markov Model produces the highest likelihood value. The subcharacter generation unit relabels each feature vector according to the highest likelihood value, after which the lexicon unit and the modeling unit generate a new set of lexical graphs and a new set of Hidden Markov Models, respectively. The feature vector relabeling, lexicon generation, and Hidden Markov Model generation are performed iteratively until a convergence criterion is met. The final set of Hidden Markov Model parameters provides a set of subcharacter units for handwriting recognition, where the subcharacter units are derived from information inherent in the sample characters themselves.
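The relabel-and-retrain loop described above can be illustrated with a minimal sketch. It is not the patented implementation: a single diagonal Gaussian per label stands in for each Hidden Markov Model, the lexical-graph step is omitted, and the convergence criterion is assumed to be "no label changes". The function names, the `min_var` variance floor, and the `max_iters` cap are all hypothetical.

```python
import math


def fit_gaussian(vectors, min_var=0.25):
    """Fit a diagonal Gaussian (stand-in for an HMM) to a set of vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance so that tiny clusters do not become overconfident.
    var = [
        max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, min_var)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(vector, model):
    """Log-likelihood of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
        for x, m, s in zip(vector, mean, var)
    )


def relabel_until_stable(vectors, labels, max_iters=50):
    """Alternate model fitting and relabeling until no label changes."""
    labels = list(labels)
    models = {}
    for _ in range(max_iters):
        # Modeling step: one model per set of identically labeled vectors.
        models = {
            label: fit_gaussian([v for v, l in zip(vectors, labels) if l == label])
            for label in sorted(set(labels))
        }
        # Relabeling step: give each vector the label of its best-scoring model.
        new_labels = [
            max(models, key=lambda k: log_likelihood(v, models[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assumed convergence criterion
            break
        labels = new_labels
    return labels, models


vectors = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
initial = [0, 0, 0, 1, 1, 0]  # last vector starts with the wrong label
final_labels, final_models = relabel_until_stable(vectors, initial)
```

In this toy run the mislabeled vector moves to the cluster whose model scores it highest, after which the labels stop changing and the loop exits.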
Abstract:
A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.
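The candidate-evaluation flow described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the abstract does not give score scales, thresholds, or a failure limit, so `GOOD_THRESHOLD`, `BAD_THRESHOLD`, `AMBIGUITY_MARGIN`, `MAX_FAILURES`, the `Candidate` type, and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    command: str
    score: float  # hypothetical recognizer confidence in [0, 1]


# Hypothetical tuning values; the abstract does not specify any numbers.
GOOD_THRESHOLD = 0.80
BAD_THRESHOLD = 0.40
AMBIGUITY_MARGIN = 0.10
MAX_FAILURES = 3


def quality(score):
    """Classify a score as good, questionable, or bad."""
    if score >= GOOD_THRESHOLD:
        return "good"
    if score >= BAD_THRESHOLD:
        return "questionable"
    return "bad"


def next_action(candidates, failures_so_far, dtmf_detected=False):
    """Choose the next UI step from the recognizer's candidate results."""
    if dtmf_detected:
        # Subscriber pressed a predetermined key while the speech UI was active.
        return ("transfer_to_dtmf", None)
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    if not ranked or quality(ranked[0].score) == "bad":
        # Recognition failure: reprompt, or fall back after repeated failures.
        if failures_so_far + 1 >= MAX_FAILURES:
            return ("transfer_to_dtmf", None)
        return ("reprompt", None)
    best = ranked[0]
    # Ambiguous: the runner-up is plausible and scores close to the best.
    ambiguous = (
        len(ranked) > 1
        and quality(ranked[1].score) != "bad"
        and best.score - ranked[1].score < AMBIGUITY_MARGIN
    )
    if quality(best.score) == "questionable" or ambiguous:
        return ("confirm", best.command)
    return ("execute", best.command)
```

For example, a single high-scoring candidate is executed directly, while two close high-scoring candidates trigger a confirmation prompt for the best one.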
Abstract:
An apparatus for improving productivity of human reviewers of transcribed documents generated by media conversion systems includes a server/client network of computers, memories and file systems. The server receives and stores voice files created by users of the system. The server is configured for coupling to a speech-to-text media conversion system to receive converted text files of the audio voice files. The server analyzes the converted text files and routes the converted files to the appropriate reviewers according to an adaptive algorithm. The converted files are displayed on the assigned reviewer's screen at the reviewer's workstation. To aid the reviewer in pinpointing potential errors, the workstation displays different segments of the converted files in different colors to reflect different confidence levels of transcription accuracy. Portions of the original voice message that correspond to the potential errors are played back for the reviewer. The reviewers' workstations also perform productivity enhancing functions such as spelling and grammar checking. After the reviewer has made all the necessary corrections, the reviewed files are transmitted back to the server to be stored and accessed by the users. A user database in the server is also updated to store recurrent user-specific errors corrected by the reviewer. A language analysis system is also disposed to adaptively correct user-specific errors in future reviews according to the information in the user database.
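The confidence-based display described above might look like the following sketch. The abstract does not state how confidence is measured, which colors are used, or how playback is selected, so the `Segment` type, the thresholds, the color names, and the rule of replaying only the lowest-confidence segments are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical transcription confidence in [0, 1]
    start_s: float     # offset of the segment in the voice file, in seconds
    end_s: float


def color_for(confidence):
    """Map a confidence level to a display color (assumed thresholds)."""
    if confidence >= 0.9:
        return "black"
    if confidence >= 0.6:
        return "orange"
    return "red"


def review_plan(segments):
    """Return colored text for display and audio ranges worth replaying."""
    colored = [(s.text, color_for(s.confidence)) for s in segments]
    # Only the lowest-confidence segments are queued for audio playback.
    playback = [
        (s.start_s, s.end_s)
        for s in segments
        if color_for(s.confidence) == "red"
    ]
    return colored, playback


segments = [
    Segment("Please call", 0.97, 0.0, 0.8),
    Segment("Dr. Smyth", 0.45, 0.8, 1.6),
    Segment("tomorrow", 0.75, 1.6, 2.1),
]
colored, playback = review_plan(segments)
```

Here the reviewer would see the middle segment highlighted as a likely error and hear only that portion of the original message.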