System and method for automatic subcharacter unit and lexicon generation for handwriting recognition
    1.
    Granted invention patent (status: expired)

    Publication No.: US5757964A

    Publication date: 1998-05-26

    Application No.: US901989

    Filing date: 1997-07-29

    IPC classification: G06K9/62 G06K9/72

    CPC classification: G06K9/6297 G06K9/6255

    Abstract: A system for automatic subcharacter unit and lexicon generation for handwriting recognition comprises a processing unit, a handwriting input device, and a memory wherein a segmentation unit, a subcharacter generation unit, a lexicon unit, and a modeling unit reside. The segmentation unit generates feature vectors corresponding to sample characters. The subcharacter generation unit clusters feature vectors and assigns each feature vector associated with a given cluster an identical label. The lexicon unit constructs a lexical graph for each character in a character set. The modeling unit generates a Hidden Markov Model for each set of identically-labeled feature vectors. After a first set of lexical graphs and Hidden Markov Models have been created, the subcharacter generation unit determines for each feature vector which Hidden Markov Model produces a highest likelihood value. The subcharacter generation unit relabels each feature vector according to the highest likelihood value, after which the lexicon unit and the modeling unit generate a new set of lexical graphs and a new set of Hidden Markov Models, respectively. The feature vector relabeling, lexicon generation, and Hidden Markov Model generation are performed iteratively until a convergence criterion is met. The final set of Hidden Markov Model parameters provides a set of subcharacter units for handwriting recognition, where the subcharacter units are derived from information inherent in the sample characters themselves.
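The cluster, model, relabel, and repeat loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: a one-dimensional Gaussian stands in for each Hidden Markov Model, the initial clustering is a simple sort-and-slice, the "lexical graph" is reduced to a linear label sequence per character, and convergence is taken to mean that no label changes. The function name `train_subcharacter_units` and all parameters are hypothetical.

```python
import math


def fit_gaussian(points):
    """Fit a 1-D Gaussian; an assumed stand-in for a per-label HMM."""
    mean = sum(points) / len(points)
    var = sum((p - mean) ** 2 for p in points) / len(points)
    return mean, max(var, 1e-6)  # floor the variance so the density stays finite


def log_likelihood(x, model):
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def train_subcharacter_units(samples, num_units, max_iters=20):
    """samples maps each character to its per-segment feature values (hypothetical format)."""
    chars, features = [], []
    for ch, values in samples.items():
        for v in values:
            chars.append(ch)
            features.append(v)

    # Initial clustering (assumption): sort the features and cut into equal-size groups.
    order = sorted(range(len(features)), key=lambda i: features[i])
    labels = [0] * len(features)
    for rank, i in enumerate(order):
        labels[i] = rank * num_units // len(features)

    models = {}
    for _ in range(max_iters):
        # Modeling step: one model per set of identically labeled features.
        groups = {}
        for x, lab in zip(features, labels):
            groups.setdefault(lab, []).append(x)
        models = {lab: fit_gaussian(pts) for lab, pts in groups.items()}

        # Relabeling step: each feature takes the label of its most likely model.
        new_labels = [
            max(models, key=lambda lab: log_likelihood(x, models[lab]))
            for x in features
        ]
        if new_labels == labels:  # convergence criterion (assumption): labels are stable
            break
        labels = new_labels

    # Lexicon step: each character becomes the sequence of its segment labels.
    lexicon = {}
    for ch, lab in zip(chars, labels):
        lexicon.setdefault(ch, []).append(lab)
    return labels, models, lexicon
```

Calling `train_subcharacter_units({"A": [0.1, 5.0], "B": [0.2, 5.1, 0.15]}, num_units=2)` groups the small and large feature values into two units and spells each character as a sequence of unit labels.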

    Speech-responsive voice messaging system and method

    Publication No.: US06539078B1

    Publication date: 2003-03-25

    Application No.: US09504059

    Filing date: 2000-02-14

    IPC classification: H04M3/487

    CPC classification: G10L15/22

    Abstract: A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.
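The candidate-evaluation flow in this abstract (good, questionable, or bad results, a confirmation prompt for questionable or ambiguous results, and a DTMF fallback after repeated failures) might be sketched as follows. The abstract gives no numeric criteria, so the score thresholds, the ambiguity margin, the failure limit, and the function names `grade` and `decide` are all assumptions made for illustration.

```python
from enum import Enum


class Quality(Enum):
    GOOD = "good"
    QUESTIONABLE = "questionable"
    BAD = "bad"


# Hypothetical tuning values; the abstract does not specify any.
GOOD_THRESHOLD = 0.80
BAD_THRESHOLD = 0.40
AMBIGUITY_MARGIN = 0.10
MAX_FAILURES = 3


def grade(score):
    """Map a recognizer confidence score to a quality band."""
    if score >= GOOD_THRESHOLD:
        return Quality.GOOD
    if score < BAD_THRESHOLD:
        return Quality.BAD
    return Quality.QUESTIONABLE


def decide(candidates, failures):
    """candidates: list of (command, score) pairs. Returns (action, command, failures)."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked or grade(ranked[0][1]) is Quality.BAD:
        failures += 1
        if failures >= MAX_FAILURES:
            return ("transfer_to_dtmf", None, failures)  # repeated recognition failures
        return ("reprompt", None, failures)

    best_cmd, best_score = ranked[0]
    # Ambiguity (assumption): a usable runner-up scores within a small margin of the best.
    ambiguous = (
        len(ranked) > 1
        and grade(ranked[1][1]) is not Quality.BAD
        and best_score - ranked[1][1] < AMBIGUITY_MARGIN
    )
    if grade(best_score) is Quality.QUESTIONABLE or ambiguous:
        return ("confirm", best_cmd, failures)  # ask the subscriber to confirm
    return ("execute", best_cmd, 0)  # a clean recognition resets the failure count
```

The caller is expected to pass the returned failure count back in on the next utterance, so that consecutive bad results eventually trigger the transfer to the DTMF UI.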

    Speech-responsive voice messaging system and method

    Publication No.: US6094476A

    Publication date: 2000-07-25

    Application No.: US822034

    Filing date: 1997-03-24

    CPC classification: G10L15/22

    Abstract: A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.

    Method of and apparatus for improving productivity of human reviewers of automatically transcribed documents generated by media conversion systems
    4.
    Granted invention patent (status: in force)

    Publication No.: US07236932B1

    Publication date: 2007-06-26

    Application No.: US09659861

    Filing date: 2000-09-12

    Applicant: Kamil Grajski

    Inventor: Kamil Grajski

    IPC classification: G10L21/00

    CPC classification: G10L15/26 G10L2015/225

    Abstract: An apparatus for improving productivity of human reviewers of transcribed documents generated by media conversion systems includes a server/client network of computers, memories and file systems. The server receives and stores voice files created by users of the system. The server is configured for coupling to a speech-to-text media conversion system to receive converted text files of the audio voice files. The server analyzes the converted text files and routes the converted files to the appropriate reviewers according to an adaptive algorithm. The converted files are displayed on the assigned reviewer's screen at the reviewer's workstation. To aid the reviewer in pinpointing potential errors, the workstation displays different segments of the converted files in different colors to reflect different confidence levels of transcription accuracy. Portions of the original voice message that correspond to the potential errors are played back for the reviewer. The reviewers' workstations also perform productivity enhancing functions such as spelling and grammar checking. After the reviewer has made all the necessary corrections, the reviewed files are transmitted back to the server to be stored and accessed by the users. A user database in the server is also updated to store recurrent user-specific errors corrected by the reviewer. A language analysis system is also disposed to adaptively correct user-specific errors in future reviews according to the information in the user database.
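The confidence-based coloring of transcript segments described in this abstract can be illustrated with a short sketch. The abstract does not state how confidence maps to color, so the two thresholds, the three color names, and the function names `color_for` and `segment_by_confidence` are assumptions; the input format of (word, confidence) pairs is likewise hypothetical.

```python
# Hypothetical confidence bands; the abstract does not specify thresholds or colors.
HIGH_CONFIDENCE = 0.90
LOW_CONFIDENCE = 0.60


def color_for(confidence):
    """Pick a display color for a word based on its transcription confidence."""
    if confidence >= HIGH_CONFIDENCE:
        return "green"
    if confidence >= LOW_CONFIDENCE:
        return "yellow"
    return "red"


def segment_by_confidence(words):
    """Group adjacent words that share a color band into display segments.

    words: list of (text, confidence) pairs in transcript order.
    Returns a list of (color, segment_text) pairs.
    """
    segments = []
    for text, confidence in words:
        color = color_for(confidence)
        if segments and segments[-1][0] == color:
            segments[-1][1].append(text)  # extend the current segment
        else:
            segments.append((color, [text]))  # start a new segment
    return [(color, " ".join(parts)) for color, parts in segments]
```

A reviewer workstation built along these lines could then draw the reviewer's attention to the "red" segments first, for example by queuing the matching audio for playback.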

    Speech-responsive voice messaging system and method

    Publication No.: US06522726B1

    Publication date: 2003-02-18

    Application No.: US09503655

    Filing date: 2000-02-14

    IPC classification: H04M1/64

    CPC classification: G10L15/22

    Abstract: A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.

    Speech-responsive voice messaging system and method

    Publication No.: US06385304B1

    Publication date: 2002-05-07

    Application No.: US09503314

    Filing date: 2000-02-14

    IPC classification: H04M1/64

    CPC classification: G10L15/22

    Abstract: A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.

    Speech-responsive voice messaging system and method
    7.
    Granted invention patent (status: in force)

    Publication No.: US06377662B1

    Publication date: 2002-04-23

    Application No.: US09503409

    Filing date: 2000-02-14

    IPC classification: H04M3/487

    CPC classification: G10L15/22

    Abstract: A system and method for speech-responsive voice messaging, in which a Speech-Responsive Voice Messaging System (SRVMS) preferably provides a hierarchically-simple speech user interface (UI) that enables subscribers to use speech to specify commands such as mailboxes, passwords, and digits. The SRVMS generates and evaluates candidate results. The SRVMS invokes a speech UI navigation operation or a voice messaging operation according to the outcome of the evaluation of the candidate results. In the preferred embodiment, the SRVMS determines whether the candidate results are good, questionable, or bad; and whether two or more candidate results are ambiguous due to a likelihood that each such result could be a valid command. If the candidate results are questionable or ambiguous, an ambiguity resolution UI prompts the subscriber to confirm whether the best candidate result is what the subscriber intended. In response to repeated speech recognition failures, the SRVMS transfers the subscriber to a Dual Tone Multi Frequency (DTMF) UI. Transfer to the DTMF UI is also performed in response to detection of predetermined DTMF signals issued by the subscriber while the speech UI is in context. The SRVMS provides a logging unit and a reporting unit which operate in parallel with the speech UI, in a manner that is transparent to subscribers. The logging unit directs the selective logging of subscriber utterances, and the reporting unit selectively generates and maintains system performance statistics on multiple detail levels.
