Method and apparatus for improving speech recognition accuracy
    1.
    Type: Granted invention patent
    Status: Expired

    Publication number: US06675142B2

    Publication date: 2004-01-06

    Application number: US09960826

    Filing date: 2001-09-21

    IPC classification: G10L15/00

    CPC classification: G10L15/22 G10L2015/0638

    Abstract: A transcription system (100) includes a computer (102), a monitor (104), and a microphone (110). Via the microphone, a user of the system provides input speech that is received and transcribed (204) by the system. The system monitors (205) the accuracy of the transcribed speech during transcription. The system also determines (210) whether the accuracy of the transcribed speech is sufficient and, if not, automatically activates (214) a speech recognition improvement tool and alerts (212) the user that the tool has been activated.

    Method and apparatus for improving speech recognition accuracy
    2.
    Type: Granted invention patent
    Status: In force

    Publication number: US06370503B1

    Publication date: 2002-04-09

    Application number: US09345071

    Filing date: 1999-06-30

    IPC classification: G10L15/26

    CPC classification: G10L15/22 G10L2015/0638

    Abstract: A transcription system (100) includes a computer (102), a monitor (104), and a microphone (110). Via the microphone, a user of the system provides input speech that is received and transcribed (204) by the system. The system monitors (205) the accuracy of the transcribed speech during transcription. The system also determines (210) whether the accuracy of the transcribed speech is sufficient and, if not, automatically activates (214) a speech recognition improvement tool and alerts (212) the user that the tool has been activated. This tool could also be manually activated (206) by the user. The type of recognition problem is identified (216) by the user or automatically by the system, and the system provides (218) possible solution steps for enabling the user to adjust (219) system parameters or modify user behavior in order to alleviate the recognition problem. The system also provides the user the ability to test (222) the transcription process in order to determine whether the solution has improved the recognition accuracy.
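
The monitor, decide, activate, and alert steps in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how accuracy is measured, so this sketch assumes a rolling fraction of recently transcribed words that the user did not correct. The class name `AccuracyMonitor`, the threshold, and the window size are hypothetical and do not come from the patent.

```python
from collections import deque


class AccuracyMonitor:
    """Rolling accuracy estimate over the most recent words (assumed metric)."""

    def __init__(self, threshold=0.85, window=50):
        self.threshold = threshold          # hypothetical sufficiency threshold
        self.recent = deque(maxlen=window)  # True = word kept, False = word corrected
        self.tool_active = False

    def record(self, word_was_correct):
        """Log whether one transcribed word was left uncorrected by the user."""
        self.recent.append(bool(word_was_correct))

    def accuracy(self):
        """Fraction of recent words that were kept; 1.0 when nothing is logged."""
        if not self.recent:
            return 1.0
        return sum(self.recent) / len(self.recent)

    def activate_manually(self):
        """The user may also switch the improvement tool on directly."""
        self.tool_active = True

    def check(self):
        """Activate the tool and return an alert when accuracy is insufficient."""
        if not self.tool_active and self.accuracy() < self.threshold:
            self.tool_active = True
            return "Recognition accuracy is low; the improvement tool is now active."
        return None
```

Calling `check()` after each batch of `record()` calls returns an alert string only on the transition into the active state, so the user is alerted once per activation rather than on every word.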

    Transcription system for multiple speakers, using and establishing identification
    3.
    Type: Granted invention patent
    Status: In force

    Publication number: US06332122B1

    Publication date: 2001-12-18

    Application number: US09337392

    Filing date: 1999-06-23

    IPC classification: G10L11/00

    CPC classification: G10L17/00 G10L15/26

    Abstract: A method and apparatus for transcribing text from multiple speakers in a computer system having a speech recognition application. The system receives speech from one of a plurality of speakers through a single channel, assigns a speaker ID to the speaker, transcribes the speech into text, and associates the speaker ID with the speech and text. In order to detect a speaker change, the system monitors the speech input through the channel for a speaker change.

    Method and apparatus for correcting misinterpreted voice commands in a speech recognition system
    4.
    Type: Granted invention patent
    Status: In force

    Publication number: US06327566B1

    Publication date: 2001-12-04

    Application number: US09333698

    Filing date: 1999-06-16

    IPC classification: G10L15/04

    Abstract: An efficient method and system, particularly well-suited for correcting natural language understanding (NLU) commands, corrects spoken commands misinterpreted by a speech recognition system. The method involves a series of steps, including: receiving the spoken command from a user; parsing the command to identify a paraphrased command; displaying the paraphrased command; and accepting corrections of the paraphrased command from the user. The paraphrased command is segmented according to command language categories, which include a command action category, an action object category, and an action and/or object modifying category. The paraphrased command is displayed in a user interface window segmented into these command language categories. The user interface window also contains alternative commands for each segment of the paraphrased command.

    Method and apparatus for providing an event-based “What-Can-I-Say?” window
    5.
    Type: Granted invention patent
    Status: In force

    Publication number: US06308157B1

    Publication date: 2001-10-23

    Application number: US09328095

    Filing date: 1999-06-08

    IPC classification: G10L15/14

    CPC classification: G10L15/26 G10L2015/228

    Abstract: A method and system efficiently identifies voice commands for a user of a speech recognition system. The method involves a series of steps including: receiving input from a user; monitoring the computer system to log system events and ascertain a current system state; predicting a probable next event according to the current system state and logged events; and identifying acceptable voice commands to perform the next event. The system events include commands, system control activities, timed activities, and application activation. These events are statistically analyzed in light of the current system state to determine the probable next event. The voice commands for performing the probable next event are displayed to the user.
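
One way to picture the log, predict, and display steps in the abstract above is a frequency count of which event followed each system state. This Python sketch is only an illustration: the abstract does not specify the statistical analysis, so simple counting stands in for it. The names `NextEventPredictor`, `VOICE_COMMANDS`, and `commands_to_display`, and the example states and events, are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from an event to the voice commands that perform it.
VOICE_COMMANDS = {
    "save_file": ["save", "save document"],
    "print_file": ["print", "print document"],
}


class NextEventPredictor:
    """Counts which event followed each logged system state."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def log(self, state, next_event):
        """Record that next_event occurred while the system was in state."""
        self.counts[state][next_event] += 1

    def predict(self, state):
        """Return the most frequent next event for state, or None if unseen."""
        seen = self.counts.get(state)
        if not seen:
            return None
        return seen.most_common(1)[0][0]


def commands_to_display(predictor, state):
    """Voice commands for the probable next event, as a 'What can I say?' list."""
    event = predictor.predict(state)
    return VOICE_COMMANDS.get(event, [])
```

With two logged `save_file` events and one `print_file` event for the same state, `commands_to_display` would list the save commands for that state.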

    Method and apparatus for improving speech command recognition accuracy using event-based constraints
    6.
    Type: Granted invention patent
    Status: In force

    Publication number: US06345254B1

    Publication date: 2002-02-05

    Application number: US09321918

    Filing date: 1999-05-29

    IPC classification: G10L15/22

    Abstract: A method and system for improving the speech command recognition accuracy of a computer speech recognition system uses event-based constraints to recognize a spoken command. The constraints are system states and events, which include system activities, active applications, prior commands and an event queue. The method and system is performed by monitoring events and states of the computer system and receiving a processed command corresponding to the spoken command. The processed command is statistically analyzed in light of the system events and states as well as according to an acoustic model. The system then identifies a recognized command corresponding to the spoken command.
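
The abstract above says the processed command is analyzed in light of system events and states as well as an acoustic model, but it does not say how the two are combined. As a rough illustration only, the Python sketch below multiplies an acoustic score by a context-dependent prior and picks the best candidate. The function name `recognize`, the `floor` parameter, and the multiplication rule are assumptions, not the patented method.

```python
def recognize(acoustic_scores, context_prior, floor=0.01):
    """Pick the candidate command with the best combined score.

    acoustic_scores: dict of candidate command -> acoustic likelihood (assumed input)
    context_prior:   dict of candidate command -> probability given the current
                     system state and recent events (assumed input)
    floor:           prior used for commands the context model has never seen
    """
    if not acoustic_scores:
        return None
    return max(
        acoustic_scores,
        key=lambda cmd: acoustic_scores[cmd] * context_prior.get(cmd, floor),
    )
```

For example, if "close window" scores slightly higher acoustically but the event history makes "close file" far more likely, the combined score would favor "close file".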

    Speech recognition enrollment for non-readers and displayless devices
    7.
    Type: Granted invention patent
    Status: In force

    Publication number: US06324507B1

    Publication date: 2001-11-27

    Application number: US09248243

    Filing date: 1999-02-10

    IPC classification: G10L15/06

    CPC classification: G10L15/063 G10L2015/0638

    Abstract: A method for enrolling a user in a speech recognition system, without requiring reading, comprises the steps of: generating an audio user interface having an audible output and an audio input; audibly playing a text phrase; audibly prompting the user to speak the played phrase; repeating the steps of audibly prompting the user not to speak, audibly playing the phrase and audibly prompting the user to speak, for a plurality of further phrases; and, processing enrollment of the user based on the audibly prompted and subsequently spoken phrases. A graphical user interface can also be generated for: displaying text corresponding to the phrases and to the audible prompts; displaying a plurality of icons for user activation; and, selectively distinguishing different ones of the icons at different times by at least one of: color; shape; and, animation.

    Speech recognition correction for devices having limited or no display
    8.
    Type: Granted invention patent
    Status: In force

    Publication number: US07200555B1

    Publication date: 2007-04-03

    Application number: US09610061

    Filing date: 2000-07-05

    IPC classification: G10L15/26

    CPC classification: G10L15/22

    Abstract: A novel apparatus and method for correcting speech recognized text in a predominantly speech-only environment for use with a device having only a limited or no display device available. The method is preferably implemented by a machine readable storage mechanism having stored thereon a computer program, the method comprising the following steps. First, audio speech input can be received and speech-to-text converted to speech recognized text. Second, a first speech correction command for performing a correction operation on speech recognized text stored in a text buffer can be detected in the speech recognized text. Third, if a speech correction command is not detected in the speech recognized text, the speech recognized text can be added to the text buffer. Fourth, if a speech command is detected in the speech recognized text, the detected correction speech command can be performed on speech recognized text stored in the text buffer.
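
The four steps in the abstract above amount to a dispatch rule: recognized text is either a correction command to run against the text buffer, or dictation to append to it. The Python sketch below illustrates that rule under simplifying assumptions. The command phrases "delete last word" and "clear all", and every function name, are hypothetical examples rather than commands named in the patent.

```python
def delete_last_word(buffer):
    """Remove the most recently dictated word, if any."""
    if buffer:
        buffer.pop()


def clear_all(buffer):
    """Empty the text buffer."""
    buffer.clear()


# Hypothetical correction commands, keyed by their spoken form.
CORRECTION_COMMANDS = {
    "delete last word": delete_last_word,
    "clear all": clear_all,
}


def handle_recognized_text(text, buffer):
    """Run a correction command on the buffer, or append the text as dictation."""
    command = CORRECTION_COMMANDS.get(text.strip().lower())
    if command is not None:
        command(buffer)              # step four: perform the detected command
    else:
        buffer.extend(text.split())  # step three: add dictation to the buffer
    return buffer
```

This sketch treats an utterance as a command only when the whole utterance matches a command phrase. A real system would need a policy for commands embedded in longer dictation.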

    Method for guiding text-to-speech output timing using speech recognition markers
    9.
    Type: Granted invention patent
    Status: In force

    Publication number: US07010489B1

    Publication date: 2006-03-07

    Application number: US09521593

    Filing date: 2000-03-09

    IPC classification: G10L13/08

    CPC classification: G10L13/10

    Abstract: A method for guiding text-to-speech output timing with speech recognition markers can include the following steps. First, tokens can be retrieved in a TTS system. The tokens can include words, phrase markers, punctuation marks and meta-tags. Second, phrase markers can be identified among the retrieved tokens. Third, words can be identified among the retrieved tokens. Fourth, the TTS system can TTS play back the identified words. Finally, during the TTS playback of the words, the TTS system can pause in response to the identification of the phrase markers.
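
The token-handling steps in the abstract above can be sketched as a pass that turns a token stream into a playback plan: words are spoken and phrase markers become pauses. In this Python illustration the `(kind, value)` token representation, the function name `plan_playback`, and the pause length are assumptions. Punctuation marks and meta-tags are skipped here only to keep the sketch short.

```python
def plan_playback(tokens, pause_seconds=0.3):
    """Convert (kind, value) tokens into ("speak", word) and ("pause", seconds) steps.

    kind is one of "word", "phrase_marker", "punctuation", "meta_tag"
    (a hypothetical encoding of the token types the abstract lists).
    """
    plan = []
    for kind, value in tokens:
        if kind == "word":
            plan.append(("speak", value))
        elif kind == "phrase_marker":
            plan.append(("pause", pause_seconds))
        # Punctuation marks and meta-tags are ignored in this sketch.
    return plan
```

A playback loop could then walk the plan in order, sending each word to the synthesizer and sleeping for each pause.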

    Speech recognition enrollment for non-readers and displayless devices

    Publication number: US06560574B2

    Publication date: 2003-05-06

    Application number: US09897681

    Filing date: 2001-07-02

    IPC classification: G10L15/26

    CPC classification: G10L15/063 G10L2015/0638

    Abstract: A method for enrolling a user in a speech recognition system, without requiring reading, comprises the steps of: generating an audio user interface having an audible output and an audio input; audibly playing a text phrase; audibly prompting the user to speak the played phrase; repeating the steps of audibly prompting the user not to speak, audibly playing the phrase and audibly prompting the user to speak, for a plurality of further phrases; and, processing enrollment of the user based on the audibly prompted and subsequently spoken phrases. A graphical user interface can also be generated for: displaying text corresponding to the phrases and to the audible prompts; displaying a plurality of icons for user activation; and, selectively distinguishing different ones of the icons at different times by at least one of: color; shape; and, animation.