Selective Feedback For Text Recognition Systems
    1.
    Patent application
    Selective Feedback For Text Recognition Systems (status: in force)

    Publication No.: US20130080164A1

    Publication date: 2013-03-28

    Application No.: US13627744

    Filing date: 2012-09-26

    IPC classification: G10L15/26

    CPC classification: G06F17/273

    Abstract: This specification describes technologies relating to recognition of text in various media. In general, one aspect of the subject matter described in this specification can be embodied in methods that include receiving an input signal including data representing one or more words and passing the input signal to a text recognition system that generates a recognized text string based on the input signal. The methods may further include receiving the recognized text string from the text recognition system. The methods may further include presenting the recognized text string to a user and receiving a corrected text string based on input from the user. The methods may further include checking if an edit distance between the corrected text string and the recognized text string is below a threshold. If the edit distance is below the threshold, the corrected text string may be passed to the text recognition system for training purposes.

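The edit-distance check described in this abstract can be illustrated with a minimal sketch. It assumes a character-level Levenshtein distance and a threshold of 5; the abstract specifies neither, and the function names `edit_distance` and `select_for_training` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (an assumed choice of metric)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def select_for_training(recognized: str, corrected: str, threshold: int = 5) -> bool:
    """Return True when the correction is close enough to be used as feedback.

    The threshold value is hypothetical; the abstract only says the distance
    must be below some threshold.
    """
    return edit_distance(recognized, corrected) < threshold
```

Under these assumptions, a small correction such as "I sore a cat" to "I saw a cat" would be forwarded for training, while a correction that rewrites the whole string would be withheld, since it probably reflects a change of mind rather than a recognition error.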

    Interactive text editing
    2.
    Granted patent
    Interactive text editing (status: in force)

    Publication No.: US08290772B1

    Publication date: 2012-10-16

    Application No.: US13270927

    Filing date: 2011-10-11

    IPC classification: G10L15/00

    Abstract: A method for providing suggestions includes capturing audio that includes speech and receiving textual content from a speech recognition engine. The speech recognition engine performs speech recognition on the audio signal to obtain the textual content, which includes one or more passages. The method also includes receiving a selection of a portion of a first word in a passage in the textual content, wherein the passage includes multiple words, and retrieving a set of suggestions that can potentially replace the first word. At least one suggestion from the set of suggestions provides a multi-word suggestion for potentially replacing the first word. The method further includes displaying, on a display device, the set of suggestions, and highlighting a portion of the textual content, as displayed on the display device, for potentially changing to one of the suggestions from the set of suggestions.


    Interactive text editing
    3.
    Granted patent
    Interactive text editing (status: in force)

    Publication No.: US08538754B2

    Publication date: 2013-09-17

    Application No.: US13620213

    Filing date: 2012-09-14

    IPC classification: G10L15/26

    Abstract: A method for providing suggestions includes capturing audio that includes speech and receiving textual content from a speech recognition engine. The speech recognition engine performs speech recognition on the audio signal to obtain the textual content, which includes one or more passages. The method also includes receiving a selection of a portion of a first word in a passage in the textual content, wherein the passage includes multiple words, and retrieving a set of suggestions that can potentially replace the first word. At least one suggestion from the set of suggestions provides a multi-word suggestion for potentially replacing the first word. The method further includes displaying, on a display device, the set of suggestions, and highlighting a portion of the textual content, as displayed on the display device, for potentially changing to one of the suggestions from the set of suggestions.


    Interactive Text Editing
    4.
    Patent application
    Interactive Text Editing (status: in force)

    Publication No.: US20130085754A1

    Publication date: 2013-04-04

    Application No.: US13620213

    Filing date: 2012-09-14

    IPC classification: G10L15/26 G10L11/00

    Abstract: A method for providing suggestions includes capturing audio that includes speech and receiving textual content from a speech recognition engine. The speech recognition engine performs speech recognition on the audio signal to obtain the textual content, which includes one or more passages. The method also includes receiving a selection of a portion of a first word in a passage in the textual content, wherein the passage includes multiple words, and retrieving a set of suggestions that can potentially replace the first word. At least one suggestion from the set of suggestions provides a multi-word suggestion for potentially replacing the first word. The method further includes displaying, on a display device, the set of suggestions, and highlighting a portion of the textual content, as displayed on the display device, for potentially changing to one of the suggestions from the set of suggestions.


    Structuring verbal commands to allow concatenation in a voice interface in a mobile device
    5.
    Granted patent
    Structuring verbal commands to allow concatenation in a voice interface in a mobile device (status: in force)

    Publication No.: US08452602B1

    Publication date: 2013-05-28

    Application No.: US13621018

    Filing date: 2012-09-15

    IPC classification: G10L21/00 G10L15/04

    Abstract: A spoken utterance includes at least a first level of a multi-level command format, in which the first level identifies an application. The spoken utterance may also include a second level of the multi-level command format, in which the second level identifies an action. In response to receiving the spoken utterance at a computing device, a representation of the application identified by the first level is displayed on a display of the computing device. If the spoken utterance includes the second level of the multi-level command format, the action identified by the second level is initiated. If the spoken utterance does not include the second level of the multi-level command format, the computing device waits for a predetermined period of time and provides at least one of an audible or visual action prompt if the second level is not received within the predetermined period of time.

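The two-level command flow in this abstract can be sketched as follows. The command vocabulary in `COMMANDS`, the 3-second timeout, and the names `ParsedCommand`, `parse_utterance`, and `next_step` are all hypothetical; the abstract only describes the behavior, not an implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary: first level = application, second level = action.
COMMANDS = {
    "map": {"directions", "search"},
    "music": {"play", "pause"},
}


@dataclass
class ParsedCommand:
    application: str
    action: Optional[str]  # None when the utterance stops at the first level


def parse_utterance(utterance: str) -> Optional[ParsedCommand]:
    """Split an already-transcribed utterance into application and action."""
    words = utterance.lower().split()
    if not words or words[0] not in COMMANDS:
        return None
    app = words[0]
    action = words[1] if len(words) > 1 and words[1] in COMMANDS[app] else None
    return ParsedCommand(app, action)


def next_step(cmd: ParsedCommand, seconds_waited: float, timeout: float = 3.0) -> str:
    """Decide what to do after the application has been displayed."""
    if cmd.action is not None:
        return f"initiate {cmd.application}:{cmd.action}"
    if seconds_waited >= timeout:
        # Stands in for the audible or visual action prompt.
        return f"prompt for {cmd.application} action"
    return "wait"
```

In this sketch, "music play" is acted on immediately, while "map" alone leads to waiting and then to a prompt once the assumed timeout has elapsed.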

    Directing dictation into input fields
    6.
    Granted patent
    Directing dictation into input fields (status: in force)

    Publication No.: US08255218B1

    Publication date: 2012-08-28

    Application No.: US13245698

    Filing date: 2011-09-26

    IPC classification: G10L11/00 G10L15/00 G10L17/00

    CPC classification: G10L15/22 G06F3/167

    Abstract: In general, this disclosure describes techniques to direct textual characters converted from vocal input into selected graphical user interface input fields. Vocal input may be received. Textual characters may be identified based on the vocal input. A first portion of the textual characters corresponding to a first portion of the vocal input may be graphically inputted into a first input field of a GUI. While receiving the vocal input, a selection of a second input field in the GUI may be accepted after the first portion of the vocal input has been received. After accepting the selection of the second input field, a second portion of the textual characters corresponding to a second portion of the vocal input received after the selection of the second input field may be inputted into the second input field.

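One way to picture the routing described in this abstract is as a single stream of interleaved events: transcribed text chunks and field selections. The sketch below assumes that representation, with hypothetical event kinds `"text"` and `"select"` and a hypothetical function `route_dictation`; the abstract does not prescribe any data model.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # ("text", transcribed chunk) or ("select", field name)


def route_dictation(events: Iterable[Event], first_field: str) -> Dict[str, str]:
    """Send each transcribed chunk to whichever field was selected most recently."""
    fields: Dict[str, List[str]] = {first_field: []}
    current = first_field
    for kind, value in events:
        if kind == "select":
            # Selection arrives mid-dictation; later text goes to the new field.
            current = value
            fields.setdefault(current, [])
        elif kind == "text":
            fields[current].append(value)
    return {name: " ".join(parts) for name, parts in fields.items()}
```

For example, dictating a name, tapping an email field, and continuing to speak would leave the name in the first field and the remainder in the second, without stopping the dictation in between.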

    Systems And Methods For Continual Speech Recognition And Detection In Mobile Computing Devices
    9.
    Patent application
    Systems And Methods For Continual Speech Recognition And Detection In Mobile Computing Devices (status: in force)

    Publication No.: US20130085755A1

    Publication date: 2013-04-04

    Application No.: US13621068

    Filing date: 2012-09-15

    IPC classification: G10L15/26

    Abstract: The present application describes systems, articles of manufacture, and methods for continuous speech recognition for mobile computing devices. One embodiment includes determining whether a mobile computing device is receiving operating power from an external power source or a battery power source, and activating a trigger word detection subroutine in response to determining that the mobile computing device is receiving power from the external power source. In some embodiments, the trigger word detection subroutine operates continually while the mobile computing device is receiving power from the external power source. The trigger word detection subroutine includes determining whether a plurality of spoken words received via a microphone includes one or more trigger words, and in response to determining that the plurality of spoken words includes at least one trigger word, launching an application corresponding to the at least one trigger word included in the plurality of spoken words.

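The power-gated trigger word detection in this abstract can be sketched in a few lines. The trigger-to-application table `TRIGGER_APPS`, the power-source strings, and the function names are hypothetical. The sketch also assumes detection stays off on battery power, which the abstract implies but does not state outright.

```python
from typing import Iterable, Optional

# Hypothetical mapping from trigger word to the application it launches.
TRIGGER_APPS = {"navigate": "maps", "call": "phone"}


def detection_enabled(power_source: str) -> bool:
    """Run the detection subroutine only on external power (assumed policy)."""
    return power_source == "external"


def detect_trigger(words: Iterable[str]) -> Optional[str]:
    """Return the application for the first trigger word found, if any."""
    for word in words:
        app = TRIGGER_APPS.get(word.lower())
        if app is not None:
            return app
    return None


def handle_audio(power_source: str, words: Iterable[str]) -> Optional[str]:
    """Name of the application to launch, or None when nothing should happen."""
    if not detection_enabled(power_source):
        return None
    return detect_trigger(words)
```

Gating on the power source reflects the trade-off the abstract points at: continual listening is affordable while the device is plugged in and costly on battery.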

    Hybrid Client/Server Speech Recognition In A Mobile Device
    10.
    Patent application
    Hybrid Client/Server Speech Recognition In A Mobile Device (status: under examination, published)

    Publication No.: US20130085753A1

    Publication date: 2013-04-04

    Application No.: US13586696

    Filing date: 2012-08-15

    IPC classification: G10L15/20

    Abstract: A computing device is able to use an embedded speech recognizer and a network speech recognizer for speech recognition. In response to detecting speech in the captured audio, the computing device may forward the captured audio to its embedded speech recognizer and to a speech client for the network speech recognizer. The embedded speech recognizer provides an embedded-recognizer result for the captured audio. If a network-recognition criterion is met, the speech client forwards the captured audio to the network speech recognizer and receives a network-recognizer result for the captured audio from the network speech recognizer. A speech recognition result for the captured audio is forwarded to at least one application, wherein the speech recognition result is based on at least one of the embedded-recognizer result and the network-recognizer result.

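The hybrid flow in this abstract can be sketched with the two recognizers passed in as plain callables. Several details here are assumptions that the abstract does not supply: results are modeled as `(text, confidence)` pairs, the network-recognition criterion is evaluated on the embedded result, and the final result is whichever has the higher confidence.

```python
from typing import Callable, Tuple

Result = Tuple[str, float]  # (text, confidence); the confidence score is assumed


def recognize(
    audio: bytes,
    embedded: Callable[[bytes], Result],
    network: Callable[[bytes], Result],
    network_criterion: Callable[[Result], bool],
) -> Result:
    """Combine an on-device recognizer with an optional network recognizer."""
    embedded_result = embedded(audio)
    if not network_criterion(embedded_result):
        # Criterion not met: the audio never leaves the device.
        return embedded_result
    network_result = network(audio)
    # Assumed merge policy: keep the more confident result (ties favor embedded).
    return max(embedded_result, network_result, key=lambda r: r[1])
```

A caller might pass `lambda r: r[1] < 0.6` as the criterion, so the network recognizer is consulted only when the embedded recognizer is unsure. Other criteria, such as network availability, would fit the same shape.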