Robust speech recognition
    2. Granted Patent - In force

    Publication Number: US08682661B1

    Publication Date: 2014-03-25

    Application Number: US12872428

    Filing Date: 2010-08-31

    IPC Class: G10L15/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.

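    The flow in this abstract ends with comparing statistical recognition results against the candidates enumerated by the annotated grammar and handing the winning candidate, with its semantics, to an application. A minimal Python sketch of that matching step follows; the GrammarCandidate class, the SequenceMatcher-based scoring, and the example n-best list standing in for the SLM output are all illustrative assumptions, not the patent's implementation.

from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass
class GrammarCandidate:
    phrase: str       # surface form enumerated by the grammar
    semantics: dict   # annotation the grammar attaches to this candidate

def pick_grammar_candidate(hypotheses, candidates):
    """Return the grammar candidate closest to any recognition hypothesis."""
    best, best_score = None, 0.0
    for hyp in hypotheses:
        for cand in candidates:
            score = SequenceMatcher(None, hyp.lower(), cand.phrase.lower()).ratio()
            if score > best_score:
                best, best_score = cand, score
    return best, best_score

if __name__ == "__main__":
    grammar = [
        GrammarCandidate("call mom", {"action": "call", "contact": "mom"}),
        GrammarCandidate("play jazz", {"action": "play", "genre": "jazz"}),
    ]
    n_best = ["call mum", "coal mom"]   # stand-in for the SLM's n-best output
    chosen, score = pick_grammar_candidate(n_best, grammar)
    print(chosen.semantics)             # the "call mom" candidate's semantics win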

    Virtual Participant-based Real-Time Translation and Transcription System for Audio and Video Teleconferences
    7. Patent Application - In force

    Publication Number: US20130226557A1

    Publication Date: 2013-08-29

    Application Number: US13459293

    Filing Date: 2012-04-30

    IPC Class: G06F17/28 G01K15/00

    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. Text or audio data that was previously exchanged directly between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.

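    The relay pattern described above, a virtual participant that intercepts each utterance, translates it per recipient, and hands the result to a conference manager for delivery, can be sketched in a few lines of Python. The class, the identity_translate stub, and the in-memory outbox are assumptions made for illustration; a real system would call an actual translation engine and a conference-management service.

from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

def identity_translate(text: str, src: str, dst: str) -> str:
    # Stand-in for a real translation-engine call.
    return text if src == dst else f"[{src}->{dst}] {text}"

@dataclass
class VirtualParticipant:
    # participant id -> preferred language (declared or inferred)
    languages: Dict[str, str]
    translate: Callable[[str, str, str], str] = identity_translate
    outbox: List[Tuple[str, str]] = field(default_factory=list)

    def on_utterance(self, speaker: str, text: str) -> None:
        """Intercept one utterance and queue a translation for every other participant."""
        src = self.languages[speaker]
        for listener, dst in self.languages.items():
            if listener != speaker:
                self.outbox.append((listener, self.translate(text, src, dst)))

    def flush(self):
        """Hand the queued translations to the (hypothetical) conference manager."""
        delivered, self.outbox = self.outbox, []
        return delivered

if __name__ == "__main__":
    vp = VirtualParticipant({"alice": "en", "bob": "fr", "chen": "zh"})
    vp.on_utterance("alice", "Hello everyone")
    for listener, text in vp.flush():
        print(listener, ":", text)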

    Speech to Text Conversion
    8. Patent Application - In force

    Publication Number: US20120022867A1

    Publication Date: 2012-01-26

    Application Number: US13249181

    Filing Date: 2011-09-29

    IPC Class: G10L15/26 G06F17/30

    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.

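    The interpolation this abstract describes, a combined model whose contribution from each base language model is weighted according to contextual metadata, can be illustrated with toy unigram models. The corpora, weights, and class names below are made-up examples; the point is only the weighted sum P(w) = sum_i w_i * P_i(w).

from collections import Counter

class UnigramLM:
    def __init__(self, corpus: str):
        tokens = corpus.lower().split()
        self.counts = Counter(tokens)
        self.total = len(tokens)

    def prob(self, word: str, alpha: float = 0.01) -> float:
        # add-alpha smoothing so unseen words keep a small nonzero probability
        vocab = len(self.counts) + 1
        return (self.counts[word.lower()] + alpha) / (self.total + alpha * vocab)

def interpolated_prob(word: str, models, weights) -> float:
    """P(word) = sum_i w_i * P_i(word), with the weights normalized to sum to 1."""
    total_w = sum(weights)
    return sum((w / total_w) * m.prob(word) for m, w in zip(models, weights))

if __name__ == "__main__":
    maps_lm = UnigramLM("navigate to main street turn left at main street")
    sms_lm = UnigramLM("see you soon love you call me later")
    # contextual metadata says a navigation app is active, so maps_lm gets the larger weight
    weights = [0.8, 0.2]
    for w in ("street", "love"):
        print(w, round(interpolated_prob(w, [maps_lm, sms_lm], weights), 4))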

    Mobile dictation correction user interface
    10. Patent Application - Pending (published)

    Publication Number: US20060149551A1

    Publication Date: 2006-07-06

    Application Number: US11316347

    Filing Date: 2005-12-22

    IPC Class: G10L11/00

    CPC Class: G10L15/22 G10L15/30

    Abstract: A method of speech recognition is described for use with mobile user devices. A speech signal representative of input speech is forwarded from a mobile user device to a remote server. At the mobile user device, a speech recognition result representative of the speech signal is received from the remote server. The speech recognition result includes alternate recognition hypotheses associated with one or more portions of the speech recognition result. A user correction selection representing a portion of the speech recognition result is obtained from the user. The user is presented with selected alternate recognition hypotheses associated with the user correction selection. A user-chosen one of the selected alternate recognition hypotheses is substituted for the user correction selection to form a corrected speech recognition result.

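    The correction loop in this abstract, where the server's recognition result carries alternate hypotheses per portion and the user-chosen alternate replaces the selected portion, is easy to sketch client-side. The Span structure, the apply_correction helper, and the example alternates below are hypothetical, intended only to show the substitution step.

from dataclasses import dataclass
from typing import List

@dataclass
class Span:
    text: str                 # recognized words for this portion of the result
    alternates: List[str]     # alternate recognition hypotheses from the server

def apply_correction(result: List[Span], span_index: int, choice_index: int) -> str:
    """Substitute the user-chosen alternate for the selected span and rebuild the text."""
    span = result[span_index]
    span.text = span.alternates[choice_index]
    return " ".join(s.text for s in result)

if __name__ == "__main__":
    # Hypothetical result returned by a remote recognizer for "call bill at noon"
    result = [
        Span("call", ["call", "cold"]),
        Span("bill", ["bill", "phil", "will"]),
        Span("at noon", ["at noon", "at new"]),
    ]
    # The user taps the second span and picks alternate index 1 ("phil")
    print(apply_correction(result, span_index=1, choice_index=1))  # call phil at noon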