Natural language refinement of voice and text entry
    21.
    Granted patent
    Legal status: Active

    Publication number: US09190054B1

    Publication date: 2015-11-17

    Application number: US13799619

    Filing date: 2013-03-13

    Applicant: Google Inc.

    Abstract: A data processing apparatus is configured to receive a first string related to a natural-language voice user entry and a second string including at least one natural-language refinement to the user entry; parse the first string into a first set of one or more tokens and the second string into a second set of one or more tokens; determine at least one refining instruction from the second set of one or more tokens; generate, from at least a portion of each of the first string and the second string and based on the at least one refining instruction, a group of candidate refined user entries; select a refined user entry from the group of candidate refined user entries; and output the selected, refined user entry.
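The steps in this abstract (parse both strings, derive a refining instruction, generate candidates, select one) can be sketched as below. This is only an illustration under stated assumptions: the two instruction patterns ("change X to Y", "delete X"), the function names, and the rule of selecting the candidate that edits the latest occurrence are all hypothetical and are not taken from the patent.

```python
import re

def tokenize(text):
    """Split a string into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def refining_instruction(tokens):
    """Map refinement tokens to a (verb, target, replacement) tuple.

    Only two hypothetical patterns are recognized:
    'change X to Y' and 'delete X'.
    """
    if len(tokens) == 4 and tokens[0] == "change" and tokens[2] == "to":
        return ("replace", tokens[1], tokens[3])
    if len(tokens) == 2 and tokens[0] == "delete":
        return ("delete", tokens[1], None)
    return None

def candidates(entry_tokens, instruction):
    """Apply the instruction at every position where the target occurs."""
    verb, target, replacement = instruction
    result = []
    for i, tok in enumerate(entry_tokens):
        if tok != target:
            continue
        middle = [replacement] if verb == "replace" else []
        result.append(" ".join(entry_tokens[:i] + middle + entry_tokens[i + 1:]))
    return result

def refine(first, second):
    """Return the selected refined entry, or the original if nothing applies."""
    instruction = refining_instruction(tokenize(second))
    if instruction is None:
        return first
    group = candidates(tokenize(first), instruction)
    # Toy selection rule (an assumption): prefer the latest occurrence.
    return group[-1] if group else first
```

For example, `refine("Text John that John is late", "change john to joan")` produces two candidates and returns the one that edits the second "john". A real system would presumably rank candidates with a learned or statistical score instead of this positional rule.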

    Virtual participant-based real-time translation and transcription system for audio and video teleconferences
    24.
    Granted patent
    Legal status: Active

    Publication number: US09569431B2

    Publication date: 2017-02-14

    Application number: US15075763

    Filing date: 2016-03-21

    Applicant: Google Inc.

    CPC classification number: G06F17/289 G10L15/005 H04M3/568 H04N7/155

    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
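The flow described in this abstract (intercept an utterance, translate it per recipient, hand the results on for delivery) might look roughly like the sketch below. The `Participant` class, the `PHRASES` table, and both function names are hypothetical; the phrase table merely stands in for a real translation engine.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    name: str
    language: str  # preferred language code, e.g. "en"

# Stand-in for a translation engine: a tiny, invented phrase table.
PHRASES = {("en", "es", "hello"): "hola", ("es", "en", "hola"): "hello"}

def translate(text, source, target):
    """Translate via the phrase table; tag the text if no entry exists."""
    if source == target:
        return text
    return PHRASES.get((source, target, text.lower()), f"[{source}->{target}] {text}")

def virtual_participant(sender, text, participants):
    """Intercept one utterance and return {recipient name: translated text}."""
    return {
        p.name: translate(text, sender.language, p.language)
        for p in participants
        if p.name != sender.name
    }
```

In this sketch the returned dictionary plays the role of what would be sent to the teleconference management processor, which would then deliver each entry to the matching participant.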

    NEURAL NETWORK FOR KEYBOARD INPUT DECODING
    25.
    Patent application
    Legal status: Active

    Publication number: US20160299685A1

    Publication date: 2016-10-13

    Application number: US14683861

    Filing date: 2015-04-10

    Applicant: Google Inc.

    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
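As a rough illustration of decoding characters from a spatial feature with a network, the sketch below uses the touch location as the only feature and a single linear layer followed by a softmax. Everything here is an assumption: the three-key layout, the hand-set weights, and the function names are invented, and a real decoder would learn its weights and use richer gesture features.

```python
import math

# Hypothetical key centers on a unit-square keyboard, as (x, y).
KEYS = {"a": (0.1, 0.5), "s": (0.3, 0.5), "d": (0.5, 0.5)}

# Hand-set weights of one linear layer: logit_k = w_k . p + b_k.
# With w_k = 2*c_k and b_k = -|c_k|^2 the logit equals |p|^2 - |p - c_k|^2,
# so keys closer to the touch point p receive higher scores.
WEIGHTS = {
    k: ((2 * cx, 2 * cy), -(cx * cx + cy * cy)) for k, (cx, cy) in KEYS.items()
}

def key_probabilities(point):
    """Softmax over the per-key logits for one touch location."""
    x, y = point
    logits = {k: w[0] * x + w[1] * y + b for k, (w, b) in WEIGHTS.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - peak) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def decode(points):
    """Pick the most probable key for each touch location."""
    chars = []
    for p in points:
        probs = key_probabilities(p)
        chars.append(max(probs, key=probs.get))
    return "".join(chars)
```

The output of `decode` corresponds to the "at least one character string" in the abstract; the probabilities could equally be passed to a language model instead of taking the per-point maximum.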

    VIRTUAL PARTICIPANT-BASED REAL-TIME TRANSLATION AND TRANSCRIPTION SYSTEM FOR AUDIO AND VIDEO TELECONFERENCES

    Publication number: US20160203127A1

    Publication date: 2016-07-14

    Application number: US15075763

    Filing date: 2016-03-21

    Applicant: Google Inc.

    CPC classification number: G06F17/289 G10L15/005 H04M3/568 H04N7/155

    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.

    RECOGNIZING SPEECH USING NEURAL NETWORKS
    28.
    Patent application
    Legal status: Active

    Publication number: US20150340034A1

    Publication date: 2015-11-26

    Application number: US14720113

    Filing date: 2015-05-22

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.
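The three-stage chain in this abstract (acoustic model to phoneme scores, inverse pronunciation model to grapheme scores, language model to text label scores) can be sketched with lookup tables standing in for the models. All labels, numbers, and function names below are invented for illustration, and scoring a text label by its first grapheme is a deliberate simplification.

```python
def acoustic_model(audio):
    """Return a score for each phoneme label (fixed toy values)."""
    return {"k": 0.7, "s": 0.2, "ae": 0.1}

# Hypothetical inverse pronunciation table: phoneme -> grapheme weights.
INVERSE_PRONUNCIATION = {
    "k": {"c": 0.6, "k": 0.4},
    "s": {"s": 0.8, "c": 0.2},
    "ae": {"a": 1.0},
}

# Hypothetical language-model prior over text labels.
LANGUAGE_MODEL_PRIOR = {"cat": 0.5, "kit": 0.3, "sat": 0.2}

def grapheme_scores(phoneme_scores):
    """Spread each phoneme score over the graphemes it may map to."""
    scores = {}
    for phoneme, p in phoneme_scores.items():
        for grapheme, q in INVERSE_PRONUNCIATION[phoneme].items():
            scores[grapheme] = scores.get(grapheme, 0.0) + p * q
    return scores

def text_label_scores(graphemes):
    """Weight each label's prior by the score of its first grapheme."""
    raw = {
        label: prior * graphemes.get(label[0], 0.0)
        for label, prior in LANGUAGE_MODEL_PRIOR.items()
    }
    total = sum(raw.values()) or 1.0  # avoid dividing by zero
    return {label: value / total for label, value in raw.items()}

def recognize(audio):
    """Run the three stages and return the best-scoring text label."""
    scores = text_label_scores(grapheme_scores(acoustic_model(audio)))
    return max(scores, key=scores.get)
```

Since the title refers to neural networks, each table here would presumably be replaced by a trained model in practice; the sketch only shows how scores flow from one stage to the next.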

    KEYWORD DETECTION BASED ON ACOUSTIC ALIGNMENT
    29.
    Patent application
    Legal status: Under examination (published)

    Publication number: US20150279351A1

    Publication date: 2015-10-01

    Application number: US13861020

    Filing date: 2013-04-11

    Applicant: Google Inc.

    CPC classification number: G10L15/08 G10L15/02 G10L2015/088

    Abstract: Embodiments pertain to automatic speech recognition in mobile devices to establish the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or Gaussian mixture modeling, and high level feature extraction may be done by aligning the results of the acoustic modeling with expected event vectors that correspond to a keyword.
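The alignment and classification steps can be illustrated with a small dynamic program. This sketch assumes the per-frame vectors are already the output of the acoustic modeling stage, uses a dot product as the similarity measure, and applies a fixed threshold as the output classifier; the function names, the threshold value, and the requirement that every expected event receive at least one frame are assumptions, not details from the patent.

```python
def align_score(frames, expected):
    """Score the best monotonic alignment of frames to expected event vectors.

    Each frame is assigned to one expected event, events are visited in
    order, and every event receives at least one frame. Returns the mean
    per-frame dot product along the best alignment.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    n, m = len(frames), len(expected)
    if m == 0 or n < m:
        return 0.0
    neg = float("-inf")
    best = [[neg] * m for _ in range(n)]
    best[0][0] = dot(frames[0], expected[0])
    for i in range(1, n):
        for j in range(m):
            # Either stay on event j or advance from event j - 1.
            prev = max(best[i - 1][j], best[i - 1][j - 1] if j > 0 else neg)
            if prev > neg:
                best[i][j] = prev + dot(frames[i], expected[j])
    return best[n - 1][m - 1] / n

def detect_keyword(frames, expected, threshold=0.7):
    """Classify the utterance as containing the keyword if the score is high."""
    return align_score(frames, expected) >= threshold
```

With two expected events `[[1, 0], [0, 1]]`, frames that move from the first event to the second score highly, while the same frames in reverse order score poorly, which is the behavior an order-sensitive keyword detector needs.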
