Keyword detection without decoding
    3.
    Granted patent (status: in force)

    Publication number: US09378733B1

    Publication date: 2016-06-28

    Application number: US13860982

    Filing date: 2013-04-11

    Applicant: Google Inc.

    CPC classification number: G10L15/08 G10L15/02 G10L2015/088

    Abstract: Embodiments pertain to automatic speech recognition in mobile devices to establish the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or a vector quantization dictionary and high level feature extraction may use pooling.
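
The four stages named in this abstract can be illustrated with a toy sketch. It is not the patented implementation: the frame features (log energy and zero-crossing rate), the two-entry codebook, the histogram pooling, and the logistic classifier are all assumptions chosen to keep the example small, and every function name is hypothetical.

```python
import math

def frame_features(waveform, frame_len=4):
    """Front-end feature extraction: (log energy, zero-crossing rate) per frame."""
    feats = []
    for start in range(0, len(waveform) - frame_len + 1, frame_len):
        frame = waveform[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        feats.append((math.log(energy + 1e-9), crossings / (frame_len - 1)))
    return feats

def quantize(feature, codebook):
    """Acoustic modeling with a vector quantization dictionary: nearest codeword index."""
    def sq_dist(codeword):
        return sum((f - c) ** 2 for f, c in zip(feature, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def pool(codes, codebook_size):
    """High-level feature extraction: average-pool one-hot codes into a histogram."""
    hist = [0.0] * codebook_size
    for code in codes:
        hist[code] += 1.0
    return [h / len(codes) for h in hist] if codes else hist

def classify(pooled, weights, bias, threshold=0.5):
    """Output classification: logistic score compared against a threshold."""
    z = sum(w * p for w, p in zip(weights, pooled)) + bias
    score = 1.0 / (1.0 + math.exp(-z))
    return score >= threshold, score

def detect_keyword(waveform, codebook, weights, bias):
    """Run the whole pipeline; no word-level decoding is involved."""
    codes = [quantize(f, codebook) for f in frame_features(waveform)]
    return classify(pool(codes, len(codebook)), weights, bias)

# Toy, hand-picked parameters (assumptions, not trained values).
codebook = [(-20.0, 0.0), (0.0, 1.0)]  # "silence" vs. "loud, alternating"
weights, bias = [-4.0, 4.0], 0.0
print(detect_keyword([1.0, -1.0] * 4, codebook, weights, bias))
```

Pooling turns a variable number of frames into a fixed-length vector, which is why a plain classifier can make the final decision in this sketch.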

    DETERMINING HOTWORD SUITABILITY
    4.
    Patent application (status: in force)

    Publication number: US20160133259A1

    Publication date: 2016-05-12

    Application number: US15002044

    Filing date: 2016-01-20

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user; evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria; generating a hotword suitability score for the candidate hotword based on that evaluation; and providing a representation of the hotword suitability score for display to the user.
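
As a rough illustration of the evaluate, score, and display steps, the sketch below scores a transcription against two made-up criteria. The abstract does not say which criteria are used, so `length_criterion`, `variety_criterion`, the 12-letter cap, and the equal weighting are all hypothetical.

```python
def length_criterion(transcription):
    """Hypothetical criterion: longer candidates score higher, capped at 12 letters."""
    letters = sum(ch.isalpha() for ch in transcription)
    return min(letters / 12.0, 1.0)

def variety_criterion(transcription):
    """Hypothetical criterion: fraction of letters that are distinct."""
    letters = [ch.lower() for ch in transcription if ch.isalpha()]
    return len(set(letters)) / len(letters) if letters else 0.0

def suitability_score(transcription, criteria=(length_criterion, variety_criterion)):
    """Average the per-criterion scores into a single value in [0, 1]."""
    return sum(criterion(transcription) for criterion in criteria) / len(criteria)

def render_score(score, width=10):
    """Text representation of the score, suitable for showing to the user."""
    filled = round(score * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {score:.2f}"

for candidate in ("ok", "hey computer"):
    print(candidate, render_score(suitability_score(candidate)))
```

A real system could also evaluate the speech data itself; this sketch only looks at the transcription.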

    Virtual participant-based real-time translation and transcription system for audio and video teleconferences
    5.
    Granted patent (status: in force)

    Publication number: US09292500B2

    Publication date: 2016-03-22

    Application number: US14486312

    Filing date: 2014-09-15

    Applicant: Google Inc.

    CPC classification number: G06F17/289 G10L15/005 H04M3/568 H04N7/155

    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
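
The message flow described here can be sketched for text messages as follows. The lookup-table translation engine, the participant map, and both function names are hypothetical stand-ins; a real system would call an actual translation service and would also handle audio.

```python
def make_stub_engine(table):
    """Hypothetical translation engine backed by a lookup table."""
    def translate(text, source_lang, target_lang):
        if source_lang == target_lang:
            return text
        # Fall back to a tagged copy when the table has no entry.
        return table.get((text, source_lang, target_lang), f"[{target_lang}] {text}")
    return translate

def virtual_participant(message, sender, participants, translate):
    """Intercept one message and produce a translation for every other participant.

    participants maps a participant name to a preferred language code.
    The returned {recipient: text} mapping is what a management processor
    would then deliver to each participant.
    """
    source_lang = participants[sender]
    return {
        name: translate(message, source_lang, lang)
        for name, lang in participants.items()
        if name != sender
    }

participants = {"ann": "en", "luis": "es", "bea": "en"}
engine = make_stub_engine({("hello", "en", "es"): "hola"})
print(virtual_participant("hello", "ann", participants, engine))
```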

    LANGUAGE MODELING OF COMPLETE LANGUAGE SEQUENCES
    6.
    Patent application (status: in force)

    Publication number: US20140278407A1

    Publication date: 2014-09-18

    Application number: US13875406

    Filing date: 2013-05-02

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L15/197

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts of the number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
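
One way to read the two-component structure is sketched below. The specifics are assumptions made for illustration: the first component stores whole-sequence relative frequencies for sequences seen at least `min_count` times, the second component is a unigram model with an end-of-sequence token, and the adjustment is a single scale factor that hands the probability mass left over by the first component to the second.

```python
from collections import Counter

END = "</s>"  # end-of-sequence marker (an assumption of this sketch)

def second_prob(unigram, seq):
    """Probability of a word tuple under the unigram (second) component."""
    p = 1.0
    for tok in seq + (END,):
        p *= unigram.get(tok, 0.0)
    return p

def train(sequences, min_count=2):
    """sequences: list of word tuples. Returns both components and the adjustment."""
    counts = Counter(sequences)
    total = sum(counts.values())
    # Selected subset: sequences seen at least min_count times (hypothetical rule).
    selected = {s for s, c in counts.items() if c >= min_count}
    first = {s: counts[s] / total for s in selected}

    token_counts = Counter()
    for seq, c in counts.items():
        for tok in seq + (END,):
            token_counts[tok] += c
    token_total = sum(token_counts.values())
    second = {t: c / token_total for t, c in token_counts.items()}

    # Scale the second component so the two components together sum to 1.
    remaining = 1.0 - sum(first.values())
    covered = sum(second_prob(second, s) for s in selected)
    adjustment = remaining / (1.0 - covered) if covered < 1.0 else 0.0
    return {"first": first, "second": second, "adjustment": adjustment}

def score(model, seq):
    """Use the first component for selected sequences, else the adjusted second."""
    if seq in model["first"]:
        return model["first"][seq]
    return model["adjustment"] * second_prob(model["second"], seq)

data = [("hello", "world")] * 3 + [("good", "morning")]
model = train(data)
print(score(model, ("hello", "world")), score(model, ("good", "morning")))
```

Dividing by `1.0 - covered` removes the mass the unigram model would have spent on the selected sequences, so that mass is not counted twice.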

    Context-based speech recognition
    7.
    Granted patent (status: in force)

    Publication number: US09311915B2

    Publication date: 2016-04-12

    Application number: US14030265

    Filing date: 2013-09-18

    Applicant: Google Inc.

    CPC classification number: G10L15/16

    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
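
The idea of feeding context alongside audio-derived data can be sketched with a single dense layer. The context categories, the one-hot encoding, the layer shape, and the hand-set weights are assumptions for illustration only; the sketch stops at output posteriors and does not generate a transcription.

```python
import math

CONTEXTS = ["home", "car", "office"]  # hypothetical context categories

def build_input(audio_features, context):
    """Concatenate audio-derived features with a one-hot context vector."""
    one_hot = [1.0 if c == context else 0.0 for c in CONTEXTS]
    return list(audio_features) + one_hot

def forward(inputs, weights, biases):
    """One dense layer followed by a softmax over the output units."""
    logits = [sum(w * x for w, x in zip(row, inputs)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hand-set toy weights: output 0 responds to "home", output 1 to "car".
weights = [[0.0, 0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0, 0.0]]
biases = [0.0, 0.0]
audio = [0.2, 0.5]
print(forward(build_input(audio, "home"), weights, biases))
print(forward(build_input(audio, "car"), weights, biases))
```

With identical audio features, changing only the context input shifts the output distribution, which is the effect the abstract relies on.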

    Methods and systems for sharing of adapted voice profiles
    9.
    Granted patent (status: in force)

    Publication number: US09117451B2

    Publication date: 2015-08-25

    Application number: US13872401

    Filing date: 2013-04-29

    Applicant: Google Inc.

    Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
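
The authorization-gated sharing step might be modeled as below. The class fields, the `devices_by_user` mapping, and the `share_profile` function are hypothetical names introduced for this sketch, and the adapted voice is represented by placeholder bytes.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    speaker_id: str
    adapted_voice: bytes  # placeholder for adapted voice model data

@dataclass
class AuthorizationProfile:
    profile_owner: str
    authorized_user_ids: set = field(default_factory=set)

def share_profile(profile, authorization, devices_by_user):
    """Return {device_id: profile} for devices of authorized users only."""
    if authorization.profile_owner != profile.speaker_id:
        raise ValueError("authorization profile does not match voice profile")
    deliveries = {}
    for user_id in authorization.authorized_user_ids:
        for device_id in devices_by_user.get(user_id, []):
            deliveries[device_id] = profile
    return deliveries

profile = VoiceProfile("alice", b"\x00\x01")
auth = AuthorizationProfile("alice", {"bob"})
devices = {"bob": ["bob-phone"], "eve": ["eve-laptop"]}
print(sorted(share_profile(profile, auth, devices)))
```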

    CONTEXT-BASED SPEECH RECOGNITION
    10.
    Patent application (status: in force)

    Publication number: US20150039299A1

    Publication date: 2015-02-05

    Application number: US14030265

    Filing date: 2013-09-18

    Applicant: Google Inc.

    CPC classification number: G10L15/16

    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
