Multilingual, acoustic deep neural networks
    1.
    Granted invention patent
    Status: In force

    Publication number: US09460711B1

    Publication date: 2016-10-04

    Application number: US13862541

    Filing date: 2013-04-15

    Applicant: Google Inc.

    CPC classification number: G10L15/16 G10L15/063 G10L15/144

    Abstract: Methods and systems for processing multilingual DNN acoustic models are described. An example method may include receiving training data that includes a respective training data set for each of two or more languages. A multilingual deep neural network (DNN) acoustic model may be processed based on the training data. The multilingual DNN acoustic model may include a feedforward neural network having multiple layers of one or more nodes. Each node of a given layer may connect with a respective weight to each node of a subsequent layer, and the multiple layers of one or more nodes may include one or more shared hidden layers of nodes and a language-specific output layer of nodes corresponding to each of the two or more languages. Additionally, weights associated with the multiple layers of one or more nodes of the processed multilingual DNN acoustic model may be stored in a database.
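The layer arrangement described in the abstract (shared hidden layers followed by one output layer per language) can be illustrated with a minimal NumPy sketch. This is not the patented implementation: the class name `MultilingualDNN`, the ReLU activation, the layer sizes, the random initialization, and the language codes are all assumptions chosen for illustration, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_layer(n_in, n_out):
    """Return (weights, bias) for one fully connected layer (random init, assumed)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def softmax(x):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class MultilingualDNN:
    """Hypothetical feedforward net: shared hidden layers, one output layer per language."""

    def __init__(self, n_features, hidden_sizes, outputs_per_language):
        sizes = [n_features, *hidden_sizes]
        # Hidden layers whose weights are shared by every language.
        self.shared = [init_layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
        # One language-specific output layer on top of the last shared layer.
        self.heads = {
            lang: init_layer(sizes[-1], n_out)
            for lang, n_out in outputs_per_language.items()
        }

    def forward(self, frames, language):
        h = frames
        for w, b in self.shared:
            h = np.maximum(0.0, h @ w + b)  # ReLU is an assumption, not from the abstract
        w, b = self.heads[language]
        return softmax(h @ w + b)


# Illustrative sizes only: 40-dimensional frames, two languages.
model = MultilingualDNN(
    n_features=40,
    hidden_sizes=[64, 64],
    outputs_per_language={"en": 30, "fr": 25},
)
frames = rng.normal(size=(5, 40))
posteriors_en = model.forward(frames, "en")
posteriors_fr = model.forward(frames, "fr")
```

Both calls pass through the same hidden-layer weights; only the final layer differs by language, which is the sharing pattern the abstract describes.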

    Multiple subspace discriminative feature training
    2.
    Granted invention patent
    Status: In force

    Publication number: US09009044B1

    Publication date: 2015-04-14

    Application number: US13762294

    Filing date: 2013-02-07

    Applicant: Google Inc.

    CPC classification number: G10L15/02

    Abstract: Methods and apparatus related to speech recognition performed by a speech recognition device are disclosed. The speech recognition device can receive a plurality of samples corresponding to an utterance and generate a feature vector z from the plurality of samples. The speech recognition device can select a first frame y0 from the feature vector z, and can generate a second frame y1, where y0 and y1 differ. The speech recognition device can generate a modified frame x′ based on the first frame y0 and the second frame y1 and then recognize speech related to the utterance based on the modified frame x′. The recognized speech can be output by the speech recognition device.

    Curriculum learning for speech recognition
    3.
    Granted invention patent
    Status: In force

    Publication number: US09202464B1

    Publication date: 2015-12-01

    Application number: US13859692

    Filing date: 2013-04-09

    Applicant: Google Inc.

    CPC classification number: G06N3/02 G06N3/08 G10L15/063 G10L2015/0631

    Abstract: Methods and apparatus related to training speech recognition devices are presented. A computing device receives training samples for training a neural network to learn an acoustic speech model. A curriculum function for speech modeling can be determined. For each training sample of the training samples, a corresponding curriculum function value for the training sample can be determined using the curriculum function. The training samples can be ordered based on the corresponding curriculum function values. In some embodiments, the neural network can be trained utilizing the ordered training samples. The trained neural network can receive an input of a second plurality of samples corresponding to human speech, where the second plurality of samples differs from the training samples. In response to receiving the second plurality of samples, the trained neural network can generate a plurality of phones corresponding to the captured human speech.
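The ordering step in the abstract (compute a curriculum function value per training sample, then order the samples by that value) can be sketched as follows. The abstract does not say what the curriculum function measures, so the signal-to-noise-ratio score used here, the `Sample` layout, and the function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical sample layout: (utterance id, signal-to-noise ratio in dB).
Sample = Tuple[str, float]


def order_by_curriculum(
    samples: Sequence[Sample],
    curriculum_fn: Callable[[Sample], float],
) -> List[Sample]:
    """Order samples by ascending curriculum function value."""
    return sorted(samples, key=curriculum_fn)


def easiest_first(sample: Sample) -> float:
    """Assumed curriculum function: cleaner audio (higher SNR) gets a lower value."""
    _, snr_db = sample
    return -snr_db


samples = [("utt_a", 5.0), ("utt_b", 20.0), ("utt_c", 12.5)]
ordered = order_by_curriculum(samples, easiest_first)
print([utt_id for utt_id, _ in ordered])  # ['utt_b', 'utt_c', 'utt_a']
```

A training loop would then consume `ordered` from first to last, so the network sees the samples in curriculum order.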
