SUB-MATRIX INPUT FOR NEURAL NETWORK LAYERS
    11.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160217367A1

    Publication date: 2016-07-28

    Application No.: US14613493

    Filing date: 2015-02-04

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes generating, by a speech recognition system, a matrix from a predetermined quantity of vectors that each represent input for a layer of a neural network, generating a plurality of sub-matrices from the matrix, and using, for each of the sub-matrices, the respective sub-matrix as input to a node in the layer of the neural network to determine whether an utterance encoded in an audio signal comprises a keyword for which the neural network is trained.
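
The matrix and sub-matrix steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how the sub-matrices are cut from the matrix, so the sliding-window scheme, the sigmoid node, and all names below (`stack_vectors`, `sub_matrices`, `node_output`) are assumptions made for illustration, not the method claimed in the application.

```python
import math

def stack_vectors(vectors):
    """Stack equal-length feature vectors as the rows of a matrix."""
    width = len(vectors[0])
    if any(len(v) != width for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [list(v) for v in vectors]

def sub_matrices(matrix, rows, cols, row_step=1, col_step=1):
    """Yield every rows x cols block of the matrix (assumed sliding-window scheme)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    for r in range(0, n_rows - rows + 1, row_step):
        for c in range(0, n_cols - cols + 1, col_step):
            yield [row[c:c + cols] for row in matrix[r:r + rows]]

def node_output(block, weights, bias=0.0):
    """One node: weighted sum over its sub-matrix, followed by a sigmoid."""
    total = bias + sum(
        x * w
        for block_row, weight_row in zip(block, weights)
        for x, w in zip(block_row, weight_row)
    )
    return 1.0 / (1.0 + math.exp(-total))

# Three hypothetical 4-dimensional feature vectors (for example, audio frames).
frames = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
matrix = stack_vectors(frames)
blocks = list(sub_matrices(matrix, rows=2, cols=2, row_step=1, col_step=2))
weights = [[0.5, -0.5], [0.25, 0.25]]  # hypothetical weights shared by the nodes
activations = [node_output(block, weights) for block in blocks]
```

Each entry of `activations` corresponds to one node that received one sub-matrix as input. A real keyword detector would pass these values on to further layers.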


    CLUSTER SPECIFIC SPEECH MODEL
    12.
    Type: Invention application
    Status: Granted

    Publication No.: US20150269931A1

    Publication date: 2015-09-24

    Application No.: US14663610

    Filing date: 2015-03-20

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L15/183 G10L2015/0631

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving data representing acoustic characteristics of a user's voice; selecting a cluster for the data from among a plurality of clusters, where each cluster includes a plurality of vectors, and where each cluster is associated with a speech model trained by a neural network using at least one or more vectors of the plurality of vectors in the respective cluster; and in response to receiving one or more utterances of the user, providing the speech model associated with the cluster for transcribing the one or more utterances.
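
The cluster selection step can be sketched as a nearest-centroid lookup. The abstract does not name a distance measure or a selection rule, so Euclidean distance to the cluster mean is an assumption here, and the cluster ids and model names are hypothetical placeholders.

```python
import math

def centroid(vectors):
    """Component-wise mean of the vectors in one cluster."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def select_cluster(user_vector, clusters):
    """Return the id of the cluster whose centroid is closest to user_vector."""
    return min(
        clusters,
        key=lambda cluster_id: math.dist(user_vector, centroid(clusters[cluster_id])),
    )

# Hypothetical clusters of acoustic vectors and the speech model tied to each.
clusters = {
    "cluster_a": [[0.0, 0.0], [0.2, 0.1]],
    "cluster_b": [[1.0, 1.0], [0.9, 1.2]],
}
speech_models = {"cluster_a": "model_a", "cluster_b": "model_b"}

user_vector = [0.95, 1.05]  # data representing the user's voice
chosen = select_cluster(user_vector, clusters)
model_for_user = speech_models[chosen]  # model provided for transcription
```

`math.dist` requires Python 3.8 or later.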


    Neural Networks for Speaker Verification
    13.
    Type: Invention application

    Publication No.: US20190043508A1

    Publication date: 2019-02-07

    Application No.: US15666806

    Filing date: 2017-08-02

    Applicant: Google Inc.

    Abstract: Systems, methods, devices, and other techniques for training and using a speaker verification neural network. A computing device may receive data that characterizes a first utterance. The computing device provides the data that characterizes the utterance to a speaker verification neural network. Subsequently, the computing device obtains, from the speaker verification neural network, a speaker representation that indicates speaking characteristics of a speaker of the first utterance. The computing device determines whether the first utterance is classified as an utterance of a registered user of the computing device. In response to determining that the first utterance is classified as an utterance of the registered user of the computing device, the device may perform an action for the registered user of the computing device.
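
The classification step can be sketched as a comparison between the speaker representation of the new utterance and a stored representation of the registered user. The abstract does not specify the comparison, so cosine similarity, the threshold value, and the function names below are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_registered_user(utterance_repr, enrolled_repr, threshold=0.8):
    """Accept the utterance if its representation is close enough to the enrolled one."""
    return cosine_similarity(utterance_repr, enrolled_repr) >= threshold

enrolled = [0.9, 0.1]  # hypothetical stored representation of the registered user
accepted = is_registered_user([1.0, 0.0], enrolled)
rejected = is_registered_user([0.0, 1.0], enrolled)
```

In a deployed system the representations would come from the speaker verification neural network, and the threshold would be tuned on held-out data.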

    SPEAKER IDENTIFICATION
    15.
    Type: Invention application

    Publication No.: US20170287487A1

    Publication date: 2017-10-05

    Application No.: US15624760

    Filing date: 2017-06-16

    Applicant: Google Inc.

    CPC classification number: G10L17/02 G10L17/005 G10L17/08 G10L17/18 G10L25/51

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, data identifying a media item including speech of a speaker is received. Based on the received data, one or more other media items that include speech of the speaker are identified. One or more search results are generated that each reference a respective media item of the one or more other media items that include speech of the speaker. The one or more search results are provided for display.

    Neural Networks For Speaker Verification
    17.
    Type: Invention application
    Status: Granted

    Publication No.: US20170069327A1

    Publication date: 2017-03-09

    Application No.: US14846187

    Filing date: 2015-09-04

    Applicant: Google Inc.

    CPC classification number: G10L17/18 G10L17/02 G10L17/04

    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
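
The structure of the training samples described in this abstract can be sketched as follows. Only the sample layout (one first utterance, one or more second utterances, and a matching or non-matching label) comes from the abstract. The pairing strategy, the `TrainingSample` class, and the `make_samples` helper are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class TrainingSample:
    first_utterance: list    # data characterizing the first utterance
    second_utterances: list  # data characterizing one or more second utterances
    label: str               # "matching" or "non-matching"

def make_samples(utterances_by_speaker, n_second=2, seed=0):
    """Build one matching and one non-matching sample per speaker.

    Assumes at least two speakers, each with at least two utterances.
    """
    rng = random.Random(seed)
    speakers = sorted(utterances_by_speaker)
    samples = []
    for speaker in speakers:
        own = utterances_by_speaker[speaker]
        first, rest = own[0], own[1:]
        # Matching: second utterances come from the same speaker.
        samples.append(TrainingSample(first, rest[:n_second], "matching"))
        # Non-matching: second utterances come from a different speaker.
        other = rng.choice([s for s in speakers if s != speaker])
        samples.append(
            TrainingSample(first, utterances_by_speaker[other][:n_second], "non-matching")
        )
    return samples

# Hypothetical one-dimensional utterance features for two speakers.
data = {"alice": [[0.1], [0.2], [0.3]], "bob": [[0.9], [0.8], [0.7]]}
samples = make_samples(data)
```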


    SPEAKER RECOGNITION USING NEURAL NETWORKS
    18.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160293167A1

    Publication date: 2016-10-06

    Application No.: US15179717

    Filing date: 2016-06-10

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker verification. In one aspect, a method includes accessing a neural network having an input layer that provides inputs to a first hidden layer whose nodes are respectively connected to only a proper subset of the inputs from the input layer. Speech data that corresponds to a particular utterance may be provided as input to the input layer of the neural network. A representation of activations that occur in response to the speech data at a particular layer of the neural network that was configured as a hidden layer during training of the neural network may be generated. A determination of whether the particular utterance was likely spoken by a particular speaker may be made based at least on the generated representation. An indication of whether the particular utterance was likely spoken by the particular speaker may be provided.
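
A first hidden layer whose nodes each see only a proper subset of the inputs can be sketched as a locally connected layer. The contiguous-patch layout, the ReLU activation, and the function name `locally_connected_layer` are assumptions. The abstract only requires that each node connect to a proper subset of the inputs.

```python
def locally_connected_layer(inputs, patch_size, stride, weights, biases):
    """Each node reads one contiguous patch of the inputs, with its own weights."""
    if patch_size >= len(inputs):
        raise ValueError("each node must see a proper subset of the inputs")
    outputs = []
    starts = range(0, len(inputs) - patch_size + 1, stride)
    for node, start in enumerate(starts):
        patch = inputs[start:start + patch_size]
        z = biases[node] + sum(x * w for x, w in zip(patch, weights[node]))
        outputs.append(max(0.0, z))  # ReLU activation (assumed)
    return outputs

inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]            # hypothetical speech features
weights = [[1.0, 1.0, 1.0], [1.0, -1.0, 1.0]]      # one weight row per node
biases = [0.0, 0.0]
hidden = locally_connected_layer(inputs, patch_size=3, stride=3,
                                 weights=weights, biases=biases)
```

Here the first node reads inputs 0 to 2 and the second reads inputs 3 to 5, so neither node is connected to the full input layer.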


    SPEAKER IDENTIFICATION
    19.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160275953A1

    Publication date: 2016-09-22

    Application No.: US15170264

    Filing date: 2016-06-01

    Applicant: Google Inc.

    CPC classification number: G10L17/02 G10L17/005 G10L17/08 G10L17/18 G10L25/51

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, data identifying a media item including speech of a speaker is received. Based on the received data, one or more other media items that include speech of the speaker are identified. One or more search results are generated that each reference a respective media item of the one or more other media items that include speech of the speaker. The one or more search results are provided for display.


    Cluster specific speech model
    20.
    Type: Granted invention patent
    Status: Granted

    Publication No.: US09401143B2

    Publication date: 2016-07-26

    Application No.: US14663610

    Filing date: 2015-03-20

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L15/183 G10L2015/0631

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving data representing acoustic characteristics of a user's voice; selecting a cluster for the data from among a plurality of clusters, where each cluster includes a plurality of vectors, and where each cluster is associated with a speech model trained by a neural network using at least one or more vectors of the plurality of vectors in the respective cluster; and in response to receiving one or more utterances of the user, providing the speech model associated with the cluster for transcribing the one or more utterances.

