Neural Networks For Speaker Verification
    Invention Application

    Publication No.: US20180315430A1

    Publication Date: 2018-11-01

    Application No.: US15966667

    Filing Date: 2018-04-30

    Applicant: Google LLC

    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
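
    The training procedure summarized above can be sketched in code. The following is a minimal, illustrative PyTorch sketch of training an embedding network on samples that pair a first utterance with several second utterances under a matching / non-matching label; the feature shapes, layer sizes, embedding averaging, and scaled-cosine logit are assumptions of the sketch, not details taken from the patent.

```python
# Minimal sketch (not from the patent): feature shapes, layer sizes, and the
# scaled-cosine logit are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpeakerEncoder(nn.Module):
    """Maps a fixed-size utterance feature matrix to a unit-length speaker embedding."""
    def __init__(self, n_frames=80, n_mels=40, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_frames * n_mels, 256),
            nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x):              # x: (batch, n_frames, n_mels)
        return F.normalize(self.net(x), dim=-1)

encoder = SpeakerEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# One batch of training samples: a "first" utterance, three "second" utterances,
# and a label marking the pair as matching (1) or non-matching (0) speakers.
first = torch.randn(8, 80, 40)
second = torch.randn(8, 3, 80, 40)
labels = torch.randint(0, 2, (8,)).float()

e1 = encoder(first)                                        # (8, 128)
e2 = encoder(second.flatten(0, 1)).view(8, 3, -1).mean(1)  # averaged speaker representation
score = (e1 * F.normalize(e2, dim=-1)).sum(-1)             # cosine similarity
loss = F.binary_cross_entropy_with_logits(score * 5.0, labels)  # similarity as a scaled logit
loss.backward()
optimizer.step()
```

    Averaging the second-utterance embeddings mirrors how an enrolled speaker representation might be formed, which keeps training consistent with the enrollment-then-verify use described in the abstract.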

    Neural networks for speaker verification

    Publication No.: US11961525B2

    Publication Date: 2024-04-16

    Application No.: US17444384

    Filing Date: 2021-08-03

    Applicant: Google LLC

    CPC classification number: G10L17/18 G10L17/02 G10L17/04

    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
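
    This later publication carries the same abstract; the enrollment and verification stages it mentions can be sketched as follows. The stand-in embed() function, simple averaging of enrollment embeddings, and the 0.7 cosine threshold are illustrative assumptions only.

```python
# Minimal sketch (not from the patent): the stand-in embed() function, simple
# averaging of enrollment embeddings, and the 0.7 threshold are assumptions.
import numpy as np

def enroll(embed, enrollment_utterances):
    """Average and re-normalize embeddings of a user's enrollment utterances."""
    vecs = np.stack([embed(u) for u in enrollment_utterances])
    profile = vecs.mean(axis=0)
    return profile / np.linalg.norm(profile)

def verify(embed, utterance, speaker_profile, threshold=0.7):
    """Accept the claimed speaker if cosine similarity clears the threshold."""
    emb = embed(utterance)
    emb = emb / np.linalg.norm(emb)
    return float(emb @ speaker_profile) >= threshold

# Usage with a random stand-in embedding function (illustrative only).
rng = np.random.default_rng(0)
embed = lambda utterance: rng.standard_normal(128)
profile = enroll(embed, ["enroll_0.wav", "enroll_1.wav", "enroll_2.wav"])
print(verify(embed, "query.wav", profile))
```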

    PROCESSING AND GENERATING SETS USING RECURRENT NEURAL NETWORKS

    Publication No.: US20220180151A1

    Publication Date: 2022-06-09

    Application No.: US17679625

    Filing Date: 2022-02-24

    Applicant: Google LLC

    Abstract: In one aspect, this specification describes a recurrent neural network system implemented by one or more computers that is configured to process input sets to generate neural network outputs for each input set. The input set can be a collection of multiple inputs for which the recurrent neural network should generate the same neural network output regardless of the order in which the inputs are arranged in the collection. The recurrent neural network system can include a read neural network, a process neural network, and a write neural network. In another aspect, this specification describes a system implemented as computer programs on one or more computers in one or more locations that is configured to train a recurrent neural network that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input.
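
    A rough sketch of such a read / process / write decomposition follows, assuming content-based attention over per-element memories as the order-independent step; the layer sizes, number of processing steps, and the permutation check are illustrative choices, not details from the patent.

```python
# Minimal sketch (not from the patent): layer sizes, number of processing steps,
# and attention-based pooling are illustrative assumptions.
import torch
import torch.nn as nn

class ReadProcessWrite(nn.Module):
    def __init__(self, in_dim=16, hidden=64, out_dim=10, steps=3):
        super().__init__()
        self.read = nn.Linear(in_dim, hidden)        # read network: embed each element
        self.process = nn.LSTMCell(hidden, hidden)   # process network: recurrent steps over the set
        self.write = nn.Linear(2 * hidden, out_dim)  # write network: emit the output
        self.steps = steps

    def forward(self, x):                # x: (batch, set_size, in_dim), in any order
        m = torch.tanh(self.read(x))     # one memory vector per set element
        h = m.new_zeros(x.size(0), m.size(-1))
        c = torch.zeros_like(h)
        for _ in range(self.steps):
            attn = torch.softmax((m @ h.unsqueeze(-1)).squeeze(-1), dim=1)
            read_out = (attn.unsqueeze(-1) * m).sum(dim=1)   # order-independent readout
            h, c = self.process(read_out, (h, c))
        return self.write(torch.cat([h, read_out], dim=-1))

net = ReadProcessWrite()
x = torch.randn(2, 5, 16)
shuffled = x[:, torch.randperm(5)]
print(torch.allclose(net(x), net(shuffled), atol=1e-4))  # True: output ignores element order
```

    Because each processing step reads the memories only through an attention-weighted sum, shuffling the set elements leaves the output unchanged up to floating-point error, which the final check demonstrates.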

    Complex linear projection for acoustic modeling

    Publication No.: US10140980B2

    Publication Date: 2018-11-27

    Application No.: US15386979

    Filing Date: 2016-12-21

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
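
    A minimal sketch of this front end is shown below, assuming an STFT, a learned complex-valued projection matrix, and a log-magnitude nonlinearity ahead of a small stand-in acoustic model; the frame parameters, projection width, and downstream layers are illustrative assumptions, not details from the patent.

```python
# Minimal sketch (not from the patent): STFT parameters, projection width, and
# the stand-in acoustic model are illustrative assumptions.
import torch
import torch.nn as nn

class ComplexLinearProjection(nn.Module):
    """Projects complex STFT frames with a learned complex-valued matrix."""
    def __init__(self, n_freq=257, n_filters=80):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(n_filters, n_freq, dtype=torch.cfloat))

    def forward(self, spec):               # spec: (batch, frames, n_freq), complex-valued
        proj = spec @ self.weight.t()      # complex linear projection
        return torch.log1p(proj.abs())     # compress magnitudes before the acoustic model

audio = torch.randn(4, 16000)              # ~1 s of 16 kHz audio (random stand-in)
spec = torch.stft(audio, n_fft=512, hop_length=160,
                  window=torch.hann_window(512), return_complex=True)
spec = spec.transpose(1, 2)                # (batch, frames, n_freq = 257)

clp = ComplexLinearProjection()
features = clp(spec)                       # (batch, frames, 80) real-valued features
acoustic_model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 42))
logits = acoustic_model(features)          # per-frame acoustic scores for a decoder
```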
