SEGMENT-BASED SPEAKER VERIFICATION USING DYNAMICALLY GENERATED PHRASES

    Publication number: US20180308492A1

    Publication date: 2018-10-25

    Application number: US16017690

    Filing date: 2018-06-25

    Applicant: Google LLC

    CPC classification number: G10L17/24 G10L15/02 G10L17/04 G10L2015/025

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
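
The selection step in this abstract (identify subwords, then obtain a candidate phrase containing at least some of them) could be sketched as follows. This is only an illustration under stated assumptions: the candidate table, the subword segmentation, and the "largest overlap wins" rule are hypothetical and are not taken from the patent text.

```python
from typing import Dict, List, Optional, Set

# Hypothetical candidate phrases, each mapped to the subwords it contains.
CANDIDATES: Dict[str, List[str]] = {
    "peanut butter": ["pea", "nut", "but", "ter"],
    "hammer time": ["ham", "mer", "time"],
    "butter hammer": ["but", "ter", "ham", "mer"],
}


def pick_verification_phrase(
    wanted: Set[str], candidates: Dict[str, List[str]]
) -> Optional[str]:
    """Return the candidate covering the most wanted subwords, or None.

    Assumption: "includes at least some of the identified subwords" is
    approximated here by maximizing the overlap count.
    """
    best_phrase: Optional[str] = None
    best_overlap = 0
    for phrase, subwords in candidates.items():
        overlap = len(wanted & set(subwords))
        if overlap > best_overlap:
            best_phrase, best_overlap = phrase, overlap
    return best_phrase
```

Calling `pick_verification_phrase({"but", "ham", "mer"}, CANDIDATES)` selects the phrase that covers all three requested subwords; a request whose subwords appear in no candidate yields `None`.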

    Generating coded data representations using neural networks and vector quantizers

    Publication number: US12198710B2

    Publication date: 2025-01-14

    Application number: US18400992

    Filing date: 2023-12-29

    Applicant: Google LLC

    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. According to one aspect, there is provided a method comprising: receiving a new input; processing the new input using an encoder neural network to generate a feature vector representing the new input; and generating a coded representation of the feature vector using a sequence of vector quantizers that are each associated with a respective codebook of code vectors, wherein the coded representation of the feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector.
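
One common way to realize a sequence of vector quantizers, each with its own codebook, is residual quantization: each stage quantizes what the previous stages left over. The abstract does not state that the quantizers are residual, so the sketch below is an assumption, and the tiny hand-written codebooks are hypothetical stand-ins for learned ones.

```python
from typing import List, Sequence

Codebook = Sequence[Sequence[float]]


def nearest_index(vec: Sequence[float], codebook: Codebook) -> int:
    """Index of the code vector closest to vec (squared Euclidean distance)."""
    def sq_dist(code: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def encode(feature: Sequence[float], codebooks: Sequence[Codebook]) -> List[int]:
    """Pick one code index per quantizer; each stage quantizes the residual."""
    residual = list(feature)
    indices: List[int] = []
    for codebook in codebooks:
        idx = nearest_index(residual, codebook)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Sum the selected code vectors to get the quantized feature vector."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical two-stage codebooks: a coarse stage and a fine stage.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
```

Here the coded representation is the list of indices returned by `encode`, one per quantizer, and `decode` reconstructs the quantized feature vector those indices define.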

    Automated mining of real-world audio training data

    Publication number: US12106748B2

    Publication date: 2024-10-01

    Application number: US17769624

    Filing date: 2019-11-18

    Applicant: Google LLC

    Inventor: Dominik Roblek

    Abstract: Methods, systems, and apparatus, for generating labeled training examples for machine learning. In one aspect, a method includes receiving sets of audio recordings by a user device. For each set of audio recordings, each audio recording in the set is recorded over a respective separate microphone in the user device during a particular time interval, and each particular time interval is different for each set of audio recordings. For each set of audio recordings, a detector determines whether an audio recording in the set of audio recordings includes a particular audio feature, and whether another one of the audio recordings does not include the particular audio feature. For each set of audio recordings determined to include an audio recording that includes the particular audio feature and to include another audio recording that does not include the particular audio feature, a labeled training example is generated.
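
The mining loop described in this abstract might look roughly like the sketch below. The abstract does not say what the detector computes or what a labeled example contains, so `clipping_detector`, the `LabeledExample` fields, and the choice to keep the first positive and first negative recording are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Recording = Sequence[float]


@dataclass
class LabeledExample:
    with_feature: Recording     # recording in which the detector fired
    without_feature: Recording  # same time interval, detector did not fire


def clipping_detector(recording: Recording, threshold: float = 0.99) -> bool:
    """Hypothetical detector: fires when the peak magnitude reaches threshold."""
    return any(abs(sample) >= threshold for sample in recording)


def mine_examples(
    recording_sets: Sequence[Sequence[Recording]],
    detector: Callable[[Recording], bool],
) -> List[LabeledExample]:
    """Keep only sets where one microphone shows the feature and another does not."""
    examples: List[LabeledExample] = []
    for recordings in recording_sets:  # one set per time interval
        positives = [r for r in recordings if detector(r)]
        negatives = [r for r in recordings if not detector(r)]
        if positives and negatives:
            examples.append(LabeledExample(positives[0], negatives[0]))
    return examples
```

Sets in which every microphone agrees (all recordings show the feature, or none do) produce no example, which matches the condition stated in the abstract.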

    Machine Learning Based Enhancement of Audio for a Voice Call

    Publication number: US20240153514A1

    Publication date: 2024-05-09

    Application number: US18548949

    Filing date: 2021-03-05

    Applicant: Google LLC

    CPC classification number: G10L19/06 G10L19/167 G10L25/30 G10L25/69

    Abstract: Apparatus and methods related to enhancement of audio content are provided. An example method includes receiving, by a computing device and via a communications network interface, a compressed audio data frame, wherein the compressed audio data frame is received after transmission over a communications network. The method further includes decompressing the compressed audio data frame to extract an audio waveform. The method also includes predicting, by applying a neural network to the audio waveform, an enhanced version of the audio waveform, wherein the neural network has been trained on (i) a ground truth sample comprising unencoded audio waveforms prior to compression by an audio encoder, and (ii) a training dataset comprising decoded audio waveforms after compression of the unencoded audio waveforms by the audio encoder. The method additionally includes providing, by an audio output component of the computing device, the enhanced version of the audio waveform.
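
The training setup in this abstract pairs each decoded (post-codec) waveform with its unencoded original as the ground truth. A minimal sketch of building such pairs is shown below; the uniform quantizer is a hypothetical stand-in for a real audio encoder and decoder, and no neural network is included.

```python
from typing import List, Sequence, Tuple

Waveform = List[float]


def toy_codec_roundtrip(waveform: Sequence[float], levels: int = 8) -> Waveform:
    """Hypothetical lossy codec: uniform quantization of samples in [-1, 1]."""
    step = 2.0 / levels
    return [round(sample / step) * step for sample in waveform]


def build_training_pairs(
    clean_waveforms: Sequence[Sequence[float]],
) -> List[Tuple[Waveform, Waveform]]:
    """Return (model input, target) pairs: degraded waveform and clean original."""
    return [(toy_codec_roundtrip(w), list(w)) for w in clean_waveforms]
```

An enhancement model trained on such pairs would learn to map the degraded input back toward the clean target; at inference time it would see only the decoded waveform.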

    COMPRESSING AUDIO WAVEFORMS USING NEURAL NETWORKS AND VECTOR QUANTIZERS

    Publication number: US20230186927A1

    Publication date: 2023-06-15

    Application number: US18106094

    Filing date: 2023-02-06

    Applicant: Google LLC

    Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.
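
The final step, compressing the per-frame coded representations, is not specified further in the abstract. As a simple stand-in, the sketch below packs the code indices of every frame into a fixed-width bitstream; a real system might use entropy coding instead. The function names and the fixed `bits_per_code` parameter are hypothetical.

```python
from typing import List, Sequence


def pack_codes(frames: Sequence[Sequence[int]], bits_per_code: int) -> bytes:
    """Pack every code index into bits_per_code bits, zero-padded to whole bytes."""
    acc = 0
    nbits = 0
    for frame in frames:
        for code in frame:
            if not 0 <= code < (1 << bits_per_code):
                raise ValueError(f"code {code} does not fit in {bits_per_code} bits")
            acc = (acc << bits_per_code) | code
            nbits += bits_per_code
    pad = (-nbits) % 8  # bits needed to reach a byte boundary
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_codes(
    data: bytes, num_frames: int, codes_per_frame: int, bits_per_code: int
) -> List[List[int]]:
    """Inverse of pack_codes, given the frame layout."""
    count = num_frames * codes_per_frame
    total = count * bits_per_code
    acc = int.from_bytes(data, "big") >> (len(data) * 8 - total)  # drop padding
    mask = (1 << bits_per_code) - 1
    flat = [(acc >> (total - (i + 1) * bits_per_code)) & mask for i in range(count)]
    return [flat[i * codes_per_frame:(i + 1) * codes_per_frame]
            for i in range(num_frames)]
```

With two quantizers per frame and 2-bit indices, three frames occupy 12 bits and therefore two bytes after padding.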
