Neural networks for speaker verification

    Publication No.: US09978374B2

    Publication date: 2018-05-22

    Application No.: US14846187

    Filing date: 2015-09-04

    Applicant: Google LLC

    CPC classification number: G10L17/18 G10L17/02 G10L17/04

    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
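The enrollment and verification steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how speaker representations are compared, so the averaging of enrollment representations, the cosine similarity score, and the 0.7 threshold below are all assumptions made for illustration. The names `TrainingSample`, `average`, `cosine_similarity`, and `verify` are hypothetical, and the neural network itself is replaced by precomputed vectors.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence

Vector = Sequence[float]


@dataclass
class TrainingSample:
    """One labeled sample as described in the abstract (field names are hypothetical)."""
    first_utterance: Vector          # data characterizing a first utterance
    second_utterances: List[Vector]  # data characterizing one or more second utterances
    matching: bool                   # True for a matching speakers sample


def average(vectors: List[Vector]) -> List[float]:
    """Element-wise mean of equally sized vectors (assumed enrollment strategy)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors (assumed comparison metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(utterance_repr: Vector, enrolled_reprs: List[Vector],
           threshold: float = 0.7) -> bool:
    """Accept the speaker if the utterance is close enough to the enrolled model."""
    speaker_model = average(enrolled_reprs)
    return cosine_similarity(utterance_repr, speaker_model) >= threshold


# Toy 2-dimensional representations standing in for neural network outputs.
enrolled = [[1.0, 0.0], [0.9, 0.1]]
sample = TrainingSample([1.0, 0.0], enrolled, matching=True)
accepted = verify(sample.first_utterance, sample.second_utterances)
rejected = verify([0.0, 1.0], enrolled)
```

In this sketch a matching sample pairs an utterance with enrollment utterances from the same speaker, which mirrors the labeled training samples the abstract describes.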

TEXT INDEPENDENT SPEAKER RECOGNITION

    Publication No.: US20250131916A1

    Publication date: 2025-04-24

    Application No.: US18965481

    Filing date: 2024-12-02

    Applicant: GOOGLE LLC

    Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.
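Two of the ideas in this abstract, updating a speaker embedding from previous utterances and verifying with output from both a text independent (TI) and a text dependent (TD) model, can be sketched as follows. The abstract does not state how the embedding is updated or how the two outputs are combined, so the running mean and the weighted sum are assumptions, as are the function names, the weight, and the threshold.

```python
from typing import List, Sequence, Tuple


def update_embedding(current: Sequence[float], count: int,
                     new: Sequence[float]) -> Tuple[List[float], int]:
    """Fold one new utterance embedding into a running mean (assumed update rule).

    `count` is the number of utterances already averaged into `current`.
    """
    updated = [(c * count + n) / (count + 1) for c, n in zip(current, new)]
    return updated, count + 1


def combine_scores(ti_score: float, td_score: float,
                   ti_weight: float = 0.5) -> float:
    """Weighted sum of TI and TD model scores (assumed fusion rule)."""
    return ti_weight * ti_score + (1.0 - ti_weight) * td_score


def is_verified(ti_score: float, td_score: float,
                threshold: float = 0.65) -> bool:
    """Accept the speaker when the combined score reaches the threshold."""
    return combine_scores(ti_score, td_score) >= threshold


# One earlier utterance contributed [1.0, 3.0]; a new one yields [3.0, 5.0].
embedding, utterance_count = update_embedding([1.0, 3.0], 1, [3.0, 5.0])
combined = combine_scores(0.8, 0.6)
```

A weighted sum lets a strong score from one model compensate for a weaker score from the other, which is one plausible reason to use both outputs together.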

    On-Device Multilingual Speech Recognition
    Document type: Invention publication

    Publication No.: US20240331700A1

    Publication date: 2024-10-03

    Application No.: US18191711

    Filing date: 2023-03-28

    Applicant: Google LLC

    CPC classification number: G10L15/26 G10L15/32

    Abstract: A method includes receiving a sequence of input audio frames and processing each corresponding input audio frame to determine a language ID event that indicates a predicted language. The method also includes obtaining speech recognition events each including a respective speech recognition result determined by a first language pack. Based on determining that the utterance includes a language switch from the first language to a second language, the method also includes loading a second language pack onto the client device and rewinding the input audio data buffered by an audio buffer to a time of the corresponding input audio frame associated with the language ID event that first indicated the second language as the predicted language. The method also includes emitting a first transcription and processing, using the second language pack loaded onto the client device, the rewound buffered audio data to generate a second transcription.
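The rewind step in this abstract can be sketched in a few lines. The abstract does not say how a language switch is confirmed, so the rule used here (the same new language predicted for `min_consecutive` frames in a row) is an assumption, and `detect_switch` and `rewind_buffer` are hypothetical names. Language packs and the recognizer are omitted; the sketch only shows how the buffered frames are divided.

```python
from typing import List, Optional, Sequence, Tuple


def detect_switch(language_ids: Sequence[str], first_language: str,
                  min_consecutive: int = 3) -> Optional[Tuple[int, str]]:
    """Return (frame index, language) for the first confirmed language switch.

    The index is the frame whose language ID event FIRST indicated the new
    language within the confirming run, i.e. the rewind target.
    """
    run_start: Optional[int] = None
    run_lang: Optional[str] = None
    run_len = 0
    for i, lang in enumerate(language_ids):
        if lang == first_language:
            run_start, run_lang, run_len = None, None, 0   # back to first language
        elif lang == run_lang:
            run_len += 1                                    # run continues
        else:
            run_start, run_lang, run_len = i, lang, 1       # new candidate language
        if run_len >= min_consecutive:
            return run_start, run_lang
    return None


def rewind_buffer(frames: List[str], language_ids: Sequence[str],
                  first_language: str) -> Tuple[List[str], List[str], Optional[str]]:
    """Split buffered frames into the first-language part and the rewound part."""
    switch = detect_switch(language_ids, first_language)
    if switch is None:
        return frames, [], None
    index, second_language = switch
    return frames[:index], frames[index:], second_language


frames = [f"frame{i}" for i in range(8)]
ids = ["en", "en", "fr", "en", "fr", "fr", "fr", "fr"]
first_part, rewound_part, second_language = rewind_buffer(frames, ids, "en")
```

Here the isolated "fr" prediction at frame 2 is ignored, and the buffer is rewound to frame 4, where the confirmed run began. The first part would feed the first transcription and the rewound part would be processed with the second language pack.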
