SELF-SUPERVISED SPEECH REPRESENTATIONS FOR FAKE AUDIO DETECTION

    Publication No.: US20230386506A1

    Publication date: 2023-11-30

    Application No.: US18446623

    Filing date: 2023-08-09

    Applicant: Google LLC

    CPC classification number: G10L25/69 G10L15/02 G10L15/063 G10L15/22

    Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
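The flow described in this abstract (split the audio into portions, extract one feature vector per portion, score with a shallow discriminator, compare against a threshold) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: `toy_feature_extractor` is a hypothetical stand-in for the trained self-supervised model, the shallow discriminator is assumed to be logistic regression over mean-pooled features, and "satisfies the threshold" is assumed to mean "score is greater than or equal to the threshold".

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def frame_audio(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Split the audio into consecutive, non-overlapping portions."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def toy_feature_extractor(frame: Sequence[float]) -> Vector:
    """HYPOTHETICAL stand-in for the trained self-supervised model.

    Returns [mean, energy] for one portion of audio. A real system would
    use learned representations here instead.
    """
    n = len(frame)
    mean = sum(frame) / n
    energy = sum(x * x for x in frame) / n
    return [mean, energy]


def shallow_discriminator(features: List[Vector], weights: Vector, bias: float) -> float:
    """ASSUMED shallow model: logistic regression on mean-pooled features.

    Returns a score in (0, 1); higher means more likely synthetic.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    pooled = [sum(v[d] for v in features) / len(features) for d in range(len(weights))]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def is_synthetic(
    samples: Sequence[float],
    extractor: Callable[[Sequence[float]], Vector],
    weights: Vector,
    bias: float,
    threshold: float = 0.5,
    frame_len: int = 4,
) -> bool:
    """Run the full pipeline and apply the detection threshold."""
    features = [extractor(frame) for frame in frame_audio(samples, frame_len)]
    score = shallow_discriminator(features, weights, bias)
    # ASSUMPTION: the threshold is "satisfied" when score >= threshold.
    return score >= threshold
```

In this sketch the feature extractor is passed in as a callable, so a real self-supervised encoder could replace the toy one without changing the scoring or thresholding steps.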

    Self-supervised speech representations for fake audio detection

    Publication No.: US11756572B2

    Publication date: 2023-09-12

    Application No.: US17110278

    Filing date: 2020-12-02

    Applicant: Google LLC

    CPC classification number: G10L25/69 G10L15/02 G10L15/063 G10L15/22

    Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.

    Computing Systems with Modularized Infrastructure for Training Generative Adversarial Networks

    Publication No.: US20190138847A1

    Publication date: 2019-05-09

    Application No.: US16159093

    Filing date: 2018-10-12

    Applicant: Google LLC

    Abstract: Example aspects of the present disclosure are directed to computing systems that provide a modularized infrastructure for training Generative Adversarial Networks (GANs). For example, the modularized infrastructure can include a lightweight library designed to make it easy to train and evaluate GANs. A user can interact with and/or build upon the modularized infrastructure to easily train GANs. According to one aspect of the present disclosure, the modularized infrastructure can include a number of distinct sets of code that handle various stages of and operations within the GAN training process. The sets of code can be modular. That is, the sets of code can be designed to exist independently yet be easily and intuitively combinable. Thus, the user can employ some or all of the sets of code or can replace a certain set of code with a set of custom-code while still generating a workable combination.
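The idea of independent yet combinable sets of code can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the infrastructure the disclosure describes: the names `GANModel`, `nonsaturating_loss`, `least_squares_loss`, and `evaluate_losses` are hypothetical, and the "networks" are plain callables rather than trained models. The point is that the model definition, the loss, and the evaluation step are separate pieces, so one loss can be replaced with another without touching the rest.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Batch = List[float]
# A loss module maps (discriminator outputs on real data, on fake data)
# to a (generator_loss, discriminator_loss) pair.
LossFn = Callable[[Batch, Batch], Tuple[float, float]]


@dataclass
class GANModel:
    """Module 1 (hypothetical): bundles the two networks as plain callables."""
    generator: Callable[[Batch], Batch]      # noise -> fake samples
    discriminator: Callable[[Batch], Batch]  # samples -> probability of "real"


def _mean(values: Batch) -> float:
    return sum(values) / len(values)


def nonsaturating_loss(real_probs: Batch, fake_probs: Batch) -> Tuple[float, float]:
    """Module 2a: log-loss discriminator with a non-saturating generator loss."""
    eps = 1e-12  # avoids log(0)
    d_loss = -_mean([math.log(p + eps) for p in real_probs]) \
             - _mean([math.log(1.0 - p + eps) for p in fake_probs])
    g_loss = -_mean([math.log(p + eps) for p in fake_probs])
    return g_loss, d_loss


def least_squares_loss(real_probs: Batch, fake_probs: Batch) -> Tuple[float, float]:
    """Module 2b: a drop-in replacement loss with the same interface."""
    d_loss = 0.5 * (_mean([(p - 1.0) ** 2 for p in real_probs])
                    + _mean([p ** 2 for p in fake_probs]))
    g_loss = 0.5 * _mean([(p - 1.0) ** 2 for p in fake_probs])
    return g_loss, d_loss


def evaluate_losses(model: GANModel, loss_fn: LossFn,
                    real: Batch, noise: Batch) -> Tuple[float, float]:
    """Module 3: combines a model with any loss module."""
    fake = model.generator(noise)
    return loss_fn(model.discriminator(real), model.discriminator(fake))


# Usage: the same model works with either loss module.
model = GANModel(
    generator=lambda z: [2.0 * x for x in z],
    discriminator=lambda xs: [1.0 / (1.0 + math.exp(-x)) for x in xs],
)
g_a, d_a = evaluate_losses(model, nonsaturating_loss, real=[0.0], noise=[0.0])
g_b, d_b = evaluate_losses(model, least_squares_loss, real=[0.0], noise=[0.0])
```

Because both loss functions share one signature, a custom loss written by a user would slot into `evaluate_losses` the same way, which is the kind of replaceability the abstract describes.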
