-
Publication No.: US20230386506A1
Publication Date: 2023-11-30
Application No.: US18446623
Application Date: 2023-08-09
Applicant: Google LLC
Inventor: Joel Shor, Alanna Foster Slocum
CPC classification number: G10L25/69, G10L15/02, G10L15/063, G10L15/22
Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
-
Publication No.: US11756572B2
Publication Date: 2023-09-12
Application No.: US17110278
Application Date: 2020-12-02
Applicant: Google LLC
Inventor: Joel Shor, Alanna Foster Slocum
CPC classification number: G10L25/69, G10L15/02, G10L15/063, G10L15/22
Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
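Illustrative sketch (not taken from the patent): the pipeline in the abstract above can be outlined as framing the audio, embedding each portion, scoring with a shallow model, and comparing the score to a threshold. Everything named here is an assumption made for illustration: `toy_embed` is a hand-written placeholder standing in for the trained self-supervised model, the shallow discriminator is assumed to be logistic regression over mean-pooled feature vectors, "satisfies the threshold" is assumed to mean `score >= threshold`, and the frame length and default threshold are arbitrary.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def frame_audio(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Split audio into non-overlapping portions, dropping a short tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def toy_embed(frame: Sequence[float]) -> Vector:
    """Placeholder for the trained self-supervised model (assumption).

    Returns [RMS energy, zero-crossing rate] for one portion of audio.
    """
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / max(len(frame) - 1, 1)]


def shallow_score(features: List[Vector], weights: Vector, bias: float) -> float:
    """Assumed shallow discriminator: logistic regression on mean-pooled features."""
    pooled = [sum(f[d] for f in features) / len(features)
              for d in range(len(weights))]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def is_synthetic(samples: Sequence[float],
                 embed: Callable[[Sequence[float]], Vector],
                 weights: Vector,
                 bias: float,
                 threshold: float = 0.5,
                 frame_len: int = 160) -> bool:
    """Return True when the score meets the (assumed) detection threshold."""
    features = [embed(frame) for frame in frame_audio(samples, frame_len)]
    if not features:
        raise ValueError("audio is shorter than one frame")
    return shallow_score(features, weights, bias) >= threshold
```

In practice the weights and bias would come from training the discriminator on labeled real and synthetic speech; here they are simply passed in by the caller.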
-
13.
Publication No.: US11710300B2
Publication Date: 2023-07-25
Application No.: US16159093
Application Date: 2018-10-12
Applicant: Google LLC
Inventor: Joel Shor, Sergio Guadarrama Cotado
IPC: G06V10/82, G06F9/448, G06N3/08, G06N3/084, G06F18/40, G06F18/2413, G06N3/045, G06N3/047, G06N3/082
CPC classification number: G06V10/82, G06F9/448, G06F18/2414, G06F18/40, G06N3/045, G06N3/047, G06N3/08, G06N3/084, G06N3/082
Abstract: Computing systems that provide a modularized infrastructure for training Generative Adversarial Networks (GANs) are provided herein. For example, the modularized infrastructure can include a lightweight library designed to make it easy to train and evaluate GANs. A user can interact with and/or build upon the modularized infrastructure to easily train GANs. The modularized infrastructure can include a number of distinct sets of code that handle various stages of and operations within the GAN training process. The sets of code can be modular. That is, the sets of code can be designed to exist independently yet be easily and intuitively combinable. Thus, the user can employ some or all of the sets of code or can replace a certain set of code with a set of custom-code while still generating a workable combination.
-
14.
Publication No.: US20190138847A1
Publication Date: 2019-05-09
Application No.: US16159093
Application Date: 2018-10-12
Applicant: Google LLC
Inventor: Joel Shor, Sergio Guadarrama Cotado
Abstract: Example aspects of the present disclosure are directed to computing systems that provide a modularized infrastructure for training Generative Adversarial Networks (GANs). For example, the modularized infrastructure can include a lightweight library designed to make it easy to train and evaluate GANs. A user can interact with and/or build upon the modularized infrastructure to easily train GANs. According to one aspect of the present disclosure, the modularized infrastructure can include a number of distinct sets of code that handle various stages of and operations within the GAN training process. The sets of code can be modular. That is, the sets of code can be designed to exist independently yet be easily and intuitively combinable. Thus, the user can employ some or all of the sets of code or can replace a certain set of code with a set of custom-code while still generating a workable combination.
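Illustrative sketch (not taken from the patent): one way to read "sets of code that exist independently yet are easily combinable" is to keep the networks and the loss in separate, swappable pieces. The names below (`GANModel`, `discriminator_loss`, and the two loss functions) are hypothetical and do not describe the API of any particular library. The example only evaluates a discriminator loss and performs no gradient updates.

```python
import math
from dataclasses import dataclass
from typing import Callable, List

Generator = Callable[[float], float]          # noise -> sample
Discriminator = Callable[[float], float]      # sample -> logit
LossFn = Callable[[List[float], List[float]], float]  # (real, fake logits) -> loss


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def minimax_discriminator_loss(real_logits: List[float],
                               fake_logits: List[float]) -> float:
    """Minimax GAN discriminator loss: -log D(x) - log(1 - D(G(z)))."""
    real = _mean([-math.log(_sigmoid(l)) for l in real_logits])
    fake = _mean([-math.log(1.0 - _sigmoid(l)) for l in fake_logits])
    return real + fake


def wasserstein_discriminator_loss(real_logits: List[float],
                                   fake_logits: List[float]) -> float:
    """Wasserstein critic loss: mean(D(G(z))) - mean(D(x))."""
    return _mean(fake_logits) - _mean(real_logits)


@dataclass
class GANModel:
    """Bundles the two networks; either can be replaced with custom code."""
    generator: Generator
    discriminator: Discriminator


def discriminator_loss(model: GANModel,
                       real_data: List[float],
                       noise: List[float],
                       loss_fn: LossFn) -> float:
    """Combine a model with any loss module that matches the LossFn signature."""
    fake_data = [model.generator(z) for z in noise]
    real_logits = [model.discriminator(x) for x in real_data]
    fake_logits = [model.discriminator(x) for x in fake_data]
    return loss_fn(real_logits, fake_logits)
```

Because `loss_fn` is passed in as an argument, switching from the minimax loss to the Wasserstein loss, or to a custom function with the same signature, requires no change to the model or to `discriminator_loss`.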
-