Self-supervised speech representations for fake audio detection

Invention Grant

US11756572B2 Self-supervised speech representations for fake audio detection (Active)

Patent Title: Self-supervised speech representations for fake audio detection
Application No.: US17110278

Application Date: 2020-12-02
Publication No.: US11756572B2

Publication Date: 2023-09-12
Inventor: Joel Shor, Alanna Foster Slocum
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Honigman LLP
Agent: Brett A. Krueger; Grant J. Griffith
Main IPC: G10L25/69
IPC: G10L25/69; G10L15/02; G10L15/06; G10L15/22

Abstract:

A method for determining synthetic speech includes receiving audio data characterizing speech obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors, each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
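The steps in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: `extract_feature_vectors` is a hypothetical stand-in for the trained self-supervised model (it computes simple per-frame statistics rather than learned representations), the shallow discriminator is assumed here to be a logistic regression over mean-pooled feature vectors, the weights are made-up untrained values, and "satisfies the threshold" is assumed to mean "score is greater than or equal to the threshold".

```python
import math
from typing import List, Sequence

Vector = List[float]


def extract_feature_vectors(audio: Sequence[float], frame_len: int = 4) -> List[Vector]:
    """Hypothetical placeholder for the trained self-supervised model.

    Each non-overlapping frame (a "portion of the audio data") is summarized
    by its mean and mean absolute value. A real system would use a learned
    encoder here instead.
    """
    vectors: List[Vector] = []
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        mean = sum(frame) / frame_len
        mean_abs = sum(abs(x) for x in frame) / frame_len
        vectors.append([mean, mean_abs])
    return vectors


def shallow_discriminator_score(
    vectors: Sequence[Vector], weights: Sequence[float], bias: float
) -> float:
    """Assumed shallow model: logistic regression on mean-pooled vectors.

    Returns a score in (0, 1); higher means more likely synthetic.
    """
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(weights)
    # Average each feature dimension across all frames.
    pooled = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def is_synthetic(score: float, threshold: float = 0.5) -> bool:
    """Assumption: the threshold is satisfied when score >= threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Toy waveform samples and untrained, illustrative parameters.
    audio = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]
    vectors = extract_feature_vectors(audio)
    score = shallow_discriminator_score(vectors, weights=[0.0, 1.0], bias=0.0)
    print(score, is_synthetic(score))
```

Keeping the discriminator small means most of the modeling capacity sits in the feature extractor, so only a handful of parameters (here, `weights` and `bias`) would need to be fit for the detection task.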

Published/Granted Documents

US20220172739A1 Self-Supervised Speech Representations for Fake Audio Detection (publication date: 2022-06-02)

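IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G3/34)
G10L25/48	. specially adapted for particular use
G10L25/69	.. for evaluating synthetic or decoded voice signals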