Using a predictive model to automatically enhance audio having various audio quality issues

Invention Grant

US11514925B2 Using a predictive model to automatically enhance audio having various audio quality issues (In force)

Patent Title: Using a predictive model to automatically enhance audio having various audio quality issues
Application No.: US16863591

Application Date: 2020-04-30
Publication No.: US11514925B2

Publication Date: 2022-11-29
Inventor: Zeyu Jin, Jiaqi Su, Adam Finkelstein
Applicant: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY
Applicant Address: US CA San Jose; US NJ Princeton
Assignee: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY
Current Assignee: Adobe Inc., THE TRUSTEES OF PRINCETON UNIVERSITY
Current Assignee Address: US CA San Jose; US NJ Princeton
Agency: Kilpatrick Townsend & Stockton LLP
Main IPC: G10L21/0364
IPC: G10L21/0364; G10L25/30; G10L25/18; G06N3/08; G06N3/04

Abstract:

Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.
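The combined loss described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the naive unwindowed DFT, the log-magnitude L1 distance used as the spectrogram loss, the negative-log form of the adversarial term, and the weight `adv_weight`. The discriminator is represented only by a score in (0, 1] that it would assign to the predicted audio.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size=8, hop=4):
    """Naive short-time DFT magnitudes (no window), one list of bins per frame."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def spectrogram_loss(predicted, target, eps=1e-6):
    """Mean absolute difference between log-magnitude spectrograms (assumed form)."""
    pred_spec = magnitude_spectrogram(predicted)
    target_spec = magnitude_spectrogram(target)
    diffs = [abs(math.log(p + eps) - math.log(t + eps))
             for pred_frame, target_frame in zip(pred_spec, target_spec)
             for p, t in zip(pred_frame, target_frame)]
    return sum(diffs) / len(diffs)


def adversarial_loss(discriminator_score, eps=1e-6):
    """Generator-side term: small when the discriminator scores the prediction near 1."""
    return -math.log(discriminator_score + eps)


def combined_loss(predicted, target, discriminator_score, adv_weight=0.1):
    """Weighted sum of the two terms; adv_weight is a hypothetical hyperparameter."""
    return (spectrogram_loss(predicted, target)
            + adv_weight * adversarial_loss(discriminator_score))


# Toy tuple of (source audio, target audio): a clean sine and a degraded copy.
target_audio = [math.sin(2 * math.pi * n / 8) for n in range(16)]
source_audio = [x + 0.3 * (-1) ** n for n, x in enumerate(target_audio)]
```

In this sketch a prediction identical to the target gives a spectrogram loss of zero, so the total is driven only by how convincingly the prediction passes the discriminator. During training, the prediction model would be updated to reduce this combined value.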

Public/Granted literature

US20210343305A1 USING A PREDICTIVE MODEL TO AUTOMATICALLY ENHANCE AUDIO HAVING VARIOUS AUDIO QUALITY ISSUES Publication date: 2021-11-04

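IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)
G10L21/02	.Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems, see H04B3/20; echo suppression in hands-free telephones, see H04M9/08)
G10L21/0316	..by changing the amplitude
G10L21/0364	...for improving intelligibility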