DEEP NEURAL NETWORK BASED SPEECH ENHANCEMENT

    Publication number: US20190385630A1

    Publication date: 2019-12-19

    Application number: US16442279

    Application date: 2019-06-14

    Abstract: A computer may segment a noisy audio signal into audio frames and execute a deep neural network (DNN) to estimate an instantaneous function of the clean speech spectrum and the noisy audio spectrum in each audio frame. This instantaneous function may correspond to a ratio of the a-priori signal-to-noise ratio (SNR) to the a-posteriori SNR of the audio frame. The computer may add the estimated instantaneous function to the original noisy audio frame to output an enhanced speech audio frame.
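The steps in this abstract can be illustrated with a minimal sketch. It rests on an interpretation the abstract does not spell out. Under the usual definitions, the a-priori SNR is clean speech power over noise power and the a-posteriori SNR is noisy power over noise power, so their ratio reduces to clean power over noisy power. In the log-power domain, adding the log of that ratio to the noisy frame therefore yields the clean log-power spectrum. The function names, frame length, hop size, and the `estimate_log_ratio` callable (a stand-in for the trained DNN) are all hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    frame_len and hop are illustrative values, not taken from the patent.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])


def log_power_spectrum(frames, eps=1e-12):
    """Return the log-power spectrum of each Hann-windowed frame."""
    window = np.hanning(frames.shape[1])
    spec = np.fft.rfft(frames * window, axis=1)
    return np.log(np.abs(spec) ** 2 + eps)


def enhance_log_spectrum(noisy_log_power, estimate_log_ratio):
    """Add the estimated log ratio to the noisy log-power spectrum.

    estimate_log_ratio is a hypothetical callable standing in for the DNN:
    it maps noisy log-power frames to an estimate of log(clean / noisy power).
    """
    return noisy_log_power + estimate_log_ratio(noisy_log_power)
```

With an oracle estimator that returns the true log ratio (the clean log-power spectrum minus the noisy one), `enhance_log_spectrum` reproduces the clean log-power spectrum, which is the target a trained network would approximate under this reading.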

    Cross-channel enrollment and authentication of voice biometrics

    Publication number: US12266368B2

    Publication date: 2025-04-01

    Application number: US17165180

    Application date: 2021-02-02

    Abstract: Embodiments described herein provide for systems and methods for voice-based cross-channel enrollment and authentication. The systems control for and mitigate variations in audio signals received across any number of communications channels by training and employing a neural network architecture comprising a speaker verification neural network and a bandwidth expansion neural network. The bandwidth expansion neural network is trained on narrowband audio signals to generate estimated wideband audio signals corresponding to the narrowband audio signals. These estimated wideband audio signals may be fed into one or more downstream applications, such as the speaker verification neural network or an embedding extraction neural network. The speaker verification neural network can then compare and score inbound embeddings for a current call against enrolled embeddings, regardless of the channel used to receive the inbound signal or enrollment signal.
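The final comparison step, scoring an inbound embedding against an enrolled one, can be sketched as follows. The abstract does not name a scoring function, so cosine similarity is an assumption here, as are the function names and the threshold value. The embeddings are taken as given, for example produced by an embedding extraction network from the estimated wideband audio.

```python
import numpy as np


def cosine_score(inbound, enrolled):
    """Cosine similarity between two embedding vectors (assumed scoring rule)."""
    inbound = np.asarray(inbound, dtype=float)
    enrolled = np.asarray(enrolled, dtype=float)
    denom = np.linalg.norm(inbound) * np.linalg.norm(enrolled)
    return float(inbound @ enrolled / denom)


def verify(inbound, enrolled, threshold=0.7):
    """Accept the caller when the score reaches a hypothetical threshold."""
    return cosine_score(inbound, enrolled) >= threshold
```

Because cosine similarity ignores vector magnitude, an enrolled embedding and an inbound embedding that point in the same direction score 1.0 even when their norms differ. In practice the threshold would be tuned on held-out data.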

    Deep neural network based speech enhancement

    Publication number: US11756564B2

    Publication date: 2023-09-12

    Application number: US16442279

    Application date: 2019-06-14

    CPC classification number: G10L21/0232 G06N3/048 G10L25/30

    Abstract: A computer may segment a noisy audio signal into audio frames and execute a deep neural network (DNN) to estimate an instantaneous function of the clean speech spectrum and the noisy audio spectrum in each audio frame. This instantaneous function may correspond to a ratio of the a-priori signal-to-noise ratio (SNR) to the a-posteriori SNR of the audio frame. The computer may add the estimated instantaneous function to the original noisy audio frame to output an enhanced speech audio frame.
