Robust spoofing detection system using deep residual neural networks

    Publication number: US11862177B2

    Publication date: 2024-01-02

    Application number: US17155851

    Filing date: 2021-01-22

    CPC classification number: G10L17/18 G10L17/02 G10L17/04 G10L17/08 G10L17/22

    Abstract: Embodiments described herein provide for systems and methods for implementing a neural network architecture for spoof detection in audio signals. The neural network architecture contains layers defining embedding extractors that extract embeddings from input audio signals. Spoofprint embeddings are generated for particular system enrollees to detect attempts to spoof the enrollee's voice. Optionally, voiceprint embeddings are generated for the system enrollees to recognize the enrollee's voice. The voiceprints are extracted using features related to the enrollee's voice. The spoofprints are extracted using features related to how the enrollee speaks and other artifacts. The spoofprints facilitate detection of efforts to fool voice biometrics using synthesized speech (e.g., deepfakes) that spoofs and emulates the enrollee's voice.
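
    To illustrate the enrollment-and-scoring idea in the abstract above, the following minimal Python sketch enrolls a spoofprint by averaging embeddings and scores a test utterance against it with cosine similarity. The embedding extractor, feature dimensions, and data are placeholders for illustration, not the patented architecture.

    import numpy as np

    def extract_embedding(features, weights):
        # Toy stand-in for the embedding-extractor layers: one linear
        # projection followed by L2 normalization.
        emb = features @ weights
        return emb / np.linalg.norm(emb)

    def enroll_spoofprint(enrollment_features, weights):
        # Average the embeddings of the enrollee's enrollment signals into one spoofprint.
        embs = np.stack([extract_embedding(f, weights) for f in enrollment_features])
        spoofprint = embs.mean(axis=0)
        return spoofprint / np.linalg.norm(spoofprint)

    def spoof_score(test_features, spoofprint, weights):
        # Cosine similarity between the test embedding and the enrolled spoofprint;
        # a low score suggests the inbound audio does not match how the enrollee
        # actually speaks (possible synthetic or spoofed speech).
        return float(extract_embedding(test_features, weights) @ spoofprint)

    # Usage with random stand-in data (40-dim features, 128-dim embeddings).
    rng = np.random.default_rng(0)
    W = rng.standard_normal((40, 128))
    enrollment = [rng.standard_normal(40) for _ in range(3)]
    print(spoof_score(rng.standard_normal(40), enroll_spoofprint(enrollment, W), W))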

    Channel-compensated low-level features for speaker recognition

    Publication number: US11657823B2

    Publication date: 2023-05-23

    Application number: US17107496

    Filing date: 2020-11-30

    CPC classification number: G10L17/20 G10L17/02 G10L17/04 G10L17/18 G10L19/028

    Abstract: A system for generating channel-compensated features of a speech signal includes a channel noise simulator that degrades the speech signal, a feed forward convolutional neural network (CNN) that generates channel-compensated features of the degraded speech signal, and a loss function that computes a difference between the channel-compensated features and handcrafted features for the same raw speech signal. Each loss result may be used to update connection weights of the CNN until a predetermined threshold loss is satisfied, and the CNN may be used as a front-end for a deep neural network (DNN) for speaker recognition/verification. The DNN may include convolutional layers, a bottleneck features layer, multiple fully-connected layers and an output layer. The bottleneck features may be used to update connection weights of the convolutional layers, and dropout may be applied to the convolutional layers.
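
    A minimal PyTorch-style sketch of the training loop described above: a stand-in channel noise simulator degrades clean speech, a small CNN maps the degraded waveform to features, and an MSE loss compares them with a placeholder "handcrafted" front end computed on the clean signal. The dimensions, noise model, and feature placeholder are assumptions for illustration only.

    import torch
    from torch import nn

    class CompensatorCNN(nn.Module):
        # Small feed-forward CNN standing in for the channel-compensating front end.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(32, 20, kernel_size=5, padding=2),
            )
        def forward(self, x):
            return self.net(x)

    def simulate_channel(clean):
        # Stand-in channel noise simulator: additive Gaussian noise.
        return clean + 0.05 * torch.randn_like(clean)

    PROJ = torch.randn(20, 1)  # fixed random projection standing in for handcrafted analysis

    def handcrafted_features(clean):
        # Placeholder for handcrafted features (e.g., filterbank-style features).
        return torch.einsum('fc,bct->bft', PROJ, clean)

    model = CompensatorCNN()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(100):
        clean = torch.randn(8, 1, 400)                  # batch of raw speech segments (stand-in data)
        degraded = simulate_channel(clean)
        loss = loss_fn(model(degraded), handcrafted_features(clean))
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        if loss.item() < 1e-3:                          # predetermined threshold loss
            break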

    DEEP NEURAL NETWORK BASED SPEECH ENHANCEMENT

    Publication number: US20190385630A1

    Publication date: 2019-12-19

    Application number: US16442279

    Filing date: 2019-06-14

    Abstract: A computer may segment a noisy audio signal into audio frames and execute a deep neural network (DNN) to estimate an instantaneous function of the clean speech spectrum and the noisy audio spectrum in each audio frame. This instantaneous function may correspond to a ratio of an a-priori signal-to-noise ratio (SNR) and an a-posteriori SNR of the audio frame. The computer may add the estimated instantaneous function to the original noisy audio frame to output an enhanced speech audio frame.
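
    One way to read the last step is in the log-spectral domain, where adding the DNN's estimate to the noisy log spectrum is equivalent to applying a per-bin gain. The sketch below assumes that interpretation; the frame size, FFT length, and gain values are placeholders rather than the patented method.

    import numpy as np

    def enhance_frame(noisy_frame, dnn_log_gain, n_fft=512):
        # Add the DNN-estimated instantaneous function (interpreted here as a
        # per-bin log-domain gain derived from the a-priori/a-posteriori SNR
        # ratio) to the noisy log-magnitude spectrum, keep the noisy phase,
        # and resynthesize the enhanced frame.
        spec = np.fft.rfft(noisy_frame, n=n_fft)
        log_mag = np.log(np.abs(spec) + 1e-12)
        phase = np.angle(spec)
        enhanced_spec = np.exp(log_mag + dnn_log_gain) * np.exp(1j * phase)
        return np.fft.irfft(enhanced_spec, n=n_fft)[: len(noisy_frame)]

    # Usage with stand-in data: one 512-sample frame and one DNN output per rfft bin.
    rng = np.random.default_rng(1)
    frame = rng.standard_normal(512)
    log_gain = -0.1 * rng.random(257)        # mild attenuation, stand-in for the DNN estimate
    enhanced = enhance_frame(frame, log_gain)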

    Channel-compensated low-level features for speaker recognition

    Publication number: US10347256B2

    Publication date: 2019-07-09

    Application number: US15709024

    Filing date: 2017-09-19

    Abstract: A system for generating channel-compensated features of a speech signal includes a channel noise simulator that degrades the speech signal, a feed forward convolutional neural network (CNN) that generates channel-compensated features of the degraded speech signal, and a loss function that computes a difference between the channel-compensated features and handcrafted features for the same raw speech signal. Each loss result may be used to update connection weights of the CNN until a predetermined threshold loss is satisfied, and the CNN may be used as a front-end for a deep neural network (DNN) for speaker recognition/verification. The DNN may include convolutional layers, a bottleneck features layer, multiple fully-connected layers and an output layer. The bottleneck features may be used to update connection weights of the convolutional layers, and dropout may be applied to the convolutional layers.
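
    The second half of this abstract describes the speaker-recognition backend. The PyTorch sketch below shows one plausible shape for it: convolutional layers with dropout, a bottleneck features layer, fully-connected layers, and a speaker-class output layer. Layer sizes and the pooling step are illustrative assumptions.

    import torch
    from torch import nn

    class SpeakerDNN(nn.Module):
        # Illustrative backend: conv layers (with dropout), a bottleneck features
        # layer, fully-connected layers, and an output layer over speaker classes.
        def __init__(self, n_feats=20, n_speakers=100):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv1d(n_feats, 64, kernel_size=5, padding=2), nn.ReLU(), nn.Dropout(0.2),
                nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(), nn.Dropout(0.2),
            )
            self.bottleneck = nn.Linear(64, 32)          # bottleneck features layer
            self.fc = nn.Sequential(nn.Linear(32, 128), nn.ReLU(),
                                    nn.Linear(128, 128), nn.ReLU())
            self.out = nn.Linear(128, n_speakers)

        def forward(self, x):
            h = self.conv(x).mean(dim=2)                 # average-pool over time
            b = self.bottleneck(h)                       # bottleneck features (reusable as an embedding)
            return self.out(self.fc(b))

    # Forward pass on a batch of 4 channel-compensated feature maps (20 features x 400 frames).
    logits = SpeakerDNN()(torch.randn(4, 20, 400))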

    Cross-channel enrollment and authentication of voice biometrics

    Publication number: US12266368B2

    Publication date: 2025-04-01

    Application number: US17165180

    Filing date: 2021-02-02

    Abstract: Embodiments described herein provide for systems and methods for voice-based cross-channel enrollment and authentication. The systems control for and mitigate variations in audio signals received across any number of communications channels by training and employing a neural network architecture comprising a speaker verification neural network and a bandwidth expansion neural network. The bandwidth expansion neural network is trained on narrowband audio signals to generate estimated wideband audio signals corresponding to the narrowband audio signals. These estimated wideband audio signals may be fed into one or more downstream applications, such as the speaker verification neural network or an embedding extraction neural network. The speaker verification neural network can then compare and score inbound embeddings for a current call against enrolled embeddings, regardless of the channel used to receive the inbound signal or the enrollment signal.
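
    A minimal sketch of the cross-channel pipeline described above: a stand-in bandwidth expansion step maps an 8 kHz narrowband call to an estimated 16 kHz signal, which then feeds a placeholder embedding extractor and cosine scoring against the enrolled embedding. The interpolation, projection, and sample rates are assumptions standing in for the trained networks.

    import numpy as np

    def expand_bandwidth(narrowband_8k):
        # Stand-in for the bandwidth expansion network: simple 2x interpolation
        # from 8 kHz to an estimated 16 kHz signal. The trained network would
        # instead reconstruct the missing high-frequency content.
        t = np.arange(len(narrowband_8k))
        t_new = np.arange(0, len(narrowband_8k), 0.5)
        return np.interp(t_new, t, narrowband_8k)

    def extract_embedding(wideband_16k, dim=128):
        # Stand-in embedding extractor: fixed random projection of a magnitude spectrum.
        spec = np.abs(np.fft.rfft(wideband_16k, n=1024))
        proj = np.random.default_rng(0).standard_normal((dim, spec.shape[0]))
        emb = proj @ spec
        return emb / np.linalg.norm(emb)

    def verify(inbound_narrowband, enrolled_embedding):
        # Expand the inbound narrowband call to estimated wideband, then score it
        # against the enrolled embedding with cosine similarity.
        inbound_emb = extract_embedding(expand_bandwidth(inbound_narrowband))
        return float(inbound_emb @ enrolled_embedding)

    rng = np.random.default_rng(2)
    enrolled = extract_embedding(rng.standard_normal(16000))    # enrollment from a wideband channel
    score = verify(rng.standard_normal(8000), enrolled)         # inbound call from a narrowband channel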

    Speaker recognition with quality indicators

    Publication number: US12190905B2

    Publication date: 2025-01-07

    Application number: US17408281

    Filing date: 2021-08-20

    Abstract: Embodiments described herein provide for a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from the expected or ideal enrollment signal in future test-phase calls. These differences can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition system's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring calibrates the speaker recognition system's outputs based on what is actually expected for the enrolled caller and what was actually observed for the current inbound caller.
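
    A minimal sketch of score-level fusion as described above: quality measures are computed as deviations of the inbound call's audio descriptors from what was observed at enrollment, then combined linearly with the speaker-recognition similarity score. The descriptors (SNR, duration) and the fusion weights are illustrative assumptions; in practice the weights would be learned from calibration data.

    import numpy as np

    def quality_measures(enroll_stats, test_stats):
        # Deviations of the inbound call's audio descriptors from the values
        # observed at enrollment (stand-in descriptors: SNR and duration).
        return np.array([
            test_stats["snr_db"] - enroll_stats["snr_db"],
            test_stats["duration_s"] - enroll_stats["duration_s"],
        ])

    def fused_score(similarity, qualities, w_sim, w_q, bias):
        # Linear score-level fusion of the similarity score with the quality
        # measures; weights and bias would be calibrated on held-out data.
        return float(w_sim * similarity + w_q @ qualities + bias)

    enroll = {"snr_db": 25.0, "duration_s": 30.0}    # descriptors modeled at enrollment
    test = {"snr_db": 12.0, "duration_s": 8.0}       # descriptors observed on the inbound call
    score = fused_score(0.62, quality_measures(enroll, test),
                        w_sim=4.0, w_q=np.array([0.05, 0.02]), bias=-1.5)
    accept = score > 0.0                             # hypothetical decision threshold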
