1.
Publication number: US20240355351A1
Publication date: 2024-10-24
Application number: US18302079
Filing date: 2023-04-18
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventor: Moshe Tzur, Elior Hadad
CPC classification number: G10L25/84, G10L15/02, G10L15/04, G10L21/0232, G10L25/18, G10L25/90, G10L25/93
Abstract: The single-channel Speech Features-Based Voice Activity Detection (SFVAD) system is a robust, low-latency system that generates per-frame speech and noise indications and calculates a pair of speech and noise time-frequency masks. The SFVAD system controls an adaptation mechanism for a Beam-Forming system control module and improves the speech quality and noise reduction capabilities of Automatic Speech Recognition applications, such as Virtual Assistance (VA) and Hands-Free (HF) calls, by robustly handling transient noises. The system extracts speech-like patterns from an input audio signal and is invariant to the power level of that signal. Noise calculation is controlled by a pair of speech features-based detectors (voiced and unvoiced). A Cepstral-based pitch detector and a Centrum calculation method are used to prevent contamination of the calculated noise by speech content. The SFVAD system robustly handles instant changes of background noise level and has dramatically lower false detection rates.
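Illustrative sketch (not taken from the patent): the abstract names a Cepstral-based pitch detector but gives no implementation details. The Python fragment below shows a generic textbook cepstral pitch check on a single frame, assuming NumPy is available. The function name `cepstral_pitch`, the 60 to 400 Hz search range, and the 0.1 peak threshold are hypothetical choices made for illustration, and the patent's actual detector may differ substantially. One property worth noting is that scaling the input only shifts the log-spectrum by a constant, which lands in the zeroth cepstral coefficient, so the decision is largely independent of input power level.

```python
import numpy as np

def cepstral_pitch(frame, sample_rate, fmin=60.0, fmax=400.0, threshold=0.1):
    """Generic cepstral pitch check for one frame (illustrative only).

    Returns (is_voiced, f0_hz); f0_hz is None when no pitch peak is found.
    fmin, fmax and threshold are hypothetical values, not from the patent.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to reduce spectral leakage.
    windowed = frame * np.hanning(len(frame))
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    # Search only quefrencies that correspond to plausible pitch periods.
    qmin = int(sample_rate / fmax)
    qmax = int(sample_rate / fmin)
    peak = qmin + int(np.argmax(cepstrum[qmin:qmax + 1]))
    # A pronounced cepstral peak suggests a periodic (voiced) frame.
    if cepstrum[peak] > threshold:
        return True, sample_rate / peak
    return False, None

# Usage: a synthetic pulse train with an 80-sample period at 16 kHz.
sr = 16000
pulses = np.zeros(1024)
pulses[::80] = 1.0
print(cepstral_pitch(pulses, sr))
print(cepstral_pitch(1000.0 * pulses, sr))  # same frame, much higher level
```

In a noise-estimation setting, a frame flagged as voiced by a check of this kind would typically be excluded from the noise update, which is one plausible reading of how such a detector could keep speech content out of the calculated noise.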