-
Publication No.: US20230199421A1
Publication date: 2023-06-22
Application No.: US17893907
Application date: 2022-08-23
Inventor: Rui Qing, Jianqiang WEI
CPC classification number: H04S7/303, H04S7/305, H04S1/007, H04S2420/01, H04S2400/15, H04S2400/11
Abstract: Provided are an audio processing method and apparatus, and a storage medium, which relate to the technical field of artificial intelligence and, in particular, to the field of speech technology. The specific implementation solution is as follows. In response to receiving to-be-processed audio, a target sounding direction corresponding to the to-be-processed audio is determined; direction sense reconstruction is performed on the to-be-processed audio according to a direction sense reconstruction filter corresponding to the target sounding direction to obtain target audio; and the target audio is output.
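Illustrative sketch (not taken from the patent): one common way to give mono audio a sense of direction is to convolve it with a left/right pair of FIR filters selected by the target direction. The filter table `FILTERS`, its toy taps, and the function names below are hypothetical; a real system would more likely use measured head-related impulse responses.

```python
# Hypothetical filter bank: direction label -> (left-ear FIR taps, right-ear FIR taps).
# The toy taps only delay and attenuate the far ear; they are assumptions, not patent data.
FILTERS = {
    "left": ([1.0], [0.0, 0.0, 0.6]),
    "right": ([0.0, 0.0, 0.6], [1.0]),
    "front": ([1.0], [1.0]),
}


def convolve(signal, taps):
    """Direct-form FIR convolution of a sample list with a tap list."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def reconstruct_direction(audio, direction):
    """Filter mono audio with the filter pair chosen for the target direction."""
    left_taps, right_taps = FILTERS[direction]
    left = convolve(audio, left_taps)
    right = convolve(audio, right_taps)
    # Zero-pad the shorter channel so both channels have the same length.
    n = max(len(left), len(right))
    left += [0.0] * (n - len(left))
    right += [0.0] * (n - len(right))
    return left, right


left, right = reconstruct_direction([1.0, 0.5], "left")
```

With the `"left"` entry, the left channel passes through unchanged while the right channel is delayed by two samples and scaled by 0.6.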
-
Publication No.: US20230186943A1
Publication date: 2023-06-15
Application No.: US17893895
Application date: 2022-08-23
Inventor: Guochang ZHANG, Libiao YU, Jianqiang WEI
CPC classification number: G10L25/78, G10L25/93, G10L2025/937
Abstract: Provided are a voice activity detection method and apparatus, an electronic device and a storage medium, which relate to the technical field of voice processing, for example, to the technical field of artificial intelligence and deep learning. The specific implementation solution is described below. A first audio signal is acquired, and a frequency domain feature of the first audio signal is extracted; and the frequency domain feature of the first audio signal is input into a voice activity detection model, and a voice presence detection result output by the voice activity detection model is obtained, where the voice activity detection model is configured to detect whether voice is present in the first audio signal.
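Illustrative sketch (not taken from the patent): the two steps in the abstract, frequency-domain feature extraction followed by a detection model, can be mimicked per frame as below. The function `toy_vad_model` is a hypothetical energy threshold standing in for the trained voice activity detection model, and the frame length and threshold are arbitrary assumptions.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Frequency-domain feature: DFT magnitudes for bins 0..N/2 of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def toy_vad_model(features, threshold=1.0):
    """Hypothetical stand-in for the trained model: mean spectral energy vs. a threshold."""
    energy = sum(m * m for m in features) / len(features)
    return energy > threshold


def detect_voice(signal, frame_len=8):
    """Return one voice-presence decision per non-overlapping frame."""
    results = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        results.append(toy_vad_model(magnitude_spectrum(frame)))
    return results


silence = [0.0] * 8
tone = [math.sin(2 * math.pi * t / 4) for t in range(8)]
decisions = detect_voice(silence + tone)
```

In this toy input the silent frame is rejected and the tonal frame is accepted; a learned model would replace the threshold with a decision trained on labelled speech.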
-
Publication No.: US20230186933A1
Publication date: 2023-06-15
Application No.: US18077307
Application date: 2022-12-08
Inventor: Chunliang WANG, Jianqiang WEI, Guochang ZHANG, Libiao YU
IPC: G10L21/0216, G10L25/18
CPC classification number: G10L21/0216, G10L25/18
Abstract: Provided are a voice noise reduction method, an electronic device, and a non-transitory computer-readable storage medium. The specific implementation scheme includes determining a to-be-denoised voice spectrum of a to-be-denoised voice signal; performing feature extraction on the to-be-denoised voice spectrum to obtain a local voice spectral feature of the to-be-denoised voice spectrum; determining a global voice spectral feature of the to-be-denoised voice spectrum according to the local voice spectral feature of the to-be-denoised voice spectrum; and determining a masking matrix of an original voice signal in the to-be-denoised voice signal according to the local voice spectral feature and the global voice spectral feature, and determining the original voice signal according to the to-be-denoised voice spectrum and the masking matrix.
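Illustrative sketch (not taken from the patent): the abstract describes deriving local and global spectral features, building a masking matrix from both, and applying the mask to the noisy spectrum. The heuristic below is an assumption chosen only to show that data flow: the local feature is the per-bin magnitude, the global feature is the mean magnitude of each frequency bin across frames, and the mask is `local / (local + global)`. The patent's actual feature extractors and mask estimator are not specified here.

```python
def local_features(spectrum):
    """Local feature (assumed): magnitude of every time-frequency bin."""
    return [[abs(v) for v in frame] for frame in spectrum]


def global_features(local):
    """Global feature (assumed): mean magnitude of each frequency bin over all frames."""
    n_frames = len(local)
    n_bins = len(local[0])
    return [sum(frame[k] for frame in local) / n_frames for k in range(n_bins)]


def masking_matrix(local, global_, eps=1e-12):
    """Heuristic mask in [0, 1): bins well above their long-term average are kept more."""
    return [
        [mag / (mag + global_[k] + eps) for k, mag in enumerate(frame)]
        for frame in local
    ]


def denoise(spectrum):
    """Apply the masking matrix element-wise to the noisy spectrum."""
    local = local_features(spectrum)
    mask = masking_matrix(local, global_features(local))
    return [
        [m * v for m, v in zip(mask_row, frame)]
        for mask_row, frame in zip(mask, spectrum)
    ]


noisy = [[3 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]]  # frames x frequency bins
estimate = denoise(noisy)
```

The element-wise product in `denoise` corresponds to the final step of the abstract, where the original voice signal is determined from the noisy spectrum and the masking matrix.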
-
Publication No.: US20230186930A1
Publication date: 2023-06-15
Application No.: US17890638
Application date: 2022-08-18
Inventor: Guangzheng LI, Guochang ZHANG, Libiao YU, Jianqiang WEI
IPC: G10L21/0208, G10L25/30
CPC classification number: G10L21/0208, G10L25/30
Abstract: A speech enhancement method includes steps as follows. Subband decomposition processing is performed on at least two paths of target speech to obtain amplitude spectrums and phase spectrums of the at least two paths of target speech, where the at least two paths of target speech include: target mixed speech and target interference speech; a prediction probability of the target mixed speech including target clean speech in a feature domain is determined according to the amplitude spectrums of the at least two paths of target speech; and subband synthesis processing is performed according to the prediction probability and the amplitude spectrums and the phase spectrums of the at least two paths of target speech to obtain the target clean speech in the target mixed speech.
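Illustrative sketch (not taken from the patent): the pipeline in the abstract has three stages, namely decomposition into amplitude and phase, a per-bin probability that the mixed speech contains clean speech, and synthesis. The code below substitutes a plain DFT for the subband filter bank and a simple amplitude ratio for the learned probability predictor; both substitutions, and all function names, are assumptions. For brevity it reuses only the mixed-speech phase during synthesis.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(v * cmath.exp(2j * math.pi * k * t / n) for k, v in enumerate(spec)) / n).real
            for t in range(n)]


def decompose(x):
    """Stand-in for subband decomposition: amplitude and phase of each DFT bin."""
    spec = dft(x)
    return [abs(v) for v in spec], [cmath.phase(v) for v in spec]


def clean_probability(mixed_amp, interf_amp, eps=1e-12):
    """Assumed heuristic: bins dominated by the interference path get probability near 0."""
    return [max(0.0, 1.0 - i / (m + eps)) for m, i in zip(mixed_amp, interf_amp)]


def enhance(mixed, interference):
    """Weight the mixed amplitudes by the probability, then synthesize with the mixed phase."""
    m_amp, m_phase = decompose(mixed)
    i_amp, _ = decompose(interference)
    prob = clean_probability(m_amp, i_amp)
    spec = [p * a * cmath.exp(1j * ph) for p, a, ph in zip(prob, m_amp, m_phase)]
    return idft(spec)


clean = [math.cos(2 * math.pi * t / 8) for t in range(8)]
interference = [math.cos(2 * math.pi * 3 * t / 8) for t in range(8)]
mixed = [c + i for c, i in zip(clean, interference)]
recovered = enhance(mixed, interference)
```

Because the toy clean and interference tones occupy different bins, `recovered` closely matches `clean`; with overlapping spectra the ratio heuristic would be far cruder than a trained predictor.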
-