Patent search ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Jianqiang WEI"

1.

Invention publication
AUDIO PROCESSING METHOD AND APPARATUS, AND STORAGE MEDIUM (under examination, published)

Publication No.: US20230199421A1

Publication date: 2023-06-22

Application No.: US17893907

Filing date: 2022-08-23

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Rui Qing, Jianqiang WEI

IPC: H04S7/00, H04S1/00

CPC classification number: H04S7/303, H04S7/305, H04S1/007, H04S2420/01, H04S2400/15, H04S2400/11

Abstract: Provided are an audio processing method and apparatus, and a storage medium, which relate to the technical field of artificial intelligence and, in particular, to the speech technical field. The specific implementation solution is as follows. In response to receiving to-be-processed audio, a target sounding direction corresponding to the to-be-processed audio is determined; direction sense reconstruction is performed on the to-be-processed audio according to a direction sense reconstruction filter corresponding to the target sounding direction to obtain target audio; and the target audio is output.
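
The flow in this abstract (pick a target sounding direction, filter the audio with the filter for that direction, output the result) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `DIRECTION_FILTERS` table, its toy FIR coefficients, and the two-channel output are not taken from the patent, which does not describe its filters in the abstract.

```python
# Toy sketch of direction sense reconstruction. All names and coefficients
# are illustrative assumptions, not taken from the patent.

# Hypothetical per-direction FIR filter pairs: (left ear taps, right ear taps).
# A real system would more likely use measured head-related impulse responses.
DIRECTION_FILTERS = {
    "left": ([1.0, 0.0, 0.0], [0.0, 0.0, 0.5]),   # right ear delayed and attenuated
    "front": ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # both ears identical
    "right": ([0.0, 0.0, 0.5], [1.0, 0.0, 0.0]),  # left ear delayed and attenuated
}


def fir_filter(signal, taps):
    """Causal FIR filtering; the output is truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def reconstruct_direction(audio, target_direction):
    """Apply the filter pair for the target direction to mono audio."""
    left_taps, right_taps = DIRECTION_FILTERS[target_direction]
    return fir_filter(audio, left_taps), fir_filter(audio, right_taps)


mono = [1.0, 0.5, 0.25, 0.0]
left, right = reconstruct_direction(mono, "left")
```

With the toy "left" filters, the left channel passes through unchanged while the right channel arrives two samples later at half amplitude, which is the kind of inter-channel difference a listener perceives as direction.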

2.

Invention publication
VOICE ACTIVITY DETECTION METHOD AND APPARATUS, AND STORAGE MEDIUM (under examination, published)

Publication No.: US20230186943A1

Publication date: 2023-06-15

Application No.: US17893895

Filing date: 2022-08-23

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Guochang ZHANG, Libiao YU, Jianqiang WEI

IPC: G10L25/78, G10L25/93

CPC classification number: G10L25/78, G10L25/93, G10L2025/937

Abstract: Provided are a voice activity detection method and apparatus, an electronic device and a storage medium, which relate to the technical field of voice processing, for example, to the technical field of artificial intelligence and deep learning. The specific implementation solution is described below. A first audio signal is acquired, and a frequency domain feature of the first audio signal is extracted; and the frequency domain feature of the first audio signal is input into a voice activity detection model, and a voice presence detection result output by the voice activity detection model is obtained, where the voice activity detection model is configured to detect whether voice is present in the first audio signal.
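
The two stages in this abstract (extract a frequency domain feature, then pass it to a detection model) might be sketched as follows. This is a rough illustration under stated assumptions: the feature is a naive DFT magnitude spectrum, and `toy_vad_model` is a simple energy threshold standing in for the trained voice activity detection model, whose architecture the abstract does not give.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2, used as the frequency domain feature."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def toy_vad_model(feature, threshold=1.0):
    """Hypothetical stand-in for the trained model: thresholds mean spectral energy."""
    energy = sum(m * m for m in feature) / len(feature)
    return energy > threshold


def detect_voice(frame):
    """Feature extraction followed by the (stand-in) detection model."""
    return toy_vad_model(magnitude_spectrum(frame))


tone = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]  # energetic frame
silence = [0.0] * 16                                            # empty frame
tone_detected = detect_voice(tone)
silence_detected = detect_voice(silence)
```

An energy threshold cannot tell speech from other loud sounds; it is used here only so that the feature-then-model pipeline is runnable end to end.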

3.

Invention publication
VOICE NOISE REDUCTION METHOD, ELECTRONIC DEVICE, NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM (under examination, published)

Publication No.: US20230186933A1

Publication date: 2023-06-15

Application No.: US18077307

Filing date: 2022-12-08

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Chunliang WANG, Jianqiang WEI, Guochang ZHANG, Libiao YU

IPC: G10L21/0216, G10L25/18

CPC classification number: G10L21/0216, G10L25/18

Abstract: Provided are a voice noise reduction method, an electronic device, and a non-transitory computer-readable storage medium. The specific implementation scheme includes determining a to-be-denoised voice spectrum of a to-be-denoised voice signal; performing feature extraction on the to-be-denoised voice spectrum to obtain a local voice spectral feature of the to-be-denoised voice spectrum; determining a global voice spectral feature of the to-be-denoised voice spectrum according to the local voice spectral feature of the to-be-denoised voice spectrum; and determining a masking matrix of an original voice signal in the to-be-denoised voice signal according to the local voice spectral feature and the global voice spectral feature, and determining the original voice signal according to the to-be-denoised voice spectrum and the masking matrix.
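
The chain described in this abstract (spectrum, local feature, global feature derived from the local one, masking matrix, masked spectrum) can be traced in a small sketch. The concrete choices are assumptions for illustration only: per-bin power as the local feature, its overall mean as the global feature, and a Wiener-like ratio as the mask. The abstract does not say how the patent computes any of these.

```python
def local_features(spectrum):
    """Local feature (assumed): power of each time-frequency bin."""
    return [[s * s for s in frame] for frame in spectrum]


def global_feature(local):
    """Global feature (assumed): mean of all local features."""
    values = [v for row in local for v in row]
    return sum(values) / len(values)


def masking_matrix(local, global_feat):
    """Mask in [0, 1]: bins well above the global level are mostly kept."""
    return [
        [v / (v + global_feat) if v + global_feat > 0 else 0.0 for v in row]
        for row in local
    ]


def denoise(spectrum):
    """Apply the mask element-wise to the to-be-denoised magnitude spectrum."""
    local = local_features(spectrum)
    mask = masking_matrix(local, global_feature(local))
    return [[m * s for m, s in zip(mrow, srow)] for mrow, srow in zip(mask, spectrum)]


# Two frames by three bins: one strong bin among low-level noise.
noisy = [[0.1, 4.0, 0.1], [0.1, 0.1, 0.1]]
cleaned = denoise(noisy)
```

In this toy input the strong bin keeps most of its magnitude while the low-level bins are pushed close to zero, which is the intended effect of multiplying the spectrum by the masking matrix.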

4.

Invention publication
SPEECH ENHANCEMENT METHOD AND APPARATUS, AND STORAGE MEDIUM (under examination, published)

Publication No.: US20230186930A1

Publication date: 2023-06-15

Application No.: US17890638

Filing date: 2022-08-18

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Guangzheng LI, Guochang ZHANG, Libiao YU, Jianqiang WEI

IPC: G10L21/0208, G10L25/30

CPC classification number: G10L21/0208, G10L25/30

Abstract: A speech enhancement method includes steps as follows. Subband decomposition processing is performed on at least two paths of target speech to obtain amplitude spectrums and phase spectrums of the at least two paths of target speech, where the at least two paths of target speech include: target mixed speech and target interference speech; a prediction probability of the target mixed speech including target clean speech in a feature domain is determined according to the amplitude spectrums of the at least two paths of target speech; and subband synthesis processing is performed according to the prediction probability and the amplitude spectrums and the phase spectrums of the at least two paths of target speech to obtain the target clean speech in the target mixed speech.
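
The three steps in this abstract (decompose both paths into amplitude and phase spectra, predict a per-bin probability of clean speech from the amplitudes, synthesize the output) might look like the sketch below. It rests on loud assumptions: a naive DFT of one frame stands in for the subband decomposition and synthesis, `clean_probability` is a hand-written ratio standing in for the predictor, and synthesis reuses only the mixed path's phase. None of these choices comes from the patent.

```python
import cmath
import math


def analyze(frame):
    """Stand-in for subband decomposition: DFT amplitude and phase spectra."""
    n = len(frame)
    bins = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]
    return [abs(b) for b in bins], [cmath.phase(b) for b in bins]


def clean_probability(mixed_amp, interference_amp):
    """Stand-in predictor: share of each bin not explained by the interference."""
    return [
        max(0.0, 1.0 - i / m) if m > 1e-9 else 0.0
        for m, i in zip(mixed_amp, interference_amp)
    ]


def synthesize(amp, phase):
    """Stand-in for subband synthesis: inverse DFT, keeping the real part."""
    n = len(amp)
    bins = [a * cmath.exp(1j * p) for a, p in zip(amp, phase)]
    return [
        sum(b * cmath.exp(2j * math.pi * k * t / n) for k, b in enumerate(bins)).real / n
        for t in range(n)
    ]


def enhance(mixed, interference):
    """Weight the mixed amplitudes by the probability, then resynthesize."""
    mixed_amp, mixed_phase = analyze(mixed)
    interference_amp, _ = analyze(interference)
    prob = clean_probability(mixed_amp, interference_amp)
    return synthesize([p * a for p, a in zip(prob, mixed_amp)], mixed_phase)


n = 16
speech = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
interference = [0.5 * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
mixed = [s + i for s, i in zip(speech, interference)]
enhanced = enhance(mixed, interference)
```

Because the toy speech and interference occupy different DFT bins, the ratio-based probability removes the interference almost exactly; real signals overlap in frequency, which is presumably why the patent predicts the probability rather than computing it in closed form.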
