SOUND SOURCE LOCALIZATION MODEL TRAINING AND SOUND SOURCE LOCALIZATION METHOD, AND APPARATUS

    Publication No.: US20230077816A1

    Publication Date: 2023-03-16

    Application No.: US17658513

    Filing Date: 2022-04-08

    Abstract: The present disclosure provides a method for training a sound source localization model and a sound source localization method, and relates to the field of artificial intelligence technologies such as voice processing and deep learning. The training method includes: obtaining a sample audio according to an audio signal including a wake-up word; extracting an audio feature of at least one audio frame in the sample audio, and marking a direction label and a mask label of the at least one audio frame; and training a neural network model by using the audio feature, the direction label, and the mask label of the at least one audio frame, to obtain a sound source localization model. The sound source localization method includes: acquiring a to-be-processed audio signal, and extracting an audio feature of each audio frame in the to-be-processed audio signal; inputting the audio feature of each audio frame into a sound source localization model, to obtain sound source direction information outputted by the sound source localization model for each audio frame; determining a wake-up word endpoint frame in the to-be-processed audio signal; and obtaining a sound source direction of the to-be-processed audio signal according to the sound source direction information corresponding to the wake-up word endpoint frame.
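
The final step of the localization method, reading the direction off the wake-up word endpoint frame, can be illustrated with a minimal sketch. The abstract does not say how the model encodes its per-frame direction information, so this example assumes, purely for illustration, one score per equally spaced angular bin. The function name `localize_sound_source`, its parameters, and the bin-to-degree mapping are hypothetical and are not taken from the disclosure.

```python
from typing import Sequence


def localize_sound_source(
    direction_scores: Sequence[Sequence[float]],
    endpoint_frame: int,
    num_directions: int,
) -> float:
    """Return a direction in degrees for the whole audio signal.

    direction_scores[t][d] is the (assumed) score the model assigns to
    angular bin d at frame t. Only the row at the wake-up word endpoint
    frame is used, mirroring the step described in the abstract.
    """
    if not 0 <= endpoint_frame < len(direction_scores):
        raise ValueError("endpoint_frame is outside the audio signal")

    scores = direction_scores[endpoint_frame]
    if len(scores) != num_directions:
        raise ValueError("unexpected number of direction bins")

    # Pick the highest-scoring bin at the endpoint frame.
    best_bin = max(range(num_directions), key=lambda d: scores[d])

    # Assumption: bins are equally spaced over 360 degrees, bin 0 at 0 degrees.
    return best_bin * 360.0 / num_directions


# Three frames, four hypothetical direction bins (0, 90, 180, 270 degrees).
frames = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]
direction = localize_sound_source(frames, endpoint_frame=2, num_directions=4)
```

In this sketch the scores of earlier frames are ignored; how the endpoint frame itself is detected is left outside the example.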
