Publication No.: US11889292B2
Publication date: 2024-01-30
Application No.: US17581527
Filing date: 2022-01-21
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Mohammad Taghizadeh, Gil Keren, Shuo Liu, Bjoern Schuller
IPC: H04S7/00, G10L21/0208, G10L25/18, G10L25/30, G10L21/10
CPC classification number: H04S7/307, G10L21/0208, G10L21/10, G10L25/18, G10L25/30, H04S2400/03
Abstract: The disclosure relates to an audio processing apparatus, comprising: a plurality of audio sensors, each audio sensor configured to receive a respective plurality of audio frames of an audio signal from an audio source, wherein the respective plurality of audio frames defines an audio channel of the audio signal; and a processing circuitry configured to: determine a respective feature set having at least one feature for each audio frame of each of the plurality of audio frames, wherein the plurality of features define a three-dimensional feature array; process the three-dimensional feature array using a neural network, wherein the neural network comprises a self-attention layer configured to process a plurality of two-dimensional sub-arrays of the three-dimensional feature array; and generate an output signal on the basis of the plurality of processed two-dimensional sub-arrays. Moreover, the disclosure relates to a corresponding audio processing method.
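The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and every specific in it is an assumption made for illustration: the log-magnitude spectral features, the frame and hop sizes, slicing the three-dimensional array into one (frames x bins) sub-array per channel, the random untrained projection matrices, and averaging over channels to form the output. The helper names `frame_features`, `self_attention` and `attend_over_subarrays` are hypothetical.

```python
import numpy as np

def frame_features(signals, frame_len=256, hop=128):
    """Turn (channels, samples) audio into a 3D feature array.

    Returns an array of shape (channels, frames, bins) holding
    log-magnitude spectra (an assumed feature choice).
    """
    n_ch, n_samp = signals.shape
    n_frames = 1 + (n_samp - frame_len) // hop
    window = np.hanning(frame_len)
    feats = np.empty((n_ch, n_frames, frame_len // 2 + 1))
    for c in range(n_ch):
        for t in range(n_frames):
            frame = signals[c, t * hop : t * hop + frame_len] * window
            feats[c, t] = np.log1p(np.abs(np.fft.rfft(frame)))
    return feats

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a 2D array x of shape (seq, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v

def attend_over_subarrays(feats, wq, wk, wv):
    """Apply self-attention to each 2D sub-array (here: one per channel)."""
    return np.stack([self_attention(feats[c], wq, wk, wv)
                     for c in range(feats.shape[0])])

# Toy input: 4 sensors (channels), 2048 samples of noise each.
rng = np.random.default_rng(0)
signals = rng.standard_normal((4, 2048))

feats = frame_features(signals)                 # 3D feature array
d = feats.shape[-1]
# Random, untrained projection weights (placeholders for learned parameters).
wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

processed = attend_over_subarrays(feats, wq, wk, wv)
output = processed.mean(axis=0)                 # assumed way to merge channels
```

The abstract does not state which axes the two-dimensional sub-arrays span. Slicing per frame (channels x bins) or per frequency bin (channels x frames) would follow the same pattern with a different axis passed to the loop.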