EFFICIENT FREQUENCY-BASED AUDIO RESAMPLING FOR USING NEURAL NETWORKS

    Publication No.: US20240203443A1

    Publication date: 2024-06-20

    Application No.: US18068187

    Filing date: 2022-12-19

    CPC classification number: G10L21/14 G10L21/0232 G10L25/30

    Abstract: Systems and methods described relate to the enhancement of audio, such as through machine learning-based audio super-resolution processing. An efficient resampling approach can be used for audio data received at a lower frequency than is needed for an audio enhancement neural network. This audio data can be converted into the frequency domain, and once in the frequency domain (e.g., represented using a spectrogram), the lower frequency data can be resampled to provide a frequency-based representation at the target input resolution for the neural network. To keep this resampling process lightweight, the upper frequency bands can be padded with zero-value entries (or other such padding values). The resampled, higher frequency spectrogram can then be provided as input to the neural network, which can perform an enhancement operation such as audio upsampling or super-resolution.
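
    The padding step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the short-time Fourier transform, the 16 kHz source and 48 kHz target rates, the FFT size, and the function names `stft` and `pad_to_target_rate` are all assumptions chosen for illustration. The idea shown is that appending zero-valued bins above the source Nyquist frequency yields a spectrogram with the bin count a network expecting the higher rate would see, while the bin spacing in Hz stays unchanged.

```python
import numpy as np


def stft(signal, n_fft, hop):
    """Minimal STFT (assumed transform): Hann-windowed frames -> complex
    array of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def pad_to_target_rate(spec, src_rate, target_rate):
    """Append zero-valued upper frequency bins so that `spec` has the bin
    count of a spectrogram computed at `target_rate` with the same bin
    spacing in Hz. Hypothetical helper, not taken from the patent."""
    if target_rate < src_rate:
        raise ValueError("target_rate must be >= src_rate")
    src_bins = spec.shape[0]
    n_fft_src = 2 * (src_bins - 1)
    # Same frame duration at the higher rate means a proportionally larger FFT.
    n_fft_tgt = n_fft_src * target_rate // src_rate
    tgt_bins = n_fft_tgt // 2 + 1
    # Pad only along the frequency axis; the time axis is untouched.
    return np.pad(spec, ((0, tgt_bins - src_bins), (0, 0)), mode="constant")


# Assumed example values: one second of a 440 Hz tone sampled at 16 kHz.
SRC_RATE, TARGET_RATE = 16_000, 48_000
t = np.arange(SRC_RATE) / SRC_RATE
audio = np.sin(2 * np.pi * 440.0 * t)

spec = stft(audio, n_fft=512, hop=128)
padded = pad_to_target_rate(spec, SRC_RATE, TARGET_RATE)
print(spec.shape, padded.shape)
```

    Because no interpolation or filtering is performed, the cost is a single array allocation and copy. The padded bins carry no energy, so producing plausible content in the upper bands is left to the enhancement network.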
