-
1.
Publication number: US20240203443A1
Publication date: 2024-06-20
Application number: US18068187
Filing date: 2022-12-19
Applicant: Nvidia Corporation
Inventor: Suchitra Mandar Joshi , Mihir Manohar Nyayate , Nitin Mahesh Gode
IPC: G10L21/14 , G10L21/0232 , G10L25/30
CPC classification number: G10L21/14 , G10L21/0232 , G10L25/30
Abstract: Systems and methods described relate to the enhancement of audio, such as through machine learning-based audio super-resolution processing. An efficient resampling approach can be used for audio data received at a lower frequency than is needed for an audio enhancement neural network. This audio data can be converted into the frequency domain, and once in the frequency domain (e.g., represented using a spectrogram), this lower frequency data can be resampled to provide a frequency-based representation that is at the target input resolution for the neural network. To keep this resampling process lightweight, the upper frequency bands can be padded with zero value entries (or other such padding values). This resampled, higher frequency spectrogram can be provided as input to the neural network, which can perform an enhancement operation such as audio upsampling or super-resolution.
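Illustrative note: the frequency-domain padding described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method claimed in the application. The function names, the Hann window, the 8 kHz and 16 kHz rates, and the FFT sizes are all hypothetical choices made for illustration; the abstract specifies none of them. The sizes are chosen so that both spectrograms share the same bin spacing (8000/256 = 16000/512 = 31.25 Hz), which lets the low-rate bins map directly onto the lower part of the target grid.

```python
import numpy as np


def stft_magnitude(signal, n_fft, hop):
    """Return a (frames, n_fft // 2 + 1) magnitude spectrogram (Hann window)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def pad_to_target_bins(spec, target_bins, pad_value=0.0):
    """Fill the upper frequency bands with padding up to target_bins columns."""
    n_frames, n_bins = spec.shape
    if target_bins < n_bins:
        raise ValueError("target_bins must be >= the number of input bins")
    padded = np.full((n_frames, target_bins), pad_value, dtype=spec.dtype)
    padded[:, :n_bins] = spec  # lower bands keep the measured content
    return padded


# Assumed setup: 8 kHz input, hypothetical network expecting a 16 kHz grid.
low_rate = 8000
t = np.arange(low_rate) / low_rate
audio = np.sin(2 * np.pi * 1000.0 * t)  # 1 kHz test tone, 1 second

low_spec = stft_magnitude(audio, n_fft=256, hop=128)   # 129 bins, 0 to 4 kHz
net_input = pad_to_target_bins(low_spec, 512 // 2 + 1)  # 257 bins, 0 to 8 kHz
```

No time-domain interpolation filter is run here: the only extra work is allocating the larger array and copying the existing bins, which is what keeps this style of resampling lightweight.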
-
2.
Publication number: US20240304203A1
Publication date: 2024-09-12
Application number: US18117717
Filing date: 2023-03-06
Applicant: NVIDIA Corporation
Inventor: Suchitra Mandar Joshi , Mihir Manohar Nyayate , Ambrish Dantrey
IPC: G10L21/0232 , G10L25/51 , G10L25/78
CPC classification number: G10L21/0232 , G10L25/51 , G10L25/78
Abstract: In various examples, noise reduction may be performed based at least on determining that audio data encoding sound includes undesirable sound or lacks desirable sound. A frequency is determined for the audio data based at least on value(s) associated with frequency(ies) within a frequency band, and that frequency is used to determine that the sound encoded in the audio data includes undesirable sound or lacks desirable sound.
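Illustrative note: one possible reading of the frequency-based check in the abstract above is sketched below. The abstract does not say how the frequency is derived or how the decision is made, so everything here is an assumption: the function names are hypothetical, the "frequency" is taken to be the peak-magnitude frequency inside the band, and the 45 to 65 Hz "hum" range used as the undesirable-sound criterion is invented purely for illustration.

```python
import numpy as np


def dominant_frequency_in_band(samples, sample_rate, band_hz):
    """Return the frequency (Hz) with the largest magnitude inside band_hz."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    low, high = band_hz
    in_band = (freqs >= low) & (freqs <= high)
    if not np.any(in_band):
        raise ValueError("no FFT bins fall inside the requested band")
    return float(freqs[in_band][np.argmax(spectrum[in_band])])


def looks_undesirable(samples, sample_rate,
                      band_hz=(20.0, 4000.0), hum_range_hz=(45.0, 65.0)):
    """Hypothetical rule: flag audio whose in-band peak lies in the hum range."""
    peak = dominant_frequency_in_band(samples, sample_rate, band_hz)
    return hum_range_hz[0] <= peak <= hum_range_hz[1]


# Usage with synthetic one-second tones at an assumed 8 kHz sample rate.
rate = 8000
t = np.arange(rate) / rate
hum = np.sin(2 * np.pi * 60.0 * t)    # mains-style hum
tone = np.sin(2 * np.pi * 440.0 * t)  # ordinary tonal content
```

In a pipeline of this shape, the boolean result would gate whether a noise reduction stage runs on the audio; the threshold values would need to be chosen for the actual application.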
-