Abstract:
An audio correction apparatus and an audio correction method are provided. The audio correction method includes: receiving audio data, such as sounds uttered by a user and/or produced by an instrument; detecting onset information by analyzing harmonic components of the received audio data; detecting pitch information of the received audio data based on the detected onset information; comparing the audio data with reference audio data and aligning the two based on the detected onset information and the detected pitch information; and correcting the aligned audio data to match the reference audio data.
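The align-then-correct steps can be illustrated with a minimal sketch. It assumes that onset and pitch detection have already turned each recording into a list of (onset time, pitch) note events, and it uses dynamic time warping for the alignment. The abstract does not name an alignment technique, so DTW, the cost weights, and the function names `dtw_align` and `correct_notes` are all assumptions made for illustration.

```python
from typing import List, Tuple

# (onset time in seconds, pitch as a MIDI note number); an assumed representation
Note = Tuple[float, float]


def dtw_align(performed: List[Note], reference: List[Note],
              onset_weight: float = 1.0,
              pitch_weight: float = 1.0) -> List[Tuple[int, int]]:
    """Align two note sequences with dynamic time warping.

    Returns index pairs (performed_index, reference_index) along the
    lowest-cost warping path. The weights are hypothetical tuning knobs.
    """
    n, m = len(performed), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            p_onset, p_pitch = performed[i - 1]
            r_onset, r_pitch = reference[j - 1]
            local = (onset_weight * abs(p_onset - r_onset)
                     + pitch_weight * abs(p_pitch - r_pitch))
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])

    # Walk back from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def correct_notes(performed: List[Note], reference: List[Note],
                  path: List[Tuple[int, int]]) -> List[Note]:
    """Replace each performed note with the first reference note it aligns to."""
    corrected = list(performed)
    assigned = set()
    for p_idx, r_idx in path:
        if p_idx not in assigned:
            corrected[p_idx] = reference[r_idx]
            assigned.add(p_idx)
    return corrected


if __name__ == "__main__":
    performed = [(0.05, 60.4), (0.55, 61.8), (1.02, 64.3)]  # slightly late and off-pitch
    reference = [(0.0, 60.0), (0.5, 62.0), (1.0, 64.0)]
    path = dtw_align(performed, reference)
    print(correct_notes(performed, reference, path))
```

Snapping each note fully onto the reference is the simplest possible correction. A real system would more likely resynthesize or pitch-shift the audio itself, which this sketch does not attempt.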
Abstract:
An electronic device and a method of recognizing an audio scene are provided. The method of recognizing an audio scene includes: separating, according to a predetermined criterion, an input audio signal into channels; recognizing, for each of the separated channels, at least one audio scene from the input audio signal by using a plurality of neural networks trained to recognize an audio scene; and determining, based on a result of the recognizing of the at least one audio scene, at least one audio scene included in audio content by using a neural network trained to combine audio scene recognition results for the respective channels, wherein the plurality of neural networks includes: a first neural network trained to recognize the audio scene based on a time-frequency shape of an audio signal, a second neural network trained to recognize the audio scene based on a shape of a spectral envelope of the audio signal, and a third neural network trained to recognize the audio scene based on a feature vector extracted from the audio signal.
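The data flow described above (separate into channels, recognize per channel with several recognizers, then combine) can be sketched as follows. This is a structural illustration only. The mid/side split stands in for the unspecified "predetermined criterion", plain callables stand in for the three trained neural networks, and score averaging stands in for the trained combining network. All names here (`split_mid_side`, `average_scores`, `recognize_scene`) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[str, float]                       # scene label -> score
Recognizer = Callable[[Sequence[float]], Scores]


def split_mid_side(left: Sequence[float],
                   right: Sequence[float]) -> Dict[str, List[float]]:
    """Assumed separation criterion: split stereo into mid and side channels."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return {"mid": mid, "side": side}


def average_scores(score_list: List[Scores]) -> Scores:
    """Stand-in for a trained combiner: average the scores label by label."""
    labels = score_list[0].keys()
    return {label: sum(s[label] for s in score_list) / len(score_list)
            for label in labels}


def recognize_scene(left: Sequence[float], right: Sequence[float],
                    recognizers: List[Recognizer]) -> str:
    """Run every recognizer on every channel, then fuse across channels."""
    channels = split_mid_side(left, right)
    per_channel = [
        average_scores([recognize(samples) for recognize in recognizers])
        for samples in channels.values()
    ]
    fused = average_scores(per_channel)
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Placeholder recognizers with fixed outputs. In the described method these
    # would be networks working on the time-frequency shape, the spectral
    # envelope shape, and an extracted feature vector, respectively.
    stubs: List[Recognizer] = [
        lambda samples: {"speech": 0.7, "music": 0.3},
        lambda samples: {"speech": 0.6, "music": 0.4},
        lambda samples: {"speech": 0.4, "music": 0.6},
    ]
    print(recognize_scene([0.1, 0.2], [0.1, -0.2], stubs))
```

Passing the recognizers in as callables keeps the sketch independent of any particular deep-learning framework, so trained models could be dropped in later without changing the control flow.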
Abstract:
Provided are an electronic apparatus and a controlling method thereof. The electronic apparatus includes an inputter and a processor. The processor is configured to, based on receiving an audio signal through the inputter, obtain a speech intelligibility for the audio signal and modify the audio signal so that the speech intelligibility becomes a target intelligibility that is set based on scene information regarding a type of audio included in the audio signal. The type of audio includes at least one of a sound effect, shouting, music, or speech.
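The idea of steering intelligibility toward a scene-dependent target can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not state: the speech and background components are already available separately, intelligibility is approximated by the speech-to-background power ratio in decibels, and the per-scene targets in `TARGET_DB` are invented values chosen only for the example.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical targets (dB) per scene type; the values are illustrative only.
TARGET_DB = {"sound_effect": 3.0, "shouting": 6.0, "music": 9.0, "speech": 12.0}


def power(samples: Sequence[float]) -> float:
    """Mean squared amplitude; assumes a non-empty, non-silent signal."""
    return sum(x * x for x in samples) / len(samples)


def speech_to_background_db(speech: Sequence[float],
                            background: Sequence[float]) -> float:
    """Assumed intelligibility proxy: speech-to-background power ratio in dB."""
    return 10.0 * math.log10(power(speech) / power(background))


def modify_for_target(speech: Sequence[float], background: Sequence[float],
                      scene: str) -> Tuple[List[float], List[float]]:
    """Scale the speech so the proxy reaches the target for the given scene.

    Returns the rescaled speech and the remixed signal.
    """
    target = TARGET_DB[scene]
    current = speech_to_background_db(speech, background)
    # An amplitude gain g changes power by g**2, hence the division by 20.
    gain = 10.0 ** ((target - current) / 20.0)
    adjusted = [gain * x for x in speech]
    mixed = [s + b for s, b in zip(adjusted, background)]
    return adjusted, mixed


if __name__ == "__main__":
    speech = [0.1, -0.1, 0.1, -0.1]
    background = [0.2, -0.2, 0.2, -0.2]
    adjusted, mixed = modify_for_target(speech, background, "music")
    print(round(speech_to_background_db(adjusted, background), 6))
```

A power ratio is a crude stand-in for established intelligibility measures, but it keeps the relationship between the measured value, the target, and the applied gain easy to follow.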