Abstract:
Systems and methods are disclosed for networked audio automixing using array microphones and an aggregator unit that participate in making a common gating decision to determine which channels to gate on and off. Through the use of such a network of array microphones having the capability to generate submix audio signals and reduced bandwidth metrics, as well as AEC processing capability, array microphone lobe selection can be enhanced while maximizing signal-to-noise ratio, increasing intelligibility, and increasing user satisfaction.
Abstract:
A decorrelator comprises a plurality of delay units, wherein each delay unit is configured for receiving a part of a frequency representation being based on an audio signal, wherein each delay unit is configured for delaying the received part to provide a delayed part. The decorrelator comprises an envelope shaper configured for receiving and combining signals being based on the delayed parts of the frequency representation. The envelope shaper receives the frequency representation of the audio signal and is configured for adjusting an energy of the delayed parts in respect of the frequency representation of the audio signal. The envelope shaper is configured for providing a combined shaped frequency representation. Transient signal portions are handled by an adapted operation of the decorrelator.
Abstract:
The disclosed computer-implemented method for smoothing audio gaps using adaptive metadata identifies an initial audio segment and a subsequent audio segment that follows the initial audio segment. The method accesses a first set of metadata that corresponds to a last audio frame of the initial audio segment and accesses a second set of metadata that corresponds to the first audio frame of the subsequent audio segment. The first and second sets of metadata include audio characteristic information for the two audio segments. The method then generates a new set of metadata that is based on both sets of audio characteristics. The method further inserts a new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment and applies the new set of metadata to the new audio frame. Various other methods, systems, and computer-readable media are also disclosed.
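The metadata-blending step can be sketched as follows. This is a minimal illustration, not the disclosed method: the frame layout (a dict with `samples` and `meta`), the `loudness_db` field, the linear blend, and the empty placeholder frame are all assumptions made for the example.

```python
def blend_metadata(last_meta, first_meta, weight=0.5):
    """Blend numeric fields present in both metadata sets (assumed linear mix)."""
    blended = {}
    for key in last_meta.keys() & first_meta.keys():
        a, b = last_meta[key], first_meta[key]
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            blended[key] = (1.0 - weight) * a + weight * b
    return blended


def bridge_segments(initial_frames, subsequent_frames):
    """Insert one new frame, carrying the blended metadata, between two segments."""
    new_meta = blend_metadata(initial_frames[-1]["meta"],
                              subsequent_frames[0]["meta"])
    # The content of the new frame is not specified in the abstract;
    # an empty sample list stands in as a placeholder here.
    gap_frame = {"samples": [], "meta": new_meta}
    return initial_frames + [gap_frame] + subsequent_frames
```

Non-numeric fields are simply dropped in this sketch; a real system would need a rule for each metadata field.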
Abstract:
A method of enhancing dialogue intelligibility in an audio signal, comprising determining a speech confidence score that the audio content includes speech content, determining a music confidence score that the audio content includes music-correlated content, in response to the speech confidence score, and applying a user-selected gain of selected frequency bands of the audio signal to obtain a dialogue-enhanced audio signal. The user-selected gain is smoothed by an adaptive smoothing algorithm, an impact of past frames in said smoothing algorithm being determined by a smoothing factor, the smoothing factor being calculated in response to the music confidence score, and having a relatively higher value for content having a relatively higher music confidence score and a relatively lower value for speech content having a relatively lower music confidence score, so as to increase the impact of past frames on the dialogue enhancement of music-correlated content.
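The adaptive smoothing described above can be sketched as a one-pole smoother whose factor rises with the music confidence score. The linear mapping from confidence to smoothing factor and the bounds `lo` and `hi` are assumptions chosen for illustration; the abstract only states that the factor is higher for higher music confidence.

```python
def smoothing_factor(music_conf, lo=0.5, hi=0.95):
    """Map a music confidence in [0, 1] to a smoothing factor (assumed linear)."""
    c = min(max(music_conf, 0.0), 1.0)
    return lo + (hi - lo) * c


def smooth_gain(target_gains, music_confs, initial=0.0):
    """Smooth per-frame gains; a larger factor gives past frames more weight."""
    out = []
    prev = initial
    for gain, conf in zip(target_gains, music_confs):
        alpha = smoothing_factor(conf)
        prev = alpha * prev + (1.0 - alpha) * gain
        out.append(prev)
    return out
```

With these assumed bounds, a frame judged to be speech moves halfway to the requested gain in one step, while a frame judged to be music moves only 5% of the way, so the enhancement ramps in slowly on music-correlated content.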
Abstract:
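An electronic device is disclosed. Various other embodiments identified through the specification are also possible. The electronic device includes a plurality of input devices that receive a plurality of input signals including a speech signal and a noise signal, and a processor electrically connected to the input devices. The processor may be configured to determine a signal-to-noise ratio (SNR) value of the plurality of input signals for each frequency band; determine, in a first frequency band in which the SNR value is greater than or equal to a specified threshold, a first parameter representing the change in phase with respect to frequency of the plurality of input signals; determine, based on the first parameter, a second parameter representing the change in phase with respect to frequency of the plurality of input signals in a second frequency band in which the SNR value is less than the threshold; and perform beamforming on the plurality of input signals based on the first parameter and the second parameter.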
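One way to picture the two phase parameters is as a phase-versus-frequency slope that is fitted in high-SNR bands and then reused in low-SNR bands. The sketch below assumes two microphones, unwrapped inter-microphone phase differences, a least-squares line through the origin, and that the low-SNR bands simply inherit the same slope; none of these details are stated in the abstract.

```python
def estimate_phase_slope(freqs, phase_diffs, snrs, snr_threshold):
    """Least-squares slope (rad/Hz) through the origin over high-SNR bands."""
    num = 0.0
    den = 0.0
    for f, p, s in zip(freqs, phase_diffs, snrs):
        if s >= snr_threshold:
            num += f * p
            den += f * f
    return num / den if den > 0.0 else None


def fill_low_snr_phase(freqs, phase_diffs, snrs, snr_threshold):
    """Replace the phase difference in low-SNR bands with the extrapolated line."""
    slope = estimate_phase_slope(freqs, phase_diffs, snrs, snr_threshold)
    if slope is None:
        return list(phase_diffs)  # no reliable band to extrapolate from
    return [p if s >= snr_threshold else slope * f
            for f, p, s in zip(freqs, phase_diffs, snrs)]
```

The completed phase-difference curve could then drive the steering weights of a beamformer. Phase wrapping is ignored here.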
Abstract:
Methods and systems for detecting the presence and frequency of clipping in an audio signal are provided. A clipping detection algorithm detects the presence of hard and soft clipping using histograms with intervals of samples, rather than attempting to identify the clipping value. Therefore, it is not essential to the algorithm that there be a large number of bins. Furthermore, the bins may be non-uniformly distributed since the number of samples belonging to lower amplitudes is of little importance. The detection algorithm is also configured to determine the severity and/or perceptual effect of any clipping found to be present in the signal by calculating the ratio of clipped samples to non-clipped samples. Temporal information on the occurrence of clipping in the signal is also used to evaluate perceptual effect.
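A histogram-based check of this kind might look like the sketch below. The bin edges, the rule that flags clipping when the top bin holds far more samples than the bin beneath it, and the `spike_factor` threshold are all assumptions for illustration. The sketch targets the pile-up near the peak that hard clipping produces and does not attempt the temporal analysis or soft-clipping handling mentioned in the abstract.

```python
import bisect

# Hypothetical non-uniform bin edges for |x| / peak: coarse at low
# amplitudes, fine near the peak where clipped samples pile up.
EDGES = [0.0, 0.5, 0.8, 0.9, 0.95, 0.98, 1.0]


def amplitude_histogram(samples, edges=EDGES):
    """Count samples per bin of peak-normalised absolute amplitude."""
    counts = [0] * (len(edges) - 1)
    if not samples:
        return counts
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return counts
    for s in samples:
        level = abs(s) / peak
        idx = min(bisect.bisect_right(edges, level) - 1, len(counts) - 1)
        counts[idx] += 1
    return counts


def clipping_report(samples, spike_factor=2.0):
    """Flag clipping when the top bin dwarfs the bin below it (assumed rule)."""
    counts = amplitude_histogram(samples)
    top, below = counts[-1], counts[-2]
    clipped = top > spike_factor * max(below, 1)
    # Severity proxy: ratio of clipped samples to non-clipped samples.
    ratio = top / max(len(samples) - top, 1) if clipped else 0.0
    return {"clipped": clipped, "clipped_to_unclipped": ratio}
```

Because only a handful of bins near the peak matter, the histogram stays small, which matches the abstract's point that a large number of bins is not essential.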
Abstract:
Method for measuring level of speech determined by an audio signal in a manner which corrects for and reduces the effect of modification of the signal by the addition of noise thereto and/or amplitude compression thereof, and a system configured to perform any embodiment of the method. In some embodiments, the method includes steps of generating frequency banded, frequency-domain data indicative of an input speech signal, determining from the data a Gaussian parametric spectral model of the speech signal, and determining from the parametric spectral model an estimated mean speech level and a standard deviation value for each frequency band of the data; and generating speech level data indicative of a bias corrected mean speech level for each frequency band, including using at least one correction value to correct the estimated mean speech level for the frequency band, where each correction value has been predetermined using a reference speech model.
Abstract:
A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
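Both smoothing variants can be sketched with first-order filters. The way the smoothing factor grows with the gain delta (`base_alpha`, `delta_scale`) and the Wiener-style mapping between gain and signal-to-noise ratio, g = snr / (1 + snr), are assumptions for illustration; the abstract does not fix either choice.

```python
def delta_gain_smooth(raw_gains, base_alpha=0.2, delta_scale=2.0):
    """Smooth gains with a factor that increases with the gain delta."""
    out = []
    prev = raw_gains[0] if raw_gains else 0.0
    for g in raw_gains:
        delta = abs(g - prev)  # |raw gain now - post-processed gain before|
        alpha = min(1.0, base_alpha + delta_scale * delta)
        prev = prev + alpha * (g - prev)
        out.append(prev)
    return out


def gain_to_snr(g, eps=1e-6):
    """Assumed Wiener-style mapping: snr = g / (1 - g), with g clamped below 1."""
    g = min(max(g, 0.0), 1.0 - eps)
    return g / (1.0 - g)


def snr_to_gain(snr):
    return snr / (1.0 + snr)


def snr_domain_smooth(raw_gains, alpha=0.5):
    """Convert gains to SNR, smooth in the SNR domain, convert back to gains."""
    out = []
    snr_prev = None
    for g in raw_gains:
        snr = gain_to_snr(g)
        snr_prev = snr if snr_prev is None else snr_prev + alpha * (snr - snr_prev)
        out.append(snr_to_gain(snr_prev))
    return out
```

In the delta variant, a large jump in raw gain pushes the factor toward 1 so the output follows onsets quickly, while small fluctuations are heavily smoothed.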