Abstract:
The invention relates to audio signal processing apparatuses and methods, such as an audio signal downmixing apparatus (105) for processing an input audio signal comprising a plurality of input channels (113) into an output audio signal comprising a plurality of primary output channels (123) and at least one auxiliary output channel (125) using a downmix matrix D, wherein the downmix matrix D comprises a primary downmix matrix D_u providing the plurality of primary output channels (123) and an auxiliary downmix matrix D_w providing the at least one auxiliary output channel (125). The audio signal downmixing apparatus (105) comprises an auxiliary downmix matrix determiner (107) configured to determine the auxiliary downmix matrix D_w by computing a plurality of eigenvectors of a covariance matrix COV defined by the plurality of input channels (113) of the input audio signal, determining for at least one eigenvector of the plurality of eigenvectors of the covariance matrix COV a subspace angle between the at least one eigenvector and a vector defined by a column of the primary downmix matrix D_u, selecting at least one eigenvector from the plurality of eigenvectors based on the subspace angle and a preset threshold angle Θ_MIN, and defining at least one column of the auxiliary downmix matrix D_w by the at least one selected eigenvector, and a processor (109) configured to process the input audio signal into the output audio signal using the downmix matrix D.
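The eigenvector selection step can be illustrated with a minimal sketch. It assumes the eigenvectors of COV have already been computed, and it assumes a selection rule the abstract does not spell out: an eigenvector is kept when its smallest angle to any column of the primary downmix matrix is at least the threshold angle. The function names `subspace_angle` and `select_eigenvectors` and the toy vectors are hypothetical.

```python
import math

def subspace_angle(u, v):
    """Angle in radians (0..pi/2) between the lines spanned by u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to 1.0 so rounding errors never push acos out of its domain.
    cos_angle = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.acos(cos_angle)

def select_eigenvectors(eigenvectors, primary_columns, theta_min):
    """Keep eigenvectors far enough from every primary column.

    Assumed rule (not stated in the abstract): keep an eigenvector when its
    smallest angle to any primary column is >= theta_min.
    """
    selected = []
    for e in eigenvectors:
        smallest = min(subspace_angle(e, c) for c in primary_columns)
        if smallest >= theta_min:
            selected.append(e)
    return selected

# Toy data: three candidate eigenvectors and one primary downmix column.
eigenvectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
primary_columns = [[1.0, 0.1, 0.0]]
theta_min = math.radians(45)

# Each selected eigenvector would define one column of the auxiliary matrix.
aux_columns = select_eigenvectors(eigenvectors, primary_columns, theta_min)
```

In this toy case the first eigenvector is nearly parallel to the primary column and is rejected, while the other two are kept as candidate auxiliary columns.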
Abstract:
An image processing device (10) performing color balancing of a first image (11a) and at least a second image (12a) is provided. The image processing device (10) comprises a color balancing determination unit (20) and a color balancing calculation unit (21). The color balancing determination unit (20) determines a global gain vector (t) comprising at least two gain factors (â_n, â_n+1) of the first and second images (11a, 12a) by minimizing a cost function based upon reference pixel values of the first and second images (11a, 12a). The first and second image reference pixels depict a shared color scene of the two images. The color balancing calculation unit (21) performs color balancing of the first image (11a) based upon the gain factor (â_n) of the first image (11a) and performs color balancing of the second image (12a) based upon the gain factor (â_n+1) of the second image (12a).
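As a rough illustration of determining two gain factors by minimizing a cost function over reference pixels, the sketch below assumes a specific quadratic cost that the abstract does not give: the squared difference between the gain-corrected reference pixels of both images, plus a penalty (weight `lam`) that keeps both gains close to 1. The cost, the penalty weight, and the function name `estimate_gains` are assumptions made for illustration only.

```python
def estimate_gains(ref1, ref2, lam=1.0):
    """Return (g1, g2) minimizing an assumed quadratic cost.

    Assumed cost (not taken from the abstract):
        J = sum_i (g1*ref1[i] - g2*ref2[i])**2 + lam*((g1-1)**2 + (g2-1)**2)
    Setting both partial derivatives to zero gives a 2x2 linear system,
    solved here in closed form with Cramer's rule.
    """
    s11 = sum(p * p for p in ref1)
    s22 = sum(q * q for q in ref2)
    s12 = sum(p * q for p, q in zip(ref1, ref2))
    det = (s11 + lam) * (s22 + lam) - s12 * s12  # > 0 whenever lam > 0
    g1 = lam * (s22 + lam + s12) / det
    g2 = lam * (s11 + lam + s12) / det
    return g1, g2

def apply_gain(pixels, gain):
    """Scale every pixel value of one image by its gain factor."""
    return [gain * p for p in pixels]

# Toy reference pixels of the shared scene (single color channel);
# the second image is twice as bright as the first.
ref_first = [0.2, 0.5, 0.8]
ref_second = [0.4, 1.0, 1.6]

g_first, g_second = estimate_gains(ref_first, ref_second)
balanced_first = apply_gain(ref_first, g_first)
balanced_second = apply_gain(ref_second, g_second)
```

Under this assumed cost, identical reference pixels yield gains of exactly 1, and a brighter second image receives the smaller gain, which pulls the two images toward each other.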
Abstract:
An apparatus and a method for compressing a set of N binaural room impulse responses, BRIR, wherein each channel of an N channel audio signal (I1, I2,..., IN) is convolved with the corresponding compressed set of N BRIR.
Abstract:
The invention relates to a sound signal processing apparatus (100) for enhancing a sound signal from a target source. The sound signal processing apparatus (100) comprises a plurality of microphones (101a-f), wherein each microphone (101a-f) is configured to receive the sound signal from the target source; an estimator (103) configured to estimate a first power measure on the basis of the sound signal from the target source received by a first microphone (101a-f) of the plurality of microphones (101a-f) and a second power measure on the basis of the sound signal from the target source received by at least a second microphone (101a-f) of the plurality of microphones (101a-f), which is located more distant from the target source than the first microphone (101a-f), wherein the estimator (103) is further configured to determine a gain factor on the basis of a ratio between the second power measure and the first power measure; and an amplifier (105) configured to apply the gain factor to the sound signal from the target source received by the first microphone (101a-f).
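The estimator and amplifier can be sketched as follows. The abstract only says the gain factor is based on the ratio between the second and the first power measure; the sketch assumes mean squared amplitude as the power measure and the square root of the ratio as an amplitude gain. Both choices, and the function names, are illustrative assumptions.

```python
import math

def mean_power(samples):
    """Mean squared amplitude, used here as the power measure (assumption)."""
    return sum(s * s for s in samples) / len(samples)

def enhance(first_mic, second_mic):
    """Scale the first microphone signal by a gain derived from the power ratio.

    Assumption: gain = sqrt(P2 / P1), i.e. the ratio of the power at the more
    distant microphone to the power at the first microphone, converted to an
    amplitude factor.
    """
    p1 = mean_power(first_mic)
    p2 = mean_power(second_mic)
    if p1 == 0.0:
        return 0.0, list(first_mic)  # silent input: nothing to scale
    gain = math.sqrt(p2 / p1)
    return gain, [gain * s for s in first_mic]

# Toy frames: the first microphone is closer to the target and therefore louder.
near_frame = [2.0, -2.0, 2.0, -2.0]
far_frame = [1.0, -1.0, 1.0, -1.0]

gain, enhanced = enhance(near_frame, far_frame)
```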
Abstract:
The invention relates to a signal processing apparatus (100) for dereverberating a number of input audio signals, the signal processing apparatus (100) comprising a transformer (101) being configured to transform the number of input audio signals into a transformed domain to obtain input transformed coefficients, the input transformed coefficients being arranged to form an input transformed coefficient matrix, a filter coefficient determiner (103) being configured to determine filter coefficients upon the basis of eigenvalues resulting from the decomposition of an input auto-coherence matrix, the filter coefficients being arranged to form a filter coefficient matrix, a filter (105) being configured to convolve input transformed coefficients of the input transformed coefficient matrix by filter coefficients of the filter coefficient matrix to obtain output transformed coefficients, the output transformed coefficients being arranged to form an output transformed coefficient matrix, and an inverse transformer (107) being configured to inversely transform the output transformed coefficient matrix from the transformed domain to obtain a number of output audio signals.
Abstract:
The proposed method for localizing a target sound source from a plurality of sound sources, wherein a multi-channel recording signal of the plurality of sound sources comprises a plurality of microphone channel signals, comprises converting each microphone channel signal into a respective channel spectrogram in a time-frequency domain, blindly separating the channel spectrograms to obtain a plurality of separated source signals, identifying, among the plurality of separated source signals, the separated source signal that best matches a target source model, estimating, based on the identified separated source signal, a binary mask reflecting where the target sound source is active in the channel spectrograms in terms of time and frequency, applying the binary mask on the channel spectrograms to obtain masked channel spectrograms, and localizing the target sound source from the plurality of sound sources based on the masked channel spectrograms.
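The mask estimation and masking steps can be illustrated with a small sketch. It assumes the separation and target identification have already happened, and it assumes one simple masking rule that the abstract does not specify: a time-frequency bin is marked active when the identified source has a larger magnitude there than every other separated source. Spectrograms are plain nested lists indexed as `[time][frequency]`; all names are hypothetical.

```python
def estimate_binary_mask(target, others):
    """Return a 0/1 mask marking bins where the target source dominates.

    Assumed rule (not from the abstract): the target is active in a bin when
    its magnitude exceeds that of every other separated source in that bin.
    """
    mask = []
    for t, row in enumerate(target):
        mask_row = []
        for f, value in enumerate(row):
            strongest_other = max((abs(o[t][f]) for o in others), default=0.0)
            mask_row.append(1 if abs(value) > strongest_other else 0)
        mask.append(mask_row)
    return mask

def apply_mask(mask, spectrogram):
    """Multiply a channel spectrogram element-wise by the binary mask."""
    return [[m * x for m, x in zip(mask_row, spec_row)]
            for mask_row, spec_row in zip(mask, spectrogram)]

# Toy data: 2 time frames x 2 frequency bins.
target_source = [[0.9, 0.1], [0.2, 0.8]]
other_sources = [[[0.1, 0.7], [0.5, 0.3]]]
channel_spectrogram = [[1 + 1j, 2 + 0j], [3 + 0j, 4 - 1j]]

mask = estimate_binary_mask(target_source, other_sources)
masked_channel = apply_mask(mask, channel_spectrogram)
```

The same mask would be applied to every microphone channel before the localization step, so that only bins dominated by the target contribute.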
Abstract:
The invention relates to audio signal processing apparatuses and methods, such as an audio signal downmixing apparatus (105) for processing an input audio signal into an output audio signal, wherein the input audio signal comprises a plurality of input channels (113) recorded at a plurality of spatial positions and the output audio signal comprises a plurality of primary output channels (123). The audio signal downmixing apparatus (105) comprises a downmix matrix determiner (107) configured to determine for each frequency bin j of a plurality of frequency bins a downmix matrix DU with j being an integer in the range from 1 to N, wherein for a given frequency bin j the downmix matrix DU maps a plurality of Fourier coefficients associated with the plurality of input channels (113) of the input audio signal into a plurality of Fourier coefficients of the primary output channels (123) of the output audio signal, wherein for frequency bins with j being smaller than or equal to a cutoff frequency bin k the downmix matrix DU is determined by determining eigenvectors of the discrete Laplace-Beltrami operator L defined by the plurality of spatial positions where the plurality of input channels (113) are recorded, and wherein for frequency bins with j being larger than the cutoff frequency bin k the downmix matrix DU is determined by determining a first subset of eigenvectors of a covariance matrix COV defined by the plurality of input channels (113) of the input audio signal, and a processor (109) configured to process the input audio signal using the downmix matrix DU into the output audio signal.
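The per-bin switch between the two ways of determining the downmix matrix can be sketched as below. The sketch does not compute any eigenvectors; it only shows the selection logic around the cutoff bin k. It assumes that one geometry-derived matrix is reused for all bins up to the cutoff and that a caller-supplied function returns the covariance-derived matrix for each higher bin. The names `build_downmix_matrices`, `geometry_matrix`, and `covariance_matrix_for_bin` are hypothetical.

```python
def build_downmix_matrices(num_bins, cutoff_bin, geometry_matrix,
                           covariance_matrix_for_bin):
    """Return a dict mapping each bin j (1..num_bins) to its downmix matrix.

    Bins with j <= cutoff_bin use the matrix derived from the recording
    positions (Laplace-Beltrami eigenvectors in the abstract); bins above
    the cutoff use a matrix derived from the signal covariance.
    """
    matrices = {}
    for j in range(1, num_bins + 1):
        if j <= cutoff_bin:
            matrices[j] = geometry_matrix
        else:
            matrices[j] = covariance_matrix_for_bin(j)
    return matrices

# Placeholder matrices standing in for the real eigenvector computations.
geometry_matrix = [[1.0, 0.0], [0.0, 1.0]]

def covariance_matrix_for_bin(j):
    # Hypothetical stand-in: a real implementation would use eigenvectors
    # of the covariance matrix of the input channels in bin j.
    return [[float(j), 0.0], [0.0, 1.0]]

downmix = build_downmix_matrices(
    num_bins=4,
    cutoff_bin=2,
    geometry_matrix=geometry_matrix,
    covariance_matrix_for_bin=covariance_matrix_for_bin,
)
```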
Abstract:
An audio signal processing device (10) for generating a plurality of output signals for a plurality of loudspeakers from an input audio signal comprises a driving function determining unit (11) adapted to determine driving functions of a plurality of loudspeakers for generating a virtual left binaural signal source and a virtual right binaural signal source based upon a position and a directivity of the virtual left binaural signal source, a position and a directivity of the virtual right binaural signal source and positions of the plurality of loudspeakers. Moreover, it comprises a filtering unit (12) adapted to filter a left binaural signal and a right binaural signal using the driving functions of the plurality of loudspeakers, resulting in the plurality of output signals. The left binaural signal and the right binaural signal constitute the input audio signal or are derived therefrom.