Abstract:
There is disclosed an audio synthesizer (300) for generating a synthesis signal (336) from a downmix signal (324, x) having a number of downmix channels, the synthesis signal (336) having a number of synthesis channels, the downmix signal (324, x) being a downmixed version of an original signal (212) having a number of original channels, the audio synthesizer (300) comprising: a first path (610c') including: a first mixing matrix block (600c) configured for synthesizing a first component (336M') of the synthesis signal according to a first mixing matrix (MM) calculated from: a covariance matrix (CYR) associated to the synthesis signal (212); and a covariance matrix (Cx) associated to the downmix signal (324),
a second path (610c) for synthesizing a second component (336R') of the synthesis signal, wherein the second component (336R') is a residual component, the second path (610c) including: a prototype signal block (612c) configured for upmixing the downmix signal (324) from the number of downmix channels to the number of synthesis channels; a decorrelator (614c) configured for decorrelating the upmixed prototype signal (613c); a second mixing matrix block (618c) configured for synthesizing the second component (336R') of the synthesis signal according to a second mixing matrix (MR) from the decorrelated version (615c) of the downmix signal (324), the second mixing matrix (MR) being a residual mixing matrix,
wherein the audio synthesizer (300) is configured to calculate (618c) the second mixing matrix (MR) from: the residual covariance matrix (Cr) provided by the first mixing matrix block (600c); and an estimate of the covariance matrix of the decorrelated prototype signals (Cy) obtained from the covariance matrix (Cx) associated to the downmix signal (324),
wherein the audio synthesizer (300) further comprises an adder block (620c) for summing the first component (336M') of the synthesis signal with the second component (336R') of the synthesis signal.
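The two-path signal flow described above can be illustrated with a minimal sketch. It assumes the three matrices (main mixing matrix, prototype upmix matrix, residual mixing matrix) are already given; how they are calculated from the covariance matrices is not shown. The function names are hypothetical, and the per-channel delay is only a stand-in for a real decorrelator.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def delay_decorrelator(signal, base_delay=1):
    """Stand-in decorrelator (assumption): delay channel c by
    base_delay * (c + 1) samples, zero-padding at the start."""
    out = []
    for c, channel in enumerate(signal):
        delay = base_delay * (c + 1)
        out.append(([0.0] * delay + channel)[:len(channel)])
    return out


def synthesize(x, m_main, q_proto, m_res):
    """x: downmix signal as a list of channels (each a list of samples).
    m_main:  synthesis x downmix matrix (first path).
    q_proto: synthesis x downmix prototype upmix matrix (second path).
    m_res:   synthesis x synthesis residual mixing matrix (second path)."""
    main = matmul(m_main, x)                    # first component
    prototype = matmul(q_proto, x)              # upmix to synthesis channels
    decorrelated = delay_decorrelator(prototype)
    residual = matmul(m_res, decorrelated)      # second (residual) component
    # Adder: sum the two components channel by channel, sample by sample.
    return [[a + b for a, b in zip(rm, rr)] for rm, rr in zip(main, residual)]
```

For example, a one-channel downmix `[[1.0, 0.0, 0.0, 0.0]]` can be expanded to two synthesis channels by passing a 2x1 main matrix, a 2x1 prototype matrix, and a 2x2 residual matrix.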
Abstract:
An apparatus for processing an information signal comprises: a feature extractor (100) for extracting a set of features from the information signal, wherein each feature of the set of features comprises at least two feature components, and wherein the set of features comprises a first subset with the first feature components and a second subset with the second feature components; and a neural network processor (300) comprising: a first neural network (340) for receiving, as an input, the first subset and for outputting a processed first subset; a combiner (350) for combining the processed first subset and the second subset to obtain a combined subset; and a second neural network (360) for receiving, as an input, the combined subset and for outputting a processed combined output, wherein the processed combined output represents a processed information signal, or wherein the apparatus is configured to calculate the processed information signal using the processed combined output, and wherein a complexity of the first neural network (340) is greater than a complexity of the second neural network.
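The arrangement of the two networks and the combiner can be sketched as follows. This is only an illustration under several assumptions: each feature is a pair of components, the networks are small untrained ReLU layers, the combiner is a concatenation, and "complexity" is measured by parameter count. None of these choices is stated in the abstract, and all names are hypothetical.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows of n_in weights (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense_relu(weights, vec):
    """One fully connected layer followed by a ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in weights]


def run_network(layers, vec):
    for weights in layers:
        vec = dense_relu(weights, vec)
    return vec


def count_params(layers):
    """Parameter count, used here as a simple proxy for complexity."""
    return sum(len(row) for weights in layers for row in weights)


def process(features, first_net, second_net):
    """features: list of (first_component, second_component) pairs."""
    first_subset = [f[0] for f in features]
    second_subset = [f[1] for f in features]
    processed_first = run_network(first_net, first_subset)
    combined = processed_first + second_subset   # combiner: concatenation
    return run_network(second_net, combined)


rng = random.Random(0)
n = 4  # number of features in the set
first_net = [make_layer(n, 16, rng), make_layer(16, n, rng)]   # larger network
second_net = [make_layer(2 * n, n, rng)]                       # smaller network
output = process([(0.5, 0.1), (0.2, 0.3), (0.9, 0.7), (0.4, 0.0)],
                 first_net, second_net)
```

Only the first feature components pass through the larger network; the second components enter at the combiner and are processed by the smaller network alone.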
Abstract:
An apparatus for encoding directional audio coding parameters comprising diffuseness parameters and direction parameters, comprises: a parameter calculator (100) for calculating the diffuseness parameters with a first time or frequency resolution and for calculating the direction parameters with a second time or frequency resolution; and a quantizer and encoder processor (200) for generating a quantized and encoded representation of the diffuseness parameters and the direction parameters.
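A minimal sketch of computing the two parameter types at different frequency resolutions and quantizing them is given below. It assumes per-band input values, a coarser grouping for diffuseness than for direction, a circular mean for azimuth, and uniform quantizers; the group sizes, level counts, and function names are illustrative choices, not taken from the abstract.

```python
import math


def mean_diffuseness(values, group_size):
    """Average diffuseness over consecutive groups of bands."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    return [sum(g) / len(g) for g in groups]


def mean_azimuth_deg(values, group_size):
    """Circular mean of azimuth (degrees) over consecutive groups of bands."""
    out = []
    for i in range(0, len(values), group_size):
        group = values[i:i + group_size]
        s = sum(math.sin(math.radians(a)) for a in group)
        c = sum(math.cos(math.radians(a)) for a in group)
        out.append(math.degrees(math.atan2(s, c)))
    return out


def quantize_diffuseness(value, levels=8):
    """Uniform quantizer on [0, 1]; returns an index in 0..levels-1."""
    index = round(value * (levels - 1))
    return min(max(index, 0), levels - 1)


def quantize_azimuth(azimuth_deg, levels=16):
    """Uniform quantizer on the circle; the index wraps around at 360 degrees."""
    step = 360.0 / levels
    return round(azimuth_deg / step) % levels


def encode_frame(diffuseness, azimuth_deg, diff_group=4, dir_group=2):
    """Diffuseness uses the coarser grouping, direction the finer one."""
    return {
        "diffuseness": [quantize_diffuseness(d)
                        for d in mean_diffuseness(diffuseness, diff_group)],
        "direction": [quantize_azimuth(a)
                      for a in mean_azimuth_deg(azimuth_deg, dir_group)],
    }
```

With eight input bands and the default group sizes, one frame yields two diffuseness indices and four direction indices. An entropy coding stage for the indices is left out of this sketch.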
Abstract:
An apparatus for generating an audio output signal to simulate a recording of a virtual microphone at a configurable virtual position in an environment includes a sound events position estimator and an information computation module. The former is adapted to estimate a sound source position indicating a position of a sound source in the environment, wherein the sound events position estimator is adapted to estimate the sound source position based on first and second direction information provided by first and second real spatial microphones, respectively, located at first and second real microphone positions in the environment, respectively. The information computation module is adapted to generate the audio output signal based on a first recorded audio input signal, on the first real microphone position, on the virtual position of the virtual microphone, and on the sound source position.
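The position estimate from two direction measurements can be illustrated by a simple 2D triangulation, followed by a distance-based gain for the virtual microphone. This sketch assumes a single source, azimuth angles in radians, free-field 1/r amplitude decay, and no propagation-delay compensation; the function names are hypothetical and the gain rule is an illustrative assumption rather than the method of the abstract.

```python
import math


def cross(a, b):
    """2D cross product (z component)."""
    return a[0] * b[1] - a[1] * b[0]


def triangulate(p1, az1, p2, az2):
    """Intersect the rays leaving p1 and p2 at azimuths az1 and az2 (radians).
    Returns the estimated source position, or None for parallel rays."""
    d1 = (math.cos(az1), math.sin(az1))
    d2 = (math.cos(az2), math.sin(az2))
    denom = cross(d1, d2)
    if abs(denom) < 1e-12:
        return None
    r = (p2[0] - p1[0], p2[1] - p1[1])
    t1 = cross(r, d2) / denom          # distance along the first ray
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


def virtual_mic_signal(samples, real_pos, virtual_pos, source_pos):
    """Scale the first recorded signal by the ratio of source distances,
    assuming 1/r decay (illustrative assumption)."""
    d_real = math.dist(real_pos, source_pos)
    d_virtual = math.dist(virtual_pos, source_pos)
    gain = d_real / max(d_virtual, 1e-6)   # guard against division by zero
    return [gain * s for s in samples]
```

For instance, microphones at (0, 0) and (2, 0) reporting azimuths of 45 and 135 degrees place the source at approximately (1, 1); a virtual microphone closer to that point than the first real microphone then receives a gain greater than one.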