摘要:
An apparatus for analyzing a sound signal is based on an ear model for deriving, for a number of inner hair cells, an estimate for a time-varying concentration of transmitter substance inside a cleft between an inner hair cell and an associated auditory nerve from the sound signal so that an estimated inner hair cell cleft contents map over time is obtained. This map is analyzed by means of a pitch analyzer to obtain a pitch line over time, the pitch line indicating a pitch of the sound signal for respective time instants. A rhythm analyzer is operative for analyzing envelopes of estimates for selected inner hair cells, the inner hair cells being selected in accordance with the pitch line, so that segmentation instants are obtained, wherein a segmentation instant indicates an end of the preceding note or a start of a succeeding note. Thus, a human-related and reliable sound signal analysis can be obtained.
摘要:
An apparatus for extracting a direct and/or ambience signal from a downmix signal and spatial parametric information, the downmix signal and the spatial parametric information representing a multi-channel audio signal having more channels than the downmix signal, wherein the spatial parametric information has inter-channel relations of the multi-channel audio signal, is described. The apparatus has a direct/ambience estimator and a direct/ambience extractor. The direct/ambience estimator is configured for estimating a level information of a direct portion and/or an ambient portion of the multi-channel audio signal based on the spatial parametric information. The direct/ambience extractor is configured for extracting a direct signal portion and/or an ambient signal portion from the downmix signal based on the estimated level information of the direct portion or the ambient portion.
摘要:
An audio encoder, an audio decoder or an audio processor includes a filter for generating a filtered audio signal, the filter having a variable warping characteristic, the characteristic being controllable in response to a time-varying control signal, the control signal indicating a small or no warping characteristic or a comparatively high warping characteristic. Furthermore, a controller is connected for providing the time-varying control signal, which depends on the audio signal. The filtered audio signal can be introduced to an encoding processor having different encoding algorithms, one of which is a coding algorithm adapted to a specific signal pattern. Alternatively, the filter is a post-filter receiving a decoded audio signal.
摘要:
An audio signal decoder for providing an upmix signal representation in dependence on a downmix signal representation and an object-related parametric information includes an object separator configured to decompose the downmix signal representation, to provide a first audio information describing a first set of one or more audio objects of a first audio object type and a second audio information describing a second set of one or more audio objects of a second audio object type, in dependence on the downmix signal representation and using at least a part of the object-related parametric information.
摘要:
A method for decoding a multi-audio-object signal having audio signals of first and second types encoded therein, the multi-audio-object signal having a downmix signal and side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, the method including computing a prediction coefficient matrix C based on the level information; and up-mixing the downmix signal based on the prediction coefficients to obtain a first and/or a second up-mix audio signal approximating the audio signals of the first and second types, respectively, wherein up-mixing yields the first and/or second up-mix signals S1 and S2 from the downmix signal d according to a computation representable by ( S 1 S 2 ) = D - 1 { ( 1 C ) d + H } , with “1” denoting—depending on the number of channels of d—a scalar, or an identity matrix, and D−1 being a matrix uniquely determined by a downmix prescription according to which the audio signals of the first and second types are downmixed into the downmix signal, and which is also included by the side information, and H being a term independent from d.
摘要:
According to the present invention, multiple parametrically encoded audio signals can be efficiently combined using an audio signal generator, which generates an audio output signal by combining the down-mix channels and the associated parameters of the audio signals directly within the parameter domain, i.e. without reconstructing or decoding the individual input audio signals prior to the generation of the audio output signal. This is achieved by direct mixing of the associated down-mix channels of the individual input signals. It is one key feature of the present invention that the combination of the down-mix channels is achieved by simple, computationally inexpensive arithmetic operations.
摘要:
An audio encoder has a common preprocessing stage, an information sink based encoding branch such as spectral domain encoding branch, a information source based encoding branch such as an LPC-domain encoding branch and a switch for switching between these branches at inputs into these branches or outputs of these branches controlled by a decision stage. An audio decoder has a spectral domain decoding branch, an LPC-domain decoding branch, one or more switches for switching between the branches and a common post-processing stage for post-processing a time-domain audio signal for obtaining a post-processed audio signal.
摘要:
An audio encoder, an audio decoder or an audio processor includes a filter for generating a filtered audio signal, the filter having a variable warping characteristic, the characteristic being controllable in response to a time-varying control signal, the control signal indicating a small or no warping characteristic or a comparatively high warping characteristic. Furthermore, a controller is connected for providing the time-varying control signal, which depends on the audio signal. The filtered audio signal can be introduced to an encoding processor having different encoding algorithms, one of which is a coding algorithm adapted to a specific signal pattern. Alternatively, the filter is a post-filter receiving a decoded audio signal.
摘要:
An audio encoder for encoding an audio signal includes an impulse extractor for extracting an impulse-like portion from the audio signal. This impulse-like portion is encoded and forwarded to an output interface. Furthermore, the audio encoder includes a signal encoder which encodes a residual signal derived from the original audio signal so that the impulse-like portion is reduced or eliminated in the residual audio signal. The output interface forwards both, the encoded signals, i.e., the encoded impulse signal and the encoded residual signal for transmission or storage. On the decoder-side, both signal portions are separately decoded and then combined to obtain a decoded audio signal.
摘要:
In order to generate a multi-channel signal having a number of output channels greater than a number of input channels, a mixer is used for upmixing the input signal to form at least a direct channel signal and at least an ambience channel signal. A speech detector is provided for detecting a section of the input signal, the direct channel signal or the ambience channel signal in which speech portions occur. Based on this detection, a signal modifier modifies the input signal or the ambience channel signal in order to attenuate speech portions in the ambience channel signal, whereas such speech portions in the direct channel signal are attenuated to a lesser extent or not at all. A loudspeaker signal outputter then maps the direct channel signals and the ambience channel signals to loudspeaker signals which are associated to a defined reproduction scheme, such as, for example, a 5.1 scheme.