Abstract:
A method for compressing an HOA signal, the signal being an input HOA representation with input time frames (C(k)) of HOA coefficient sequences, comprises spatial HOA encoding of the input time frames followed by perceptual encoding and source encoding. Each input time frame is decomposed (802) into a frame of predominant sound signals (XPS(k−1)) and a frame of an ambient HOA component (C̃AMB(k−1)). In a layered mode, the ambient HOA component (C̃AMB(k−1)) comprises first HOA coefficient sequences of the input HOA representation (cn(k−1)) in lower positions and second HOA coefficient sequences (cAMB,n(k−1)) in the remaining higher positions. The second HOA coefficient sequences are part of an HOA representation of a residual between the input HOA representation and the HOA representation of the predominant sound signals.
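The layered-mode construction of the ambient component can be illustrated with a minimal sketch. It assumes each frame is a list of coefficient sequences (one list of samples per index n) and that the residual is a plain sample-wise difference; the function name and the `num_low` parameter are hypothetical and do not come from the abstract.

```python
def layered_ambient_frame(c, c_ps, num_low):
    """Build an ambient HOA frame in layered mode (illustrative sketch).

    c       -- input HOA frame: list of coefficient sequences c_n
    c_ps    -- HOA representation of the predominant sounds, same shape
    num_low -- hypothetical count of lower positions that carry the
               input coefficient sequences unchanged
    """
    ambient = []
    for n, (seq, ps_seq) in enumerate(zip(c, c_ps)):
        if n < num_low:
            # Lower positions: copy the input coefficient sequence.
            ambient.append(list(seq))
        else:
            # Higher positions: residual between the input and the
            # predominant-sound representation (assumed sample-wise).
            ambient.append([x - p for x, p in zip(seq, ps_seq)])
    return ambient


frame = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
predominant = [[0.5, 0.5]] * 4
ambient = layered_ambient_frame(frame, predominant, num_low=1)
```

With `num_low=1`, only the first sequence is passed through unchanged; the other three hold the residual.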
Abstract:
A method and apparatus for decompressing a Higher Order Ambisonics (HOA) signal representation is disclosed. The apparatus includes an input interface that receives an encoded directional signal and an encoded ambient signal and an audio decoder that perceptually decodes the encoded directional signal and encoded ambient signal to produce a decoded directional signal and a decoded ambient signal, respectively. The apparatus further includes an extractor for obtaining side information related to the directional signal and an inverse transformer for converting the decoded ambient signal from a spatial domain to an HOA domain representation of the ambient signal. The apparatus also includes a synthesizer for recomposing a Higher Order Ambisonics (HOA) signal from the HOA domain representation of the ambient signal and the decoded directional signal. The side information includes a direction of the directional signal selected from a set of uniformly spaced directions.
Abstract:
Methods and apparatus for decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or soundfield are disclosed. The method may include receiving a bitstream containing the compressed HOA representation and decoding, based on a determination that there are multiple layers, the compressed HOA representation from the bitstream to obtain a sequence of decoded HOA representations. A first subset of the sequence of decoded HOA representations is determined based only on corresponding ambient HOA components. A second subset of the sequence of decoded HOA representations is determined based on corresponding ambient HOA components and corresponding predominant sound components. For a frame k, the sequence of decoded HOA representations is represented at least in part by
\[
\hat{\tilde{c}}_n(k-1) =
\begin{cases}
\hat{c}_{\mathrm{AMB},n}(k-1) & \text{for } n \text{ in the first subset} \\
\hat{c}_n(k-1) = \hat{c}_{\mathrm{PS},n}(k-1) + \hat{c}_{\mathrm{AMB},n}(k-1), & \text{for } n \text{ in the second subset}
\end{cases}
\]
where ĉAMB,n(k−1) corresponds to the corresponding ambient HOA components and ĉPS,n(k−1) corresponds to the corresponding predominant sound components.
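The two-case rule can be sketched in a few lines. This is only an illustration: it assumes frames are lists of coefficient sequences, that the second subset is given as a set of indices n, and that every other index belongs to the first subset. The function name is hypothetical.

```python
def decode_layered_frame(c_amb, c_ps, second_subset):
    """Combine decoded components per coefficient index n (sketch).

    c_amb         -- decoded ambient HOA components, one sequence per n
    c_ps          -- decoded predominant sound components, same shape
    second_subset -- set of indices n that use ambient + predominant;
                     all other indices are treated as the first subset
    """
    decoded = []
    for n, amb in enumerate(c_amb):
        if n in second_subset:
            # Second subset: predominant sound plus ambient component.
            decoded.append([a + p for a, p in zip(amb, c_ps[n])])
        else:
            # First subset: ambient component only.
            decoded.append(list(amb))
    return decoded


c_amb = [[1.0, 1.0], [2.0, 2.0]]
c_ps = [[9.0, 9.0], [0.5, -0.5]]
decoded = decode_layered_frame(c_amb, c_ps, second_subset={1})
```

Index 0 ignores its predominant sound entry because it lies in the first subset.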
Abstract:
The invention improves HOA sound field representation compression and decompression. A decoder decodes compressed dominant directional signals and compressed residual component signals so as to provide decompressed dominant directional signals and decompressed time domain signals representing a residual HOA component in a spatial domain. A re-correlator re-correlates the decompressed time domain signals to obtain a corresponding reduced-order residual HOA component. A processor determines a decompressed residual HOA component based on the corresponding reduced-order residual HOA component, and determines predicted directional signals based on at least a parameter. The processor is further configured to determine an HOA sound field representation based on the decompressed dominant directional signals, the predicted directional signals, and the decompressed residual HOA component.
Abstract:
When compressing an HOA data frame representation, a gain control (15, 151) is applied to each channel signal before it is perceptually encoded (16). The gain values are transferred differentially as side information. However, to start decoding such a streamed compressed HOA data frame representation, absolute gain values are required, and these should be coded with a minimum number of bits. To determine this lowest integer number (βe) of bits, the HOA data frame representation (C(k)) is rendered in the spatial domain to virtual loudspeaker signals lying on a unit sphere, followed by normalisation of the HOA data frame representation (C(k)). The lowest integer number of bits is then set to $\beta_e = \lceil \log_2(\lceil \log_2(\sqrt{K_{\mathrm{MAX}}} \cdot O) \rceil + 1) \rceil$.
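The bit-count formula is easy to evaluate directly. The sketch below is a plain transcription of it; the function name is hypothetical, and the sample values for KMAX and O are illustrative assumptions rather than values given in the abstract.

```python
import math


def min_gain_exponent_bits(k_max, num_coeffs):
    """Evaluate beta_e = ceil(log2(ceil(log2(sqrt(K_MAX) * O)) + 1)).

    k_max      -- K_MAX from the formula (assumed value supplied by caller)
    num_coeffs -- O, the number of HOA coefficient sequences
    """
    inner = math.ceil(math.log2(math.sqrt(k_max) * num_coeffs))
    return math.ceil(math.log2(inner + 1))


# Illustrative inputs only: K_MAX = 1.5 and O = 49 are assumptions.
bits = min_gain_exponent_bits(1.5, 49)
```

For these assumed inputs, the inner term is ⌈log2(60.01)⌉ = 6, so the result is ⌈log2(7)⌉ = 3 bits.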
Abstract:
The invention improves HOA sound field representation compression. The HOA representation is analysed for the presence of dominant sound sources and their directions are estimated. Then the HOA representation is decomposed into a number of dominant directional signals and a residual component. This residual component is transformed into the discrete spatial domain in order to obtain general plane wave functions at uniform sampling directions, which are predicted from the dominant directional signals. Finally, the prediction error is transformed back to the HOA domain and represents the residual ambient HOA component for which an order reduction is performed, followed by perceptual encoding of the dominant directional signals and the residual component.
Abstract:
Spherical microphone arrays capture a three-dimensional sound field (P(Ωc, t)) for generating an Ambisonics representation (Anm(t)), where the pressure distribution on the surface of the sphere is sampled by the capsules of the array. The impact of the microphones on the captured sound field is removed using the inverse microphone transfer function. Equalising the transfer function of the microphone array is difficult because the reciprocal of the transfer function produces high gains where the transfer function has small values, and these small values are affected by transducer noise. The invention minimises that noise by using Wiener filter processing in the frequency domain, which is automatically controlled per wave number by the signal-to-noise ratio of the microphone array.
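The noise problem with a plain reciprocal, and the way an SNR-controlled filter limits it, can be seen in a small sketch. It uses the generic textbook Wiener deconvolution form, conj(H) / (|H|² + 1/SNR), as a stand-in; this is an assumption and not necessarily the exact filter of the invention. Function names are hypothetical.

```python
def naive_inverse(h):
    """Plain reciprocal of a transfer function value."""
    return 1.0 / complex(h)


def wiener_inverse(h, snr):
    """Generic Wiener-style regularised inverse (illustrative sketch).

    h   -- transfer function value at one wave number (may be complex)
    snr -- linear signal-to-noise ratio at that wave number
    """
    h = complex(h)
    return h.conjugate() / (abs(h) ** 2 + 1.0 / snr)


small_h = 0.001                                   # near-zero transfer value
naive_gain = abs(naive_inverse(small_h))          # very large gain
wiener_gain = abs(wiener_inverse(small_h, 100.0)) # bounded gain
```

Where the transfer function is close to zero, the reciprocal amplifies by about a factor of 1000, while the regularised form keeps the gain below 1. At high SNR the two agree.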
Abstract:
The present invention relates to methods and apparatus for encoding an HOA signal representation (c(t)) of a sound field having an order of N and a number O=(N+1)² of coefficient sequences to a mezzanine HOA signal representation (wMEZZ(t)). The present invention further relates to methods and apparatus for decoding a reconstructed HOA signal representation from the mezzanine HOA signal representation.
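As a quick illustration of how the number of coefficient sequences grows with the order N, the relation O=(N+1)² can be tabulated. The function name is hypothetical.

```python
def num_hoa_coefficient_sequences(order):
    """Return O = (N + 1)^2 for an HOA representation of order N."""
    if order < 0:
        raise ValueError("HOA order must be non-negative")
    return (order + 1) ** 2


# Coefficient counts for orders N = 0..4.
counts = [num_hoa_coefficient_sequences(n) for n in range(5)]
```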
Abstract:
Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels contain either directional signals or additional coefficient sequences of the ambient HOA component, depending on what will result in optimum perceptual quality. This processing can change on a frame-by-frame basis.
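The per-frame split of a fixed channel budget can be sketched as follows. The abstract leaves the decision to whatever yields the best perceptual quality, so the rule used here is only an assumed stand-in: give free channels to detected directional signals first and fill the rest with additional ambient coefficient sequences. All names are hypothetical.

```python
def assign_channels(total_channels, min_ambient, num_directional):
    """Split a fixed channel budget for one frame (illustrative sketch).

    total_channels  -- fixed number of transport channels
    min_ambient     -- minimum number of ambient HOA coefficient sequences
    num_directional -- directional signals detected in this frame
    """
    if min_ambient > total_channels:
        raise ValueError("minimum ambient channels exceed the budget")
    free = total_channels - min_ambient
    # Assumed rule: directional signals take priority over extra ambient.
    directional = min(num_directional, free)
    return {
        "ambient_min": min_ambient,
        "directional": directional,
        "ambient_extra": free - directional,
    }


# The assignment may differ from frame to frame.
frame_a = assign_channels(8, 4, 2)
frame_b = assign_channels(8, 4, 0)
```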