Abstract:
In general, techniques are described for transitioning an ambient higher-order ambisonic coefficient. A device comprising a memory and a processor may be configured to perform the techniques. The processor may obtain, from a frame of a bitstream of encoded audio data, a bit indicative of a reduced vector. The reduced vector may represent, at least in part, a spatial component of a sound field. The processor may also obtain, from the frame, a bit indicative of a transition of an ambient higher-order ambisonic coefficient. The ambient higher-order ambisonic coefficient may represent, at least in part, an ambient component of the sound field. The reduced vector may include a vector element associated with the ambient higher-order ambisonic coefficient in transition. The memory may be configured to store the frame of the bitstream.
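The two per-frame indicators described above can be illustrated with a small parsing sketch. This is a toy, not the actual bitstream syntax of the patented technique: the function name, field names, and bit positions are all assumptions made for illustration only.

```python
def parse_frame_flags(frame_byte: int) -> dict:
    """Extract two hypothetical single-bit indicators from a frame byte.

    Bit layout is an illustrative assumption, not a real syntax.
    """
    return {
        # Bit 0: a reduced vector (spatial component) is present.
        "has_reduced_vector": bool(frame_byte & 0b01),
        # Bit 1: an ambient HOA coefficient is in transition.
        "ambient_coeff_in_transition": bool((frame_byte >> 1) & 0b01),
    }
```

Under this assumed layout, a decoder would branch on the second flag to decide whether the reduced vector carries an extra element for the coefficient in transition.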
Abstract:
In general, techniques are described for compensating for error in decomposed representations of sound fields. In accordance with the techniques, a device comprising one or more processors may be configured to quantize one or more first vectors representative of one or more components of a sound field, and compensate for error introduced due to the quantization of the one or more first vectors in one or more second vectors that are also representative of the same one or more components of the sound field.
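The quantize-then-compensate idea can be sketched numerically. The toy below uses uniform scalar quantization on plain Python lists and folds the resulting error into a second vector so the element-wise sum is preserved; the function names and step size are illustrative assumptions, not the patented method.

```python
def quantize(vector, step=0.25):
    """Uniform scalar quantization of a vector (list of floats)."""
    return [round(x / step) * step for x in vector]

def compensate(first, second, step=0.25):
    """Quantize `first` and fold its quantization error into `second`.

    After compensation, first_q[i] + second_comp[i] equals
    first[i] + second[i], so the error does not accumulate in the
    reconstructed components.
    """
    first_q = quantize(first, step)
    error = [a - b for a, b in zip(first, first_q)]
    second_comp = [s + e for s, e in zip(second, error)]
    return first_q, second_comp
```

The design point the sketch captures: instead of discarding quantization error in the first set of vectors, the encoder redistributes it into other vectors representing the same components, keeping the combined representation closer to the original.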
Abstract:
In general, techniques are described for obtaining decomposed versions of spherical harmonic coefficients. In accordance with these techniques, a device comprising one or more processors may be configured to determine a first non-zero set of coefficients of a vector that represent a distinct component of a sound field, the vector having been decomposed from a plurality of spherical harmonic coefficients that describe the sound field.
Abstract:
In general, techniques are described for performing a vector-based synthesis with respect to higher order ambisonic coefficients (or, in other words, spherical harmonic coefficients). A device comprising a processor may be configured to perform the techniques. The processor may perform the vector-based synthesis with respect to a plurality of spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients, including directional information, and determine distinct and background directional information from the directional information. The processor may then reduce an order of the directional information associated with the background audio objects to generate transformed background directional information, and apply compensation to increase values of the transformed background directional information to preserve an overall energy of the sound field.
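The order-reduction-with-energy-compensation step lends itself to a small numeric sketch. Assuming an order-N ambisonic representation has (N+1)² coefficients, truncating to a lower order and then applying a single energy-matching gain might look like the following toy; the function name and the use of one uniform gain are illustrative assumptions rather than the patented compensation scheme.

```python
import math

def reduce_order_with_energy_compensation(hoa, target_order):
    """Truncate an HOA coefficient vector to `target_order` and scale
    the surviving coefficients so overall energy is preserved.

    An order-N vector is assumed to hold (N + 1) ** 2 coefficients.
    """
    keep = (target_order + 1) ** 2
    truncated = hoa[:keep]
    e_full = sum(c * c for c in hoa)
    e_trunc = sum(c * c for c in truncated)
    if e_trunc == 0.0:
        return truncated  # nothing to rescale
    gain = math.sqrt(e_full / e_trunc)
    return [gain * c for c in truncated]
```

The rescaling illustrates why compensation increases the values of the reduced-order information: energy carried by the discarded higher-order coefficients would otherwise be lost.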
Abstract:
In general, techniques are described for specifying spherical harmonic coefficients in a bitstream. A device comprising one or more processors may perform the techniques. The processors may be configured to identify a plurality of hierarchical elements, included in the bitstream, that describe a sound field. The processors may further be configured to parse the bitstream to determine the identified plurality of hierarchical elements.
Abstract:
In general, techniques are described for transforming spherical harmonic coefficients. A device comprising one or more processors may perform the techniques. The processors may be configured to parse the bitstream to determine transformation information describing how the sound field was transformed to reduce a number of the plurality of hierarchical elements that provide information relevant in describing the sound field. The processors may further be configured to, when reproducing the sound field based on those of the plurality of hierarchical elements that provide information relevant in describing the sound field, transform the sound field based on the transformation information to reverse the transformation performed to reduce the number of the plurality of hierarchical elements.
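As a concrete toy for "reverse the transformation": if the signaled transformation information were, say, a rotation of the sound field, the decoder would apply the inverse rotation on playback. The 2-D rotation and the `transformation_info` dictionary below are purely illustrative assumptions, not the actual hierarchical-element syntax or the patented transform.

```python
import math

def rotate(point, angle):
    """Rotate a 2-D point (x, y) by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def reverse_transform(point, transformation_info):
    """Undo a hypothetical encoder-side rotation signaled in the
    bitstream by applying the inverse (negated) rotation."""
    return rotate(point, -transformation_info["rotation_angle"])
```

The point of the sketch is only the round trip: whatever invertible transform the encoder applied to shrink the set of relevant hierarchical elements, the decoder must apply its inverse before reproduction.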
Abstract:
In general, techniques are described for compensating for loudspeaker positions using hierarchical three-dimensional (3D) audio coding. An apparatus comprising one or more processors may perform the techniques. The processors may be configured to perform a first transform that is based on a spherical wave model on a first set of audio channel information for a first geometry of speakers to generate a first hierarchical set of elements that describes a sound field. The processors may further be configured to perform a second transform in a frequency domain on the first hierarchical set of elements to generate a second set of audio channel information for a second geometry of speakers.
Abstract:
Systems, methods, and apparatus for backward-compatible coding of a set of basis function coefficients that describe a sound field are presented.
Abstract:
In general, techniques are described for compressing decomposed representations of a sound field. A device comprising a memory and processing circuitry may be configured to perform the techniques. The memory may be configured to store a bitstream representative of scene-based audio data, the scene-based audio data comprising ambisonic coefficients representative of a soundfield. The processing circuitry may be configured to process the bitstream to extract foreground components and corresponding foreground directional information, dequantize the corresponding foreground directional information to obtain corresponding dequantized foreground directional information, and obtain, based on the foreground components and the corresponding dequantized foreground directional information, a reconstructed version of the scene-based audio data.
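The extract-dequantize-reconstruct pipeline can be sketched as a sum of rank-1 terms: each foreground signal (one value per sample) multiplied by its dequantized direction vector (one value per ambisonic coefficient). Everything below is an illustrative assumption (uniform dequantization step, list-of-lists layout, function names), not the coded syntax.

```python
def dequantize(indices, step=0.125):
    """Map hypothetical quantization indices back to vector values."""
    return [i * step for i in indices]

def reconstruct(foreground_signals, quantized_directions, step=0.125):
    """Rebuild HOA coefficients as a sum of rank-1 terms:
    for each foreground component, signal[t] * direction[k] is added
    into coefficient k at sample t."""
    directions = [dequantize(q, step) for q in quantized_directions]
    num_coeffs = len(directions[0])
    num_samples = len(foreground_signals[0])
    hoa = [[0.0] * num_samples for _ in range(num_coeffs)]
    for sig, v in zip(foreground_signals, directions):
        for k in range(num_coeffs):
            for t in range(num_samples):
                hoa[k][t] += v[k] * sig[t]
    return hoa
```

A full decoder would also add back the ambient (background) components; the sketch covers only the foreground path the abstract describes.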
Abstract:
An example audio decoding device includes a memory configured to store at least a portion of a coded audio bitstream; and one or more processors configured to: decode, based on the coded audio bitstream, a representation of a soundfield; decode, based on the coded audio bitstream, a syntax element indicating a selection of either a head-related transfer function (HRTF) or a binaural room impulse response (BRIR); and render, using the selected HRTF or BRIR, speaker feeds from the soundfield.
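The renderer-selection logic can be sketched in a few lines: a one-bit syntax element picks between an HRTF (anechoic head response) and a BRIR (head plus room response), and the selected impulse response is then convolved with the feed. The bit semantics, function names, and naive direct-form convolution below are illustrative assumptions, not the coded syntax or a production renderer.

```python
def select_renderer(syntax_element, hrtf, brir):
    """Pick the signaled binaural response.

    Assumed semantics: 0 selects the HRTF, 1 selects the BRIR.
    """
    return brir if syntax_element == 1 else hrtf

def render_binaural(samples, impulse_response):
    """Naive direct-form convolution of a mono feed with the
    selected impulse response."""
    out = [0.0] * (len(samples) + len(impulse_response) - 1)
    for i, s in enumerate(samples):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out
```

In practice each ear would use its own (left/right) response and the convolution would be done with FFT-based filtering, but the control flow (decode bit, select response, filter feeds) is the same.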