Abstract:
Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during “run time,” during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment.
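The two-stage structure described above (pre-compute per-virtual-source gains at set-up, then sum contributions at run time) can be illustrated with a short sketch. The inverse-distance panning rule, the box-shaped object volume test, and all function names below are illustrative assumptions; the abstract does not specify the actual gain law.

```python
import numpy as np

def precompute_virtual_source_gains(virtual_source_positions, speaker_positions):
    """Set-up stage: one gain per (virtual source, output channel) pair."""
    vs = np.asarray(virtual_source_positions, dtype=float)
    spk = np.asarray(speaker_positions, dtype=float)
    gains = np.zeros((len(vs), len(spk)))
    for i, pos in enumerate(vs):
        w = 1.0 / (np.linalg.norm(spk - pos, axis=1) + 1e-6)  # assumed inverse-distance weighting
        gains[i] = w / np.linalg.norm(w)                      # power-normalized per virtual source
    return gains

def render_object_gains(obj_position, obj_size, virtual_source_positions, precomputed_gains):
    """Run-time stage: sum the pre-computed gains of the virtual sources that
    fall inside the volume defined by the object's position and size."""
    vs = np.asarray(virtual_source_positions, dtype=float)
    inside = np.all(np.abs(vs - np.asarray(obj_position)) <= obj_size / 2.0, axis=1)
    if not inside.any():
        return np.zeros(precomputed_gains.shape[1])
    g = precomputed_gains[inside].sum(axis=0)
    return g / (np.linalg.norm(g) + 1e-12)                    # one gain per output channel
```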
Abstract:
The positions of a plurality of speakers at a media consumption site are determined. Audio information in an object-based format is received. A gain adjustment value for a sound content portion in the object-based format may be determined based on the position of the sound content portion and the positions of the plurality of speakers. Audio information in a ring-based channel format is received. A gain adjustment value for each ring-based channel in a set of ring-based channels may be determined based on the ring to which the ring-based channel belongs and the positions of the speakers at the media consumption site.
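A minimal sketch of the two gain-determination paths is given below. The inverse-distance rule for object-based content and the per-ring weight table for ring-based channels are illustrative assumptions only; the abstract does not state the actual formulas.

```python
import numpy as np

# Assumed per-ring weights for ring-based channels (hypothetical values).
RING_GAIN = {"lower": 0.8, "middle": 1.0, "upper": 0.9}

def object_gain(sound_position, speaker_positions):
    """One gain per speaker for an object-based sound content portion,
    based on its position and the measured speaker positions."""
    d = np.linalg.norm(np.asarray(speaker_positions, float) -
                       np.asarray(sound_position, float), axis=1) + 1e-6
    g = 1.0 / d
    return g / g.sum()

def ring_channel_gain(ring_name, speaker_positions):
    """Gain for a ring-based channel, based on the ring it belongs to and on
    how many physical speakers are available at the media consumption site."""
    base = RING_GAIN.get(ring_name, 1.0)
    return base / max(len(speaker_positions), 1)
```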
Abstract:
Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.
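The sketch below shows what authored, speaker-zone-referenced metadata might look like and how it could be resolved against a particular reproduction layout at render time. The zone names, field names, and speaker labels are hypothetical; the abstract only states that metadata is created with reference to speaker zones and later mapped onto the actual speaker layout.

```python
# Hypothetical authored metadata for one audio object.
object_metadata = {
    "object_id": 7,
    "position": {"x": 0.25, "y": 0.9, "z": 0.0},   # normalized room coordinates
    "speaker_zone_constraints": ["left_surround", "right_surround"],
    "size": 0.1,
}

# A particular reproduction environment supplies its own layout; each authored
# zone is resolved to the speakers actually present in that zone.
reproduction_layout = {
    "left_surround": ["Lss", "Lrs"],
    "right_surround": ["Rss", "Rrs"],
}

active_speakers = [spk
                   for zone in object_metadata["speaker_zone_constraints"]
                   for spk in reproduction_layout.get(zone, [])]
```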
Abstract:
Input audio data, including first microphone audio signals and second microphone audio signals output by a pair of coincident, vertically-stacked directional microphones, may be received. An azimuthal angle corresponding to a sound source location may be determined, based at least in part on an intensity difference between the first microphone audio signals and the second microphone audio signals. An elevation angle corresponding to a sound source location may be determined, based at least in part on a temporal difference between the first microphone audio signals and the second microphone audio signals. Output audio data, including at least one audio object corresponding to a sound source, may be generated. The audio object may include audio object signals and associated audio object metadata. The audio object metadata may include at least audio object location data corresponding to the sound source location.
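The estimation described above can be sketched as follows: azimuth from the level (intensity) difference between the two capsules, elevation from their arrival-time difference. The capsule spacing, the degrees-per-dB mapping, and the correlation-based delay estimate are illustrative assumptions, not the actual method.

```python
import numpy as np

SAMPLE_RATE = 48000
SPACING_M = 0.02          # assumed vertical spacing of the stacked capsules
SPEED_OF_SOUND = 343.0

def estimate_azimuth(sig_top, sig_bottom):
    """Azimuth from the intensity difference between the directional capsules."""
    ild_db = 10 * np.log10((np.mean(sig_top ** 2) + 1e-12) /
                           (np.mean(sig_bottom ** 2) + 1e-12))
    return float(np.clip(ild_db * 6.0, -90.0, 90.0))   # assumed 6 deg/dB mapping

def estimate_elevation(sig_top, sig_bottom):
    """Elevation from the temporal difference between the capsules."""
    corr = np.correlate(sig_top, sig_bottom, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_bottom) - 1)
    tdoa = lag / SAMPLE_RATE
    sin_el = np.clip(tdoa * SPEED_OF_SOUND / SPACING_M, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_el)))
```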
Abstract:
Some disclosed methods may involve receiving audio reproduction data and determining, based on the audio reproduction data, a sound source location at which a sound is to be rendered. A near-field gain and a far-field gain may be based, at least in part, on a sound source distance between the sound source location and a reproduction environment location. Room speaker feed signals may be based, at least in part, on room speaker positions, the sound source location and the far-field gain. Near-field speaker feed signals may be based, at least in part, on the near-field gain, the sound source location and a position of near-field speakers.
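A minimal sketch of the near-field/far-field split is shown below, assuming a distance-driven equal-power crossfade and an inverse-distance panner. The crossfade bounds and the panning rule are illustrative assumptions only.

```python
import numpy as np

NEAR_M, FAR_M = 0.3, 1.5   # assumed crossfade region around the reproduction environment location

def near_far_gains(source_distance):
    """Return (near_field_gain, far_field_gain) for a given source distance."""
    x = np.clip((source_distance - NEAR_M) / (FAR_M - NEAR_M), 0.0, 1.0)
    return float(np.cos(x * np.pi / 2)), float(np.sin(x * np.pi / 2))  # equal-power fade

def speaker_feed_gains(source_position, room_speaker_positions,
                       near_field_speaker_positions, environment_position=(0.0, 0.0, 0.0)):
    src = np.asarray(source_position, dtype=float)
    distance = np.linalg.norm(src - np.asarray(environment_position, dtype=float))
    g_near, g_far = near_far_gains(distance)

    def pan(positions, overall_gain):
        w = 1.0 / (np.linalg.norm(np.asarray(positions, float) - src, axis=1) + 1e-6)
        return overall_gain * w / w.sum()

    # Room speaker feeds scale with the far-field gain; near-field speaker
    # feeds scale with the near-field gain, as described above.
    return pan(room_speaker_positions, g_far), pan(near_field_speaker_positions, g_near)
```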
Abstract:
An importance metric, based at least in part on an energy metric, may be determined for each of a plurality of received audio objects. Some methods may involve: determining a global importance metric for all of the audio objects, based, at least in part, on a total energy value calculated by summing the energy metric of each of the audio objects; determining an estimated quantization bit depth and a quantization error for each of the audio objects; calculating a total noise metric for all of the audio objects, the total noise metric being based, at least in part, on a total quantization error corresponding with the estimated quantization bit depth; calculating a total signal-to-noise ratio corresponding with the total noise metric and the total energy value; and determining a final quantization bit depth for each of the audio objects by applying a signal-to-noise ratio threshold to the total signal-to-noise ratio.
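The bit-depth selection loop described above might look roughly like the sketch below. The starting depth, the ~6 dB-per-bit quantization-noise model, the threshold value, and the "give a bit to the noisiest object" refinement rule are all illustrative assumptions.

```python
import numpy as np

SNR_THRESHOLD_DB = 60.0    # assumed signal-to-noise ratio threshold

def quantization_noise(energy, bits):
    """Assumed noise model: roughly 6.02 dB of SNR per quantization bit."""
    return energy / (10 ** (6.02 * bits / 10))

def allocate_bit_depths(object_energies, initial_bits=8, max_bits=24):
    energies = np.asarray(object_energies, dtype=float)
    total_energy = energies.sum()                       # basis of the global importance metric
    bits = np.full(len(energies), initial_bits)         # estimated quantization bit depths
    while True:
        per_object_noise = [quantization_noise(e, b) for e, b in zip(energies, bits)]
        total_noise = sum(per_object_noise)             # total noise metric
        total_snr_db = 10 * np.log10(total_energy / (total_noise + 1e-30))
        if total_snr_db >= SNR_THRESHOLD_DB or np.all(bits >= max_bits):
            return bits                                 # final per-object bit depths
        # Otherwise refine: add one bit to the object contributing the most noise.
        bits[int(np.argmax(per_object_noise))] += 1
```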
Abstract:
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of a channel name, and object-based streams have location information encoded through location expressions in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) so as to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
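The channel-based versus object-based distinction drawn above can be illustrated with a small sketch of per-stream metadata and a renderer that dispatches on stream type. The field names, speaker labels, and nearest-speaker rule are hypothetical.

```python
# Hypothetical per-stream metadata: channel-based streams carry a channel name,
# object-based streams carry an allocentric (room-relative) position.
streams = [
    {"type": "channel", "channel_name": "L",         "audio": "stream_0.pcm"},
    {"type": "object",  "position": [0.7, 0.2, 0.5], "audio": "stream_1.pcm"},
]

# The playback environment is described by its own speaker positions, in the
# same normalized room coordinates used by the object metadata.
room_speakers = {"L": [0.0, 0.0, 0.0], "R": [1.0, 0.0, 0.0], "Ltm": [0.25, 0.5, 1.0]}

def target_speakers(stream):
    if stream["type"] == "channel":
        # Channel-based: route to the named speaker if the room has it.
        name = stream["channel_name"]
        return [name] if name in room_speakers else []
    # Object-based: map the allocentric position onto the available speakers
    # (here, simply the nearest one, purely for illustration).
    px, py, pz = stream["position"]
    return [min(room_speakers,
                key=lambda s: sum((a - b) ** 2
                                  for a, b in zip(room_speakers[s], (px, py, pz))))]
```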
Abstract:
Embodiments are described for a method of rendering audio for playback through headphones comprising receiving digital audio content, receiving binaural rendering metadata generated by an authoring tool processing the received digital audio content, receiving playback metadata generated by a playback device, and combining the binaural rendering metadata and playback metadata to optimize playback of the digital audio content through the headphones.
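A minimal sketch of combining authoring-side binaural rendering metadata with device-side playback metadata is shown below. The field names and the simple override/offset rules are illustrative assumptions; the abstract does not specify how the two metadata sets are merged.

```python
# Hypothetical metadata produced by the authoring tool and the playback device.
binaural_metadata = {"hrtf_profile": "generic", "room_model": "studio",
                     "object_gains_db": {"dialog": 0.0, "effects": -2.0}}
playback_metadata = {"headphone_eq": "open_back", "listener_hrtf": "personalized",
                     "output_level_db": -6.0}

def combine_metadata(binaural, playback):
    combined = dict(binaural)
    # Prefer a listener-specific HRTF reported by the playback device, if any.
    if playback.get("listener_hrtf"):
        combined["hrtf_profile"] = playback["listener_hrtf"]
    # Fold the device output level into every authored object gain.
    offset = playback.get("output_level_db", 0.0)
    combined["object_gains_db"] = {name: gain + offset
                                   for name, gain in binaural["object_gains_db"].items()}
    combined["headphone_eq"] = playback.get("headphone_eq")
    return combined
```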