Abstract:
In some embodiments, a method (typically performed by a game console) generates an object-based audio program that is indicative of game audio content (audio content pertaining to play of or events in a game, and optionally also other information regarding the game) and that includes at least one audio object channel and at least one speaker channel. Other embodiments are a game console configured to generate such an object-based audio program. Some embodiments implement object clustering, in which audio content of input objects is mixed to generate at least one clustered audio object, or audio content of at least one input object is mixed with speaker channel audio. In response to the program, a spatial rendering system (e.g., external to the game console) may operate with knowledge of the playback speaker configuration to generate speaker feeds indicative of a spatial mix of the program's speaker and object channel content.
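The object clustering step described above can be illustrated with a minimal sketch. The abstract does not say how a clustered object's position is chosen, so the energy-weighted centroid used here is an assumption, and the names `AudioObject` and `cluster_objects` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioObject:
    """Hypothetical container: mono samples plus an (x, y, z) position."""
    samples: List[float]
    position: Tuple[float, float, float]


def cluster_objects(objects: List[AudioObject]) -> AudioObject:
    """Mix several input objects into one clustered object.

    Audio is summed sample by sample. The clustered position is an
    energy-weighted centroid, which is an assumption made for this sketch.
    """
    if not objects:
        raise ValueError("at least one input object is required")
    n = len(objects[0].samples)
    if any(len(o.samples) != n for o in objects):
        raise ValueError("all objects must have the same number of samples")

    # Mix the audio content of the input objects.
    mixed = [sum(o.samples[i] for o in objects) for i in range(n)]

    # Weight each object's position by its signal energy.
    energies = [sum(s * s for s in o.samples) for o in objects]
    total = sum(energies)
    if total == 0.0:
        weights = [1.0 / len(objects)] * len(objects)
    else:
        weights = [e / total for e in energies]

    position = tuple(
        sum(w * o.position[k] for w, o in zip(weights, objects))
        for k in range(3)
    )
    return AudioObject(samples=mixed, position=position)
```

Mixing an input object into speaker channel audio would follow the same summing pattern, with the object's samples scaled by per-speaker gains before being added to each speaker channel.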
Abstract:
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated metadata that specifies whether the stream is a channel-based or an object-based stream. Channel-based streams have rendering information encoded by means of channel name, and object-based streams have location information encoded through location expressions in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
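The idea of tagging each mono stream as channel-based or object-based and packaging all streams into one serial bitstream can be sketched as follows. The byte layout (a length-prefixed JSON header followed by 32-bit float samples), the metadata keys `kind`, `name`, and `position`, and the function names are all hypothetical choices for illustration; the abstract does not describe the codec's actual format.

```python
import json
import struct


def pack_streams(streams):
    """Serialize (metadata, samples) pairs into one byte string.

    Assumed layout per stream: two little-endian uint32 values
    (header length in bytes, sample count), a JSON metadata header,
    then the samples as little-endian 32-bit floats.
    """
    out = bytearray()
    for meta, samples in streams:
        header = json.dumps(meta, sort_keys=True).encode("utf-8")
        out += struct.pack("<II", len(header), len(samples))
        out += header
        out += struct.pack(f"<{len(samples)}f", *samples)
    return bytes(out)


def unpack_streams(blob):
    """Inverse of pack_streams: recover the (metadata, samples) pairs."""
    streams = []
    offset = 0
    while offset < len(blob):
        header_len, count = struct.unpack_from("<II", blob, offset)
        offset += 8
        meta = json.loads(blob[offset:offset + header_len].decode("utf-8"))
        offset += header_len
        samples = list(struct.unpack_from(f"<{count}f", blob, offset))
        offset += 4 * count
        streams.append((meta, samples))
    return streams


# A channel-based stream is identified by channel name; an object-based
# stream carries a room-relative (allocentric) position instead.
example = [
    ({"kind": "channel", "name": "L"}, [0.5, -0.25]),
    ({"kind": "object", "position": [0.5, 0.0, 1.0]}, [0.125, 0.0]),
]
```

A renderer reading such a bitstream would route `channel` streams to the named speaker and map each `object` position onto whatever speakers the room actually has.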
Abstract:
Embodiments are described for a method and system of rendering and playing back spatial audio content using a channel-based format. Spatial audio content that is played back through legacy channel-based equipment is transformed into the appropriate channel-based format, resulting in the loss of certain positional information carried by the audio objects and positional metadata that make up the spatial audio content. To retain this information for use in spatial audio equipment even after the audio content is rendered as channel-based audio, certain metadata generated by the spatial audio processor is incorporated into the channel-based data. The channel-based audio can then be sent to either a channel-based audio decoder or a spatial audio decoder. The spatial audio decoder processes the metadata to recover at least some of the positional information that was lost during the down-mix operation, upmixing the channel-based audio content back to spatial audio content for optimal playback in a spatial audio environment.
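The round trip described above can be sketched for the simplest possible case: one audio object rendered into stereo. Constant-power panning, the `pan` metadata field, and the function names are assumptions made for illustration and are not taken from the abstract; with several overlapping objects, only partial recovery would generally be possible.

```python
import math


def downmix(samples, pan):
    """Render one object to stereo and embed its pan position as metadata.

    pan is in [0, 1], where 0 is fully left and 1 is fully right.
    Constant-power gains (cos, sin) are an assumption of this sketch.
    """
    theta = pan * math.pi / 2.0
    gain_l, gain_r = math.cos(theta), math.sin(theta)
    return {
        "L": [gain_l * s for s in samples],
        "R": [gain_r * s for s in samples],
        "metadata": {"pan": pan},
    }


def legacy_decode(program):
    """A channel-based decoder ignores the metadata and plays L and R."""
    return program["L"], program["R"]


def spatial_decode(program):
    """A spatial decoder uses the metadata to upmix back to an object.

    Because gain_l**2 + gain_r**2 == 1, projecting the stereo pair onto
    the gain vector recovers the original mono signal in this
    single-object case.
    """
    pan = program["metadata"]["pan"]
    theta = pan * math.pi / 2.0
    gain_l, gain_r = math.cos(theta), math.sin(theta)
    recovered = [
        gain_l * left + gain_r * right
        for left, right in zip(program["L"], program["R"])
    ]
    return recovered, pan
```

The same program dictionary serves both decoders: the legacy path reads only the channel data, while the spatial path also reads the embedded metadata.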