Abstract:
Methods for detecting whether a rendered version of a specified seamless connection (“SSC”) at a connection point between two audio segment sequences results in an audible discontinuity, and methods for analyzing at least one SSC between audio segment sequences to determine whether a renderable version of each SSC would have an audible discontinuity at the connection point when rendered. In appropriate cases, for an SSC having a renderable version that is determined to have an audible discontinuity when rendered, at least one audio segment of at least one segment sequence to be connected in accordance with the SSC is corrected in an effort to ensure that rendering of the SSC will result in a seamless connection without an audible discontinuity. Other aspects are editing systems configured to implement any of the methods, and storage media and rendering systems which store audio data generated in accordance with any of the methods.
Abstract:
A method of decomposing an L-by-N matrix, where L is less than or equal to N, into a sequence of N-by-N unit primitive matrices and a permutation matrix. The product of the primitive matrices and the permutation matrix contains L rows that are substantially close to the provided L-by-N matrix. The permutation matrix and the indices of the non-trivial rows in the primitive matrices are chosen to limit the coefficient values in the primitive matrices.
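The role of a single primitive matrix can be illustrated with a minimal sketch. It assumes that a "unit primitive matrix" is an identity matrix except for one non-trivial row whose diagonal entry is 1, so that applying it changes only one channel. The function names, the integer rounding step, and the sample values are hypothetical and are not taken from the abstract; the decomposition procedure itself is not shown.

```python
def apply_primitive(samples, row, coeffs):
    """Apply one unit primitive matrix to a vector of integer samples.

    Assumption: the matrix is the identity except for `row`, whose
    diagonal entry is 1 and whose other entries are given by `coeffs`
    (the entry at index `row` is ignored).
    """
    out = list(samples)
    update = sum(c * samples[j] for j, c in enumerate(coeffs) if j != row)
    # Rounding keeps the output integer-valued; the same rounded update
    # can be recomputed later because the other channels are unchanged.
    out[row] = samples[row] + round(update)
    return out


def invert_primitive(samples, row, coeffs):
    """Undo apply_primitive exactly by subtracting the same rounded update."""
    out = list(samples)
    update = sum(c * samples[j] for j, c in enumerate(coeffs) if j != row)
    out[row] = samples[row] - round(update)
    return out


x = [100, -37, 12]
coeffs = [0.5, 1.0, -0.25]          # non-trivial row for channel 1
y = apply_primitive(x, 1, coeffs)
x_back = invert_primitive(y, 1, coeffs)
```

Under this assumption each primitive matrix is exactly invertible even with rounding, which is one reason small coefficient values in the non-trivial rows are desirable.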
Abstract:
Methods for generating encoded audio programs indicative of N channels of discontinuity-corrected, encoded audio content, including by applying discontinuity correction values to multi-channel audio content, and for rendering such a program (e.g., to generate a discontinuity-corrected M-channel mix of content indicated by the program). Other aspects are systems or devices (e.g., encoders or decoders, or rendering systems) configured to implement any of the methods.
Abstract:
A method of encoding adaptive audio, comprising receiving N objects and associated spatial metadata that describes the continuing motion of these objects, and partitioning the audio into segments based on the spatial metadata. The method encodes adaptive audio having objects and channel beds by capturing the continuing motion of N objects in a time-varying matrix trajectory comprising a sequence of matrices, coding coefficients of the time-varying matrix trajectory in spatial metadata to be transmitted via a high-definition audio format for rendering the adaptive audio through M output channels, and segmenting the sequence of matrices into a plurality of sub-segments based on the spatial metadata, wherein the sub-segments are configured to facilitate coding of one or more characteristics of the adaptive audio.
Abstract:
Audio processing methods may involve receiving audio data corresponding to a plurality of audio channels. The audio data may include a frequency domain representation corresponding to filterbank coefficients of an audio encoding or processing system. A decorrelation process may be performed with the same filterbank coefficients used by the audio encoding or processing system. The decorrelation process may be performed without converting coefficients of the frequency domain representation to another frequency domain or time domain representation. The decorrelation process may involve selective or signal-adaptive decorrelation of specific channels and/or specific frequency bands. The decorrelation process may involve applying a decorrelation filter to a portion of the received audio data to produce filtered audio data. The decorrelation process may involve using a non-hierarchal mixer to combine a direct portion of the received audio data with the filtered audio data according to spatial parameters.
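The final mixing step can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give a mixing rule, so the power-preserving weights below (alpha for the direct portion and sqrt(1 - alpha^2) for the filtered portion) are a commonly used choice and not necessarily the one intended. The function name and the single scalar parameter `alpha` are hypothetical.

```python
import math


def mix_direct_and_filtered(direct, filtered, alpha):
    """Combine direct and decorrelation-filtered coefficients.

    Assumption: `alpha` in [0, 1] is derived from a spatial parameter,
    and the two inputs are uncorrelated with equal power, so the
    weights alpha and sqrt(1 - alpha^2) preserve the output power.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = math.sqrt(1.0 - alpha * alpha)
    return [alpha * x + beta * d for x, d in zip(direct, filtered)]


# Two orthogonal, unit-power toy coefficient vectors.
direct = [1.0, 0.0]
filtered = [0.0, 1.0]
mixed = mix_direct_and_filtered(direct, filtered, 0.6)
mixed_power = sum(v * v for v in mixed)
```

Because the mix operates directly on the coefficients, no conversion to another frequency-domain or time-domain representation is needed in this sketch.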
Abstract:
Methods that use interpolated primitive matrices to decode encoded audio to recover (losslessly) content of a multichannel audio program and/or to recover at least one downmix of such content, and encoding methods for generating such encoded audio. In some embodiments, a decoder performs interpolation on a set of seed primitive matrices to determine interpolated matrices for use in rendering channels of the program. Other aspects are a system or device configured to implement any embodiment of the method.
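A minimal sketch of the interpolation idea follows. It assumes linear interpolation between two seed matrices given as nested lists; the abstract does not specify the interpolation rule, so this choice, the function name, and the example values are hypothetical.

```python
def interpolate_matrices(seed_a, seed_b, t):
    """Linearly interpolate between two seed matrices.

    Assumption: `t` runs from 0.0 (at seed_a) to 1.0 (at seed_b).
    The form a + t * (b - a) leaves entries that are equal in both
    seeds unchanged, so identity rows and unit diagonals of primitive
    matrices are preserved exactly.
    """
    return [
        [a + t * (b - a) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(seed_a, seed_b)
    ]


# Two 2-by-2 primitive matrices whose non-trivial row is row 0.
seed_a = [[1.0, 0.2], [0.0, 1.0]]
seed_b = [[1.0, 0.6], [0.0, 1.0]]
midpoint = interpolate_matrices(seed_a, seed_b, 0.5)
```

In a scheme of this kind, only the seed matrices would need to be transmitted, while the decoder derives the intermediate matrices itself.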
Abstract:
A first vector quantization process may be applied to two or more parameter values along a first dimension of an N-dimensional parameter set to produce a first set of quantized values. Two or more parameter prediction values may be calculated for a second dimension of the N-dimensional parameter set based, at least in part, on one or more values of the first set of quantized values. Prediction residual values may be calculated based, at least in part, on the parameter prediction values. A second vector quantization process may be applied to the prediction residual values to produce a second set of quantized values. These processes may be extended to any number of dimensions. Corresponding inverse vector quantization processes may be performed.
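The sequence of steps can be sketched for a two-dimensional parameter set with two rows. The sketch assumes nearest-neighbor codebook search and a trivial predictor that copies the quantized first row; the abstract specifies neither, and the codebooks, function names, and values are hypothetical.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared error)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode(rows, cb_first, cb_resid):
    """Quantize row 0 directly, then quantize the prediction residual of row 1."""
    i0 = nearest(cb_first, rows[0])           # first vector quantization
    prediction = cb_first[i0]                 # assumed predictor: copy quantized row 0
    residual = [x - p for x, p in zip(rows[1], prediction)]
    i1 = nearest(cb_resid, residual)          # second vector quantization
    return i0, i1


def decode(i0, i1, cb_first, cb_resid):
    """Inverse process: rebuild row 0, then add the residual to the prediction."""
    row0 = list(cb_first[i0])
    row1 = [p + r for p, r in zip(row0, cb_resid[i1])]
    return [row0, row1]


cb_first = [(0.0, 0.0), (1.0, 1.0)]
cb_resid = [(0.0, 0.0), (0.25, -0.25)]
params = [(0.9, 1.1), (1.2, 0.8)]
indices = encode(params, cb_first, cb_resid)
reconstructed = decode(indices[0], indices[1], cb_first, cb_resid)
```

Because the prediction is formed from quantized values rather than the original ones, the decoder can reproduce the same prediction from the transmitted indices alone.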
Abstract:
Audio characteristics of audio data corresponding to a plurality of audio channels may be determined. The audio characteristics may include spatial parameter data. Decorrelation filtering processes for the audio data may be based, at least in part, on the audio characteristics. The decorrelation filtering processes may cause a specific inter-decorrelation signal coherence (“IDC”) between channel-specific decorrelation signals for at least one pair of channels. The channel-specific decorrelation signals may be received and/or determined. Inter-channel coherence (“ICC”) between a plurality of audio channel pairs may be controlled. Controlling ICC may involve receiving an ICC value and/or determining an ICC value based, at least partially, on the spatial parameter data. A set of IDC values may be based, at least partially, on the set of ICC values. A set of channel-specific decorrelation signals, corresponding with the set of IDC values, may be synthesized by performing operations on the filtered audio data.
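One way to obtain a pair of decorrelation signals with a prescribed coherence is sketched below. It is an illustration only: it assumes two uncorrelated, equal-power seed signals (standing in for filtered audio data) and a simple two-signal mixing rule, neither of which is stated in the abstract. The function names and test vectors are hypothetical.

```python
import math


def coherence(a, b):
    """Normalized correlation of two real-valued signals (no mean removal)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def synthesize_pair(seed1, seed2, idc):
    """Build two signals whose coherence equals `idc`.

    Assumption: seed1 and seed2 are uncorrelated with equal power,
    and -1 <= idc <= 1.
    """
    gain = math.sqrt(1.0 - idc * idc)
    first = list(seed1)
    second = [idc * s1 + gain * s2 for s1, s2 in zip(seed1, seed2)]
    return first, second


# Orthogonal, equal-energy toy seeds.
seed1 = [1.0, -1.0, 1.0, -1.0]
seed2 = [1.0, 1.0, -1.0, -1.0]
d1, d2 = synthesize_pair(seed1, seed2, 0.5)
measured_idc = coherence(d1, d2)
```

With more than two channels, a full set of IDC values would require a matrix-valued generalization of this mixing, which is beyond this sketch.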
Abstract:
Some audio processing methods may involve receiving audio data corresponding to a plurality of audio channels and determining audio characteristics of the audio data, which may include transient information. An amount of decorrelation for the audio data may be based, at least in part, on the audio characteristics. If a definite transient event is determined, a decorrelation process may be temporarily halted or slowed. Determining transient information may involve evaluating the likelihood and/or the severity of a transient event. In some implementations, determining transient information may involve evaluating a temporal power variation in the audio data. Explicit transient information may or may not be received with the audio data, depending on the implementation. Explicit transient information may include a transient control value corresponding to a definite transient event, a definite non-transient event or an intermediate transient control value.
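The idea of deriving a transient control value from temporal power variation can be sketched as follows. The block-power ratio, the thresholds `low` and `high`, and the linear scaling of the decorrelation amount are all hypothetical choices made for illustration; the abstract does not specify them.

```python
def block_power(block):
    """Mean squared value of one block of samples."""
    return sum(s * s for s in block) / len(block)


def transient_control(prev_block, cur_block, low=2.0, high=8.0, eps=1e-12):
    """Map the power ratio between consecutive blocks to a value in [0, 1].

    Assumed convention: 0.0 means a definite non-transient event,
    1.0 means a definite transient event, and values in between are
    intermediate transient control values.
    """
    ratio = block_power(cur_block) / (block_power(prev_block) + eps)
    if ratio <= low:
        return 0.0
    if ratio >= high:
        return 1.0
    return (ratio - low) / (high - low)


def decorrelation_amount(base_amount, control):
    """Reduce decorrelation as the transient control value rises."""
    return base_amount * (1.0 - control)


quiet = [0.1, -0.1, 0.1, -0.1]
loud = [1.0, -1.0, 1.0, -1.0]
attack = transient_control(quiet, loud)
steady = transient_control(loud, loud)
halted = decorrelation_amount(0.7, attack)
```

Here a definite transient drives the decorrelation amount to zero, which corresponds to temporarily halting the process; intermediate values merely reduce it.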
Abstract:
Received audio data may include a first set of frequency coefficients and a second set of frequency coefficients. Spatial parameters for at least part of the second set of frequency coefficients may be estimated, based at least in part on the first set of frequency coefficients. The estimated spatial parameters may be applied to the second set of frequency coefficients to generate a modified second set of frequency coefficients. The first set of frequency coefficients may correspond to a first frequency range (for example, an individual channel frequency range) and the second set of frequency coefficients may correspond to a second frequency range (for example, a coupled channel frequency range). Combined frequency coefficients of a composite coupling channel may be based on frequency coefficients of two or more channels. Cross-correlation coefficients, between frequency coefficients of a first channel and the combined frequency coefficients, may be computed.
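The last two sentences can be illustrated with a small sketch. It assumes real-valued frequency coefficients, a composite coupling channel formed as the per-bin average of the contributing channels, and a normalized cross-correlation coefficient; the abstract leaves all three choices open, and the function names are hypothetical.

```python
import math


def composite_coupling_channel(channels):
    """Combine channels bin by bin (assumed here to be a simple average)."""
    count = len(channels)
    return [sum(bins) / count for bins in zip(*channels)]


def cross_correlation_coefficient(channel, composite):
    """Normalized cross-correlation of one channel with the composite."""
    num = sum(x * y for x, y in zip(channel, composite))
    den = math.sqrt(
        sum(x * x for x in channel) * sum(y * y for y in composite)
    )
    return num / den if den > 0.0 else 0.0


# Toy coefficients for two channels in the individual-channel range.
left = [1.0, 0.0]
right = [0.0, 1.0]
composite = composite_coupling_channel([left, right])
rho_left = cross_correlation_coefficient(left, composite)
```

Coefficients of this kind, computed where individual channel data is available, could then inform spatial parameter estimates for the coupled frequency range.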