Abstract:
The methods and apparatus described herein optimally represent full 3D audio mixes (e.g., azimuth, elevation, and depth) as "sound scenes" in which the decoding process facilitates head tracking. Sound scene rendering can be modified for the listener's orientation (e.g., yaw, pitch, roll) and 3D position (e.g., x, y, z). This provides the ability to treat sound scene source positions as 3D positions instead of being restricted to positions relative to the listener. The systems and methods discussed herein can fully represent such scenes in any number of audio channels to provide compatibility with transmission through existing audio codecs such as DTS HD, yet carry substantially more information (e.g., depth, height) than a 7.1 channel mix.
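As an illustration of treating source positions as 3D positions rather than listener-relative ones, the following minimal sketch converts a source position into azimuth, elevation, and depth as seen by a listener who has moved and turned. The function name `listener_relative`, the axis convention (x forward, y left, z up), and the restriction to yaw-only rotation are assumptions made for this example and are not taken from the abstract.

```python
import math

def listener_relative(source, listener_pos, yaw_deg):
    """Return (azimuth_deg, elevation_deg, distance) of a source as seen by the listener.

    Assumed convention (hypothetical): x forward, y left, z up; azimuth is
    measured counter-clockwise from the listener's forward direction.
    Only yaw is handled; pitch and roll are left out to keep the sketch short.
    """
    # Translate into listener-centered coordinates (accounts for x, y, z position).
    dx = source[0] - listener_pos[0]
    dy = source[1] - listener_pos[1]
    dz = source[2] - listener_pos[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        return 0.0, 0.0, 0.0
    # Azimuth in the world frame, then compensate for head yaw.
    world_azimuth = math.degrees(math.atan2(dy, dx))
    azimuth = (world_azimuth - yaw_deg + 180.0) % 360.0 - 180.0
    elevation = math.degrees(math.asin(dz / distance))
    return azimuth, elevation, distance
```

For example, a source directly ahead of a listener who then turns 90 degrees to the left appears at an azimuth of -90 degrees (to the right) under this convention, while its distance is unchanged.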
Abstract:
Apparatus (103) comprising a processor configured to: receive a spatial audio signal associated with a microphone array (211) configured to provide spatial audio capture and at least one additional audio signal associated with an additional microphone (111), the at least one additional microphone signal having been delayed by a variable delay determined such that the audio signals are time aligned; receive a relative position between a first position associated with the microphone array (211) and a second position (111 (0), 111 (t)) associated with the additional microphone (111); generate at least two output audio channel signals by processing and mixing the spatial audio signal and the at least one additional audio signal based on the relative position between the first position and the second position such that the at least two output audio channel signals present an augmented audio scene.
Abstract:
A sound processing system, method and program product for estimating parameters from binaural audio data. A system is provided having: a system for inputting binaural audio; and a binaural signal analyzer (BICAM) that: performs autocorrelation on both the first channel and second channel to generate a pair of autocorrelation functions; performs a first layer cross-correlation between the first channel and second channel to generate a first layer cross-correlation function; removes the center peak from the first layer cross-correlation function and a selected autocorrelation function to create a modified pair; performs a second layer cross-correlation between the modified pair to determine a temporal mismatch; generates a resulting function by replacing the first layer cross-correlation function with the selected autocorrelation function using the temporal mismatch; and utilizes the resulting function to determine interaural time difference (ITD) parameters and interaural level difference (ILD) parameters of the direct sound components and reflected sound components.
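The first-layer cross-correlation step can be sketched in a few lines. This is a simplified illustration only, not the BICAM procedure: it omits the autocorrelation, center-peak removal, and second-layer stages, and simply takes the lag of the cross-correlation peak as an ITD estimate and a broadband energy ratio as an ILD estimate. The function names, the sign convention (positive lag means the right channel is delayed), and the `max_lag` parameter are assumptions for this example.

```python
import math

def cross_correlation(a, b, max_lag):
    """Return {lag: sum over n of a[n] * b[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(a)):
            j = i + lag
            if 0 <= j < len(b):
                total += a[i] * b[j]
        result[lag] = total
    return result

def estimate_itd(left, right, sample_rate, max_lag):
    """Estimate ITD in seconds as the lag of the cross-correlation peak.

    Positive values mean the right channel lags the left (assumed convention).
    """
    xcorr = cross_correlation(left, right, max_lag)
    best_lag = max(xcorr, key=xcorr.get)
    return best_lag / sample_rate

def estimate_ild(left, right):
    """Estimate a broadband ILD in dB as the left-to-right energy ratio."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)
```

With an impulse at sample 1 in the left channel and a half-amplitude impulse at sample 3 in the right channel, the peak falls at a lag of 2 samples and the energy ratio is 4, or roughly 6 dB. Reflections would add further peaks, which is the situation the layered analysis in the abstract is meant to resolve.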
Abstract:
The present invention describes a compact binaural recording method and system capable of recording sound and decoding it into a three-dimensional format, making it available for three-dimensional playback on headphone-type playback devices or conventional headphones, wherein said method and system offer a portable recording solution, preferably of millimetric dimensions, that can be incorporated into professional or household recording devices. Additionally, the present invention describes a computer program for binaural recording, as well as a method of manufacturing the recording system of the invention.
Abstract:
Embodiments of the invention relate generally to electrical and electronic hardware, computer software, wired and wireless network communications, and wearable/mobile computing devices configured to facilitate production and/or reproduction of spatial audio and/or one or more audio spaces. More specifically, disclosed are systems, components and methods to acoustically determine displacements of audio sources (or portions thereof), such as a subset of speaking users, for providing audio spaces and spatial sound field reproduction, for example, for a remote listener. In one embodiment, a media device includes transducers to emit audible acoustic signals into a region including one or more audio sources, acoustic probe transducers configured to emit ultrasonic signals, and acoustic sensors configured to sense received ultrasonic signals reflected from an audio source. A controller can determine a displacement of the audio source. Examples of displacement include locomotion, gesture-related motion and orientation changes.
Abstract:
The present invention provides methods and systems for digitally processing audio signals in two-channel audio systems and/or applications. In particular, the present invention includes a first filter structured to split a two-channel audio input signal into a low frequency signal and a higher frequency signal. An M/S splitter is then structured to split the higher frequency signal into a middle and a side signal. A detection module is then configured to create a detection signal from the middle signal, which is used in a compression module configured to modulate the side signal to create a gain-modulated side signal. A processing module is then structured to combine the low frequency signal, middle signal, and the gain-modulated side signal to form a final output signal.
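The signal chain described above (band split, M/S split, detection from the middle signal, gain modulation of the side signal, recombination) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented design: the one-pole low-pass crossover, the instantaneous peak detector, the static compression curve, and all function and parameter names (`alpha`, `threshold`, `ratio`) are hypothetical choices made for this example.

```python
def one_pole_lowpass(x, alpha):
    """Simple one-pole low-pass filter (assumed crossover for this sketch)."""
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y

def ms_encode(left, right):
    """Split a stereo pair into middle (sum) and side (difference) signals."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def modulate_side(mid, side, threshold, ratio):
    """Scale the side signal with a gain derived from the middle signal level."""
    out = []
    for m, s in zip(mid, side):
        level = abs(m)  # detection signal taken from the middle signal
        if level > threshold:
            # Static compression curve: excess over threshold is divided by ratio.
            gain = (threshold + (level - threshold) / ratio) / level
        else:
            gain = 1.0
        out.append(s * gain)
    return out

def process(left, right, alpha=0.1, threshold=0.5, ratio=4.0):
    """Band split, M/S split, side gain modulation, then recombination."""
    low_l = one_pole_lowpass(left, alpha)
    low_r = one_pole_lowpass(right, alpha)
    # Higher frequency signal is the remainder after removing the low band.
    high_l = [x - lo for x, lo in zip(left, low_l)]
    high_r = [x - lo for x, lo in zip(right, low_r)]
    mid, side = ms_encode(high_l, high_r)
    side = modulate_side(mid, side, threshold, ratio)
    out_l = [lo + m + s for lo, m, s in zip(low_l, mid, side)]
    out_r = [lo + m - s for lo, m, s in zip(low_r, mid, side)]
    return out_l, out_r
```

Because the higher band is formed by subtraction, the chain passes the input through unchanged whenever the middle signal stays below the threshold. When the middle signal is loud, the side signal is attenuated and the high-frequency stereo width narrows.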
Abstract:
A filter for equalizing the frequency response of loudspeaker systems includes at least one band filter section (11) comprising an n-order high boost or cut shelving filter (13) having a break point frequency, ω1, and an n-order low boost or cut shelving filter (15) having a break point frequency, ω2, wherein ω1 < ω2. The order, n, of at least one, and preferably both, of the shelving filters of the band filter sections can be selected for adjusting the slope of the shelving filter at one or both of its break point frequencies. The high and low n-order shelving filters forming the band filter sections have substantially the same gain and produce a resultant band gain for the band filter section. Gain correction is provided for the selectable n-order high shelving filter and n-order low shelving filter for correcting the resultant band gain to a base gain level.
Abstract:
In general, techniques are described for compressing decomposed representations of a sound field. A device comprising one or more processors may be configured to perform the techniques. The one or more processors may be configured to obtain a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.