Abstract:
Some disclosed methods involve receiving, by a control system, location control data from a sound source as the sound source emits sound in a plurality of sound source locations within an audio environment. Some such methods involve receiving, by the control system, direction of arrival data from each audio device of a plurality of audio devices in the audio environment. In some examples, each audio device of the plurality of audio devices includes a microphone array, and the direction of arrival data corresponds to microphone signals from the microphone arrays responsive to sound emitted by the sound source in the plurality of sound source locations. Some such methods involve estimating, by the control system, sound source locations and audio device locations based, at least in part, on the location control data and the direction of arrival data.
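The core geometric step can be illustrated on a simplified sub-problem: given known device positions and the direction-of-arrival angles they report, the sound source location is the least-squares intersection of the bearing lines. This is a minimal 2-D sketch of that triangulation, not the full joint device/source estimation the abstract describes; the function name and interfaces are illustrative.

```python
import numpy as np

def triangulate_source(device_positions, doa_angles):
    """Least-squares intersection of 2-D bearing lines.

    Each device at position p_i observes the source along the unit
    direction d_i = (cos a_i, sin a_i). The source location x minimises
    the summed squared distance to the lines x = p_i + t * d_i.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, a in zip(device_positions, doa_angles):
        d = np.array([np.cos(a), np.sin(a)])
        # Projector onto the complement of the bearing direction:
        # measures displacement perpendicular to the bearing line.
        P = np.eye(2) - np.outer(d, d)
        A += P
        b += P @ np.asarray(p, dtype=float)
    return np.linalg.solve(A, b)
```

With two devices at (0, 0) and (4, 0) reporting bearings of 45° and 135°, the lines intersect at (2, 2).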
Abstract:
A method for estimating a user's location in an environment may involve receiving output signals from each microphone of a plurality of microphones in the environment. At least two microphones of the plurality of microphones may be included in separate devices at separate locations in the environment and the output signals may correspond to a current utterance of a user. The method may involve determining multiple current acoustic features from the output signals of each microphone and applying a classifier to the multiple current acoustic features. Applying the classifier may involve applying a model trained on previously-determined acoustic features derived from a plurality of previous utterances made by the user in a plurality of user zones in the environment. The method may involve determining, based at least in part on output from the classifier, an estimate of the user zone in which the user is currently located.
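The classification step can be sketched with a deliberately simple stand-in for the trained model: a nearest-centroid classifier fitted on acoustic feature vectors from previous utterances, one centroid per user zone. The class name and feature representation are assumptions for illustration; the abstract does not specify the model type.

```python
import numpy as np

class ZoneClassifier:
    """Nearest-centroid stand-in for the trained model: one centroid of
    acoustic feature vectors per user zone, fitted on prior utterances."""

    def fit(self, features, zone_labels):
        self.zones = sorted(set(zone_labels))
        self.centroids = np.array([
            np.mean([f for f, z in zip(features, zone_labels) if z == zone],
                    axis=0)
            for zone in self.zones])
        return self

    def predict(self, feature_vector):
        # The estimated zone is the one whose centroid is closest.
        dists = np.linalg.norm(self.centroids - feature_vector, axis=1)
        return self.zones[int(np.argmin(dists))]
```

A current utterance's feature vector is then mapped to the zone whose training centroid it is nearest to.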
Abstract:
Some disclosed methods involve multi-band bass management. Some such examples may involve applying multiple high-pass and low-pass filter frequencies for bass management. Some disclosed methods treat at least some low-frequency signals as audio objects that can be panned. Some disclosed methods involve panning low and high frequencies separately. Following high-pass rendering, a power audit may determine a low-frequency deficit factor that is to be reproduced by subwoofers or other low-frequency-capable loudspeakers.
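The power audit can be sketched as a comparison of low-band energy before and after high-pass rendering: whatever low-frequency energy the high-passed speaker feeds no longer carry is the deficit to route to subwoofers. This FFT-based sketch assumes a single crossover frequency and a simple per-channel energy sum; the function name and parameters are illustrative.

```python
import numpy as np

def low_frequency_deficit(full_mix, hp_rendered, sr, crossover_hz=80.0):
    """Power audit after high-pass rendering: compare low-band energy in
    the original mix with what the high-passed speaker feeds retain, and
    return the deficit to be reproduced by low-frequency-capable speakers."""

    def low_band_power(x):
        spectrum = np.fft.rfft(x)
        freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
        # Energy in bins below the crossover, normalised by length.
        return np.sum(np.abs(spectrum[freqs < crossover_hz]) ** 2) / len(x)

    target = low_band_power(full_mix)
    reproduced = sum(low_band_power(ch) for ch in hp_rendered)
    return max(target - reproduced, 0.0)
```

If the rendered channels carry no content below the crossover, the deficit equals the mix's full low-band energy; if they carry all of it, the deficit is zero.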
Abstract:
Spherical microphone arrays capture a three-dimensional sound field (P(Ωc, t)) for generating an Ambisonics representation (Anm(t)), where the pressure distribution on the surface of the sphere is sampled by the capsules of the array. The impact of the microphones on the captured sound field is removed using the inverse microphone transfer function. Equalising the transfer function of the microphone array is challenging because the reciprocal of the transfer function produces high gains wherever the transfer function is small, and those small values are dominated by transducer noise. The invention minimises that noise by applying Wiener filter processing in the frequency domain, with the processing controlled automatically, per wave number, by the signal-to-noise ratio of the microphone array.
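The regularised inverse can be sketched per frequency bin with the standard Wiener form: instead of the plain reciprocal 1/H, the gain conj(H) / (|H|² + 1/SNR) is applied, which approaches 1/H where the SNR is high and rolls off automatically where |H| is small and noise dominates. This is a generic Wiener-filter sketch under that assumption, not the patent's specific per-wave-number formulation.

```python
import numpy as np

def wiener_equaliser(transfer, snr):
    """Wiener-style regularised inverse of an array transfer function.

    transfer: complex transfer-function values, one per frequency bin.
    snr:      linear signal-to-noise ratio per bin.

    A plain inverse 1/H boosts transducer noise wherever |H| is small;
    the Wiener form conj(H) / (|H|^2 + 1/SNR) limits the gain
    automatically as the per-bin SNR drops.
    """
    transfer = np.asarray(transfer, dtype=complex)
    snr = np.asarray(snr, dtype=float)
    return np.conj(transfer) / (np.abs(transfer) ** 2 + 1.0 / snr)
```

At |H| = 0.01 and unit SNR the naive inverse would apply a gain of 100, while the Wiener gain stays near 0.01, suppressing the noise amplification the abstract describes.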
Abstract:
A method for selecting a device for audio processing may involve receiving a first wakeword confidence metric from a first device that includes at least a first microphone and receiving a second wakeword confidence metric from a second device that includes at least a second microphone. The first and second wakeword confidence metrics may correspond to a first local maximum of a first plurality of wakeword confidence values determined by the first device and a second local maximum of a second plurality of wakeword confidence values determined by the second device. The method may involve comparing the first wakeword confidence metric and the second wakeword confidence metric and selecting a device for subsequent audio processing based, at least in part, on a comparison of the first wakeword confidence metric and the second wakeword confidence metric.
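The selection logic can be sketched by reducing each device's series of wakeword confidence values to a peak metric and choosing the device with the higher metric. Using the overall series maximum as the "local maximum" metric is a simplification; the data shapes and names are illustrative.

```python
def select_device(confidence_series):
    """Pick the device for subsequent audio processing.

    confidence_series: dict mapping device id -> list of wakeword
    confidence values reported by that device. Each device's metric is
    the peak of its series (a simplified local maximum); the device
    with the highest metric wins the comparison.
    """
    def peak(device_id):
        return max(confidence_series[device_id])

    return max(confidence_series, key=peak)
```

For example, a device whose confidence peaks at 0.8 is selected over one peaking at 0.6, on the assumption that it heard the wakeword more clearly.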
Abstract:
An audio session management method for an audio environment having multiple audio devices may involve receiving, from a first device implementing a first application and by a device implementing an audio session manager, a first route initiation request to initiate a first route for a first audio session. The first route initiation request may indicate a first audio source and a first audio environment destination. The first audio environment destination may correspond to at least a first person in the audio environment, but in some instances will not indicate an audio device. The method may involve establishing a first route corresponding to the first route initiation request. Establishing the first route may involve determining a first location of at least the first person in the audio environment, determining at least one audio device for a first stage of the first audio session, and initiating or scheduling the first audio session.
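The key design point is that the request addresses a person, not a loudspeaker, and device selection is deferred to the audio session manager. A minimal sketch of that flow, with all names, fields, and helper callables assumed for illustration:

```python
from dataclasses import dataclass

@dataclass
class RouteInitiationRequest:
    """Hypothetical shape of a route initiation request: the destination
    names a person in the audio environment, not an audio device."""
    source: str        # e.g. an audio stream identifier
    destination: str   # a person in the audio environment

def establish_route(request, locate_person, pick_device):
    """Sketch of route establishment by the audio session manager:
    locate the person, then choose an audio device for the first stage
    of the audio session."""
    location = locate_person(request.destination)
    device = pick_device(location)
    return {"source": request.source,
            "person": request.destination,
            "location": location,
            "device": device}
```

Because the device is resolved at establishment time from the person's current location, the same request can route to different loudspeakers as the person moves.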