Abstract:
An augmented reality environment allows interaction between virtual and real objects. Beamforming techniques are applied to signals acquired by an array of microphones to allow for simultaneous spatial tracking and signal acquisition from multiple users. Localization information, such as that from other sensors in the environment, may be used to select a particular set of beamformer coefficients and the resulting beampattern focused on a signal source. Alternatively, a series of beampatterns may be used iteratively to localize the signal source in a computationally efficient fashion. The beamformer coefficients may be pre-computed.
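The selection and iterative localization described in this abstract can be illustrated with a minimal narrowband sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: a uniform linear array, plain delay-and-sum coefficients, a single 1 kHz tone, and the hypothetical function names. It precomputes coefficient tables, picks a beam from an external direction hint, and localizes a source with a coarse pass followed by a fine pass.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def phase_step(spacing_m, freq_hz, angle_deg):
    """Phase difference between adjacent microphones for a far-field plane wave."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return k * spacing_m * math.sin(math.radians(angle_deg))


def steering_weights(num_mics, spacing_m, freq_hz, angle_deg):
    """Delay-and-sum coefficients for a uniform linear array (assumed geometry)."""
    step = phase_step(spacing_m, freq_hz, angle_deg)
    return [cmath.exp(-1j * step * m) / num_mics for m in range(num_mics)]


def precompute_table(num_mics, spacing_m, freq_hz, angles_deg):
    """Coefficient sets computed once, keyed by look direction in degrees."""
    return {a: steering_weights(num_mics, spacing_m, freq_hz, a) for a in angles_deg}


def beam_power(weights, snapshot):
    """Output power of one beam for one narrowband snapshot of the array."""
    output = sum(w * x for w, x in zip(weights, snapshot))
    return abs(output) ** 2


def select_by_hint(table, hint_deg):
    """Pick the precomputed beam closest to a direction reported by another sensor."""
    nearest = min(table, key=lambda a: abs(a - hint_deg))
    return nearest, table[nearest]


def localize(table, snapshot):
    """Return the look direction whose beam captures the most power."""
    return max(table, key=lambda a: beam_power(table[a], snapshot))


def localize_coarse_to_fine(coarse, fine, snapshot, window_deg=15):
    """Scan a few wide-spaced beams first, then only the fine beams near the winner."""
    best = localize(coarse, snapshot)
    nearby = {a: w for a, w in fine.items() if abs(a - best) <= window_deg}
    return localize(nearby, snapshot)


def simulate_snapshot(num_mics, spacing_m, freq_hz, source_deg):
    """Ideal noise-free plane wave arriving from source_deg (test signal only)."""
    step = phase_step(spacing_m, freq_hz, source_deg)
    return [cmath.exp(1j * step * m) for m in range(num_mics)]


NUM_MICS, SPACING_M, FREQ_HZ = 8, 0.05, 1000.0
coarse_table = precompute_table(NUM_MICS, SPACING_M, FREQ_HZ, range(-90, 91, 30))
fine_table = precompute_table(NUM_MICS, SPACING_M, FREQ_HZ, range(-90, 91, 5))
snapshot = simulate_snapshot(NUM_MICS, SPACING_M, FREQ_HZ, 35)
estimated_deg = localize_coarse_to_fine(coarse_table, fine_table, snapshot)
hinted_deg, hinted_weights = select_by_hint(fine_table, 33.0)
```

In this sketch the coarse pass evaluates 7 beams and the fine pass at most 7 more, instead of all 37 fine beams, which is one way to read the "computationally efficient" iterative search.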
Abstract:
A plurality of microphones of a communication device is grouped into multiple microphone groups, such that each microphone group includes two or more microphones. For each microphone group, output of the corresponding microphones is processed to form an acoustic null in a corresponding spatial direction, such that sound from the corresponding spatial direction is attenuated in the processed output. One of the microphone groups is selected based on various factors leading to maximal echo attenuation and rejection of reverberant components of the room. The selected microphone group is then used to detect sound from a near end talker of the communication device.
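As a rough illustration of the grouping and selection idea, the sketch below forms one acoustic null per two-microphone group and picks the group that leaves the least echo. The details are assumptions, not the abstract's method: a narrowband plane-wave model, a line of four microphones, a delay-and-subtract null, residual echo power as the only selection factor, and all names and numbers.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed


def plane_wave(positions_m, freq_hz, source_deg):
    """Ideal narrowband snapshot of a far-field source (test signal only)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    s = math.sin(math.radians(source_deg))
    return [cmath.exp(1j * k * p * s) for p in positions_m]


def null_output(snapshot, positions_m, group, freq_hz):
    """Delay-and-subtract output of a microphone pair with a null at group['null_deg']."""
    i, j = group["mics"]
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    s = math.sin(math.radians(group["null_deg"]))
    # Phase-align microphone j to microphone i for the null direction, then subtract.
    align = cmath.exp(1j * k * (positions_m[i] - positions_m[j]) * s)
    return snapshot[i] - snapshot[j] * align


def residual_power(snapshot, positions_m, group, freq_hz):
    return abs(null_output(snapshot, positions_m, group, freq_hz)) ** 2


def select_group(groups, echo_snapshot, positions_m, freq_hz):
    """Choose the group with the lowest residual echo power (assumed criterion)."""
    return min(groups, key=lambda g: residual_power(echo_snapshot, positions_m, g, freq_hz))


FREQ_HZ = 1000.0
positions = [0.0, 0.04, 0.08, 0.12]  # hypothetical microphone positions in metres
groups = [
    {"mics": (0, 1), "null_deg": -60},
    {"mics": (1, 2), "null_deg": 0},
    {"mics": (2, 3), "null_deg": 60},
]

echo = plane_wave(positions, FREQ_HZ, 60)   # loudspeaker echo assumed to arrive from 60 degrees
talker = plane_wave(positions, FREQ_HZ, 0)  # near-end talker assumed at broadside

chosen = select_group(groups, echo, positions, FREQ_HZ)
echo_residual = residual_power(echo, positions, chosen, FREQ_HZ)
talker_power = residual_power(talker, positions, chosen, FREQ_HZ)
```

Here the chosen group cancels the echo almost completely while the talker's sound still passes through, attenuated but present. A practical selector would also weigh reverberation, which this sketch does not model.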
Abstract:
An automatic speech recognition engine receives an acoustic-echo processed signal from an acoustic-echo processing (AEP) module, where said echo processed signal contains mainly the speech from the near-end talker. The automatic speech recognition engine analyzes the content of the acoustic-echo processed signal to determine whether words or keywords are present. Based upon the results of this analysis, the automatic speech recognition engine produces a value reflecting the likelihood that some words or keywords are detected. Said value is provided to the AEP module. Based upon the value, the AEP module determines if there is double talk and processes the incoming signals accordingly to enhance its performance.
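The feedback loop above can be sketched as a likelihood-gated adaptive filter. This is one possible reading, under assumptions not stated in the abstract: the AEP module is modelled as an NLMS echo canceller, double talk is declared when the recognizer's likelihood crosses a fixed threshold, and adaptation is simply frozen while it is declared. The function names, threshold, and step size are hypothetical.

```python
import random


def step_size_from_asr(likelihood, threshold=0.5, base_step=0.5):
    """Freeze adaptation when the recognizer reports likely near-end words."""
    return 0.0 if likelihood >= threshold else base_step


def nlms_step(weights, ref_window, mic_sample, step_size, eps=1e-8):
    """One normalized-LMS update; returns the new weights and the echo-processed sample."""
    echo_estimate = sum(w * r for w, r in zip(weights, ref_window))
    error = mic_sample - echo_estimate
    norm = sum(r * r for r in ref_window) + eps
    new_weights = [w + step_size * error * r / norm for w, r in zip(weights, ref_window)]
    return new_weights, error


def run_canceller(reference, mic, likelihoods, num_taps=2):
    """Process a signal sample by sample, gating adaptation on the ASR likelihood."""
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent reference sample first
    errors = []
    for ref_sample, mic_sample, likelihood in zip(reference, mic, likelihoods):
        window = [ref_sample] + window[:-1]
        mu = step_size_from_asr(likelihood)
        weights, error = nlms_step(weights, window, mic_sample, mu)
        errors.append(error)
    return weights, errors


def demo(likelihood, num_samples=500):
    """Echo-only toy signal through a hypothetical two-tap echo path [0.5, -0.2]."""
    rng = random.Random(0)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    mic = [
        0.5 * reference[n] - 0.2 * (reference[n - 1] if n > 0 else 0.0)
        for n in range(num_samples)
    ]
    weights, _ = run_canceller(reference, mic, [likelihood] * num_samples)
    return weights


adapted_weights = demo(likelihood=0.0)  # no words detected: filter adapts
frozen_weights = demo(likelihood=0.9)   # words detected: adaptation frozen
```

With a low likelihood the filter converges toward the toy echo path; with a high likelihood the weights stay at their initial values, which protects the filter from diverging on near-end speech.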
Abstract:
Techniques for enhancing an acoustic echo canceller based on visual cues are described herein. The techniques include changing adaptation of a filter of the acoustic echo canceller, calibrating the filter, or reducing background noise from an audio signal processed by the acoustic echo canceller. The changing, calibrating, and reducing are responsive to visual cues that describe acoustic characteristics of a location of a device that includes the acoustic echo canceller. Such visual cues may indicate that no human being is present at the location, that some subject(s) are engaged in speaking or sound generating activities, or that motion associated with an echo path change has occurred at the location.
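One way to picture the cue-driven control is a small policy that maps visual cues to canceller settings. The mapping below is purely hypothetical: the abstract names the cues and the three kinds of response, but not which cue triggers which response, so the priorities, field names, and numeric step sizes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualCues:
    person_present: bool    # any human visible at the location
    someone_speaking: bool  # a subject appears to be speaking or making sound
    echo_path_motion: bool  # motion likely to change the echo path


@dataclass(frozen=True)
class AecControl:
    step_size: float        # adaptation rate of the canceller's filter
    recalibrate: bool       # run a filter calibration now
    suppress_noise: bool    # apply background-noise reduction


def control_from_cues(cues, base_step=0.2):
    """Hypothetical policy; the cue-to-action mapping is assumed, not from the source."""
    if not cues.person_present:
        # Empty room: no near-end speech to protect, so calibrate and treat input as noise.
        return AecControl(step_size=base_step, recalibrate=True, suppress_noise=True)
    if cues.someone_speaking:
        # Likely near-end sound: freeze adaptation so the filter does not diverge.
        return AecControl(step_size=0.0, recalibrate=False, suppress_noise=False)
    if cues.echo_path_motion:
        # Echo path probably changed: adapt faster to re-converge.
        return AecControl(step_size=min(1.0, 4 * base_step), recalibrate=False, suppress_noise=False)
    return AecControl(step_size=base_step, recalibrate=False, suppress_noise=False)


empty_room = control_from_cues(VisualCues(False, False, False))
talking = control_from_cues(VisualCues(True, True, False))
moved = control_from_cues(VisualCues(True, False, True))
```

The ordering of the checks is a design choice: in this sketch, detected speech takes priority over detected motion, on the assumption that protecting the filter during near-end speech matters more than re-converging quickly.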
Abstract:
Techniques for utilizing blind source separation as a front-end to an acoustic echo canceller are described herein. The techniques include removing a first portion of an acoustic echo from an audio signal using blind source separation and a reference signal. The techniques then further remove a second portion of the acoustic echo using an acoustic echo canceller and the reference signal. Further, output of the blind source separation may be used to improve double-talk detection.