Reduced reference canceller
    Granted invention patent

    Publication No.: US11107488B1

    Publication date: 2021-08-31

    Application No.: US16662696

    Filing date: 2019-10-24

    Abstract: A system configured to perform echo cancellation using a reduced number of reference signals. The system may perform multi-channel acoustic echo cancellation (MCAEC) processing on a first portion of a microphone audio signal that corresponds to early reflections and may perform single-channel acoustic echo cancellation (AEC) processing on a second portion of the microphone audio signal that corresponds to late reverberations. For example, the system may use MCAEC processing on a plurality of reference audio signals to generate a first echo estimate signal and may subtract the first echo estimate signal from the microphone audio signal to generate a residual audio signal. The system may delay the first echo estimate signal, perform the AEC processing on the delayed signal to generate a second echo estimate signal, and subtract the second echo estimate signal from the residual audio signal to generate an output audio signal. This reduces the overall complexity of performing echo cancellation.
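
    The signal flow in this abstract can be sketched as follows. This is only an illustration, not the patented implementation: the filters are fixed FIR taps rather than adaptive filters, and the names `fir`, `delay`, `reduced_reference_cancel`, `early_taps`, `late_taps`, and `d` are hypothetical.

```python
def fir(signal, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * signal[n - k]."""
    return [
        sum(t * signal[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]


def delay(signal, d):
    """Delay a signal by d samples, keeping the original length."""
    return ([0.0] * d + list(signal))[: len(signal)]


def reduced_reference_cancel(mic, refs, early_taps, late_taps, d):
    """Two-stage cancellation (assumed fixed filters, for illustration only).

    Stage 1 uses one short filter per reference (multi-channel stage).
    Stage 2 filters the delayed stage-1 echo estimate with a single filter.
    """
    # Stage 1: sum the per-reference echo estimates into one signal.
    per_ref = [fir(r, t) for r, t in zip(refs, early_taps)]
    echo1 = [sum(vals) for vals in zip(*per_ref)]
    residual = [m - e for m, e in zip(mic, echo1)]

    # Stage 2: the delayed first estimate acts as the single reference.
    echo2 = fir(delay(echo1, d), late_taps)
    output = [r - e for r, e in zip(residual, echo2)]
    return output, residual
```

    In this sketch the second stage needs one filter regardless of how many reference signals there are, which is where the complexity saving would come from.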

    Volume adjustment for listening environment

    Publication No.: US10147439B1

    Publication date: 2018-12-04

    Application No.: US15474197

    Filing date: 2017-03-30

    Abstract: A speech-capturing device that can modulate its output audio data volume based on environmental sound conditions at the location of a user speaking to the device. The device detects the sound pressure of a spoken utterance at the device location and determines the distance of the user from the device. The device also detects the sound pressure of noise at the device and uses information about the location of the noise source and user to determine the sound pressure of noise at the location of the talker. The device can then adjust the gain for output audio (such as a spoken response to the utterance) to ensure that the output audio is at a certain desired sound pressure when it reaches the location of the user.
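
    A rough numerical sketch of this idea follows. It assumes free-field propagation, where sound pressure level falls by 20·log10 of the distance ratio; the abstract does not state which propagation model the patent uses. The function names and the `margin_db` parameter are hypothetical.

```python
import math


def spl_at_distance(spl_ref_db, d_ref_m, d_m):
    """Sound pressure level at distance d_m, given the level at d_ref_m.

    Assumes the free-field inverse distance law (an assumption of this sketch).
    """
    return spl_ref_db - 20.0 * math.log10(d_m / d_ref_m)


def required_output_level(noise_spl_at_device, d_noise_to_device,
                          d_noise_to_user, d_user_to_device,
                          margin_db=10.0, d_ref_m=1.0):
    """Output level (dB SPL at d_ref_m from the loudspeaker) needed so that
    playback reaches the user margin_db above the noise at the user's position.
    """
    # Translate the noise level measured at the device to the user's position.
    noise_at_user = spl_at_distance(
        noise_spl_at_device, d_noise_to_device, d_noise_to_user
    )
    target_at_user = noise_at_user + margin_db
    # Compensate for the loss between the loudspeaker and the user.
    return target_at_user + 20.0 * math.log10(d_user_to_device / d_ref_m)
```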

    Sound source localization using acoustic wave decomposition

    Publication No.: US12101599B1

    Publication date: 2024-09-24

    Application No.: US17952806

    Filing date: 2022-09-26

    Inventor: Mohamed Mansour

    CPC classification number: H04R1/406 H04R3/005

    Abstract: Disclosed are improved techniques for performing sound source localization (SSL) to determine a direction of arrival of an audible sound using a combination of timing information and amplitude information. For example, a device may decompose an observed sound field into directional components, then estimate a time-delay likelihood value and an energy-based likelihood value for each of the directional components. Using a combination of these likelihood values, the device can determine the direction of arrival corresponding to a maximum likelihood value. In some examples, the device may perform Acoustic Wave Decomposition (AWD) processing to determine the directional components. To reduce the processing cost of AWD, the device splits this process into two phases: a search phase that selects a subset of a device dictionary to reduce complexity, and a decomposition phase that solves an optimization problem using the subset of the device dictionary.
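
    The final selection step might look like the sketch below. The abstract says only that the two likelihood values are combined; multiplying them is an assumption made here for illustration, and the dictionary keys are hypothetical.

```python
def estimate_direction(components):
    """Return the azimuth of the directional component with the largest
    combined likelihood.

    Each component is a dict with hypothetical keys 'azimuth_deg',
    'delay_likelihood', and 'energy_likelihood'. The product rule used to
    combine the two likelihoods is an assumption of this sketch.
    """
    best = max(
        components,
        key=lambda c: c["delay_likelihood"] * c["energy_likelihood"],
    )
    return best["azimuth_deg"]
```

    With a product rule, a component that is strong in energy but arrives late (for example, a reflection) scores lower than a slightly weaker component that arrives first.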

    Multi-stage solver for acoustic wave decomposition

    Publication No.: US11785409B1

    Publication date: 2023-10-10

    Application No.: US17529560

    Filing date: 2021-11-18

    Inventor: Mohamed Mansour

    CPC classification number: H04S7/302 G10L19/008 H04R3/005

    Abstract: Disclosed is an improved method for performing Acoustic Wave Decomposition (AWD) processing that reduces complexity and processing cost. The improved method enables a device to perform AWD processing to decompose an observed sound field into directional components, enabling the device to perform additional processing such as sound source separation, dereverberation, sound source localization, sound field reconstruction, and/or the like. The improved method splits the solution into two phases: a search phase that selects a subset of a device dictionary to reduce complexity, and a decomposition phase that solves an optimization problem using the subset of the device dictionary.
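
    A toy version of the two phases is sketched below. It is not the patented solver: it uses real-valued vectors, ranks dictionary entries by normalized correlation in the search phase, and solves an ordinary least-squares problem in the decomposition phase. All of these choices and all function names are assumptions for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_phase(dictionary, observation, k):
    """Keep the indices of the k dictionary entries most correlated
    with the observation (assumed selection rule)."""
    scores = [
        abs(dot(atom, observation)) / (dot(atom, atom) ** 0.5)
        for atom in dictionary
    ]
    ranked = sorted(range(len(dictionary)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the matrix is non-singular."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def decomposition_phase(dictionary, observation, subset):
    """Least-squares weights for the selected entries (normal equations)."""
    atoms = [dictionary[i] for i in subset]
    gram = [[dot(p, q) for q in atoms] for p in atoms]
    rhs = [dot(p, observation) for p in atoms]
    return dict(zip(subset, solve_linear(gram, rhs)))
```

    The optimization in the second phase only involves `k` unknowns instead of one per dictionary entry, so its cost depends on the subset size rather than the full dictionary size.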

    Filtering early reflections
    Granted invention patent

    Publication No.: US11483644B1

    Publication date: 2022-10-25

    Application No.: US17222275

    Filing date: 2021-04-05

    Inventor: Mohamed Mansour

    Abstract: A system that performs early reflections filtering to suppress early reflections and improve sound source localization (SSL). During music playback and/or when a device is placed in a corner, acoustic reflections from nearby surfaces get boosted due to constructive interference, negatively impacting SSL and other processing of the device. To suppress these early reflections, the device uses an Early Reflections Filter (ERF) that makes use of Linear Prediction Coding (LPC), which is already being performed during speech processing. For example, the device generates raw audio signals using multi-channel LPC coefficients and then uses single-channel LPC coefficients for each raw audio signal in order to generate a filter that estimates the reflections. The device then uses this filter to suppress the early reflections and generate filtered audio signals, thus resulting in better audio processing and better overall device performance.

    Residual echo suppression for keyword detection

    Publication No.: US11380312B1

    Publication date: 2022-07-05

    Application No.: US16447550

    Filing date: 2019-06-20

    Inventor: Mohamed Mansour

    Abstract: A system configured to improve wakeword detection. The system may selectively rectify (e.g., attenuate) a portion of an audio signal based on energy statistics corresponding to a keyword (e.g., wakeword). For example, a device may perform echo cancellation to generate isolated audio data, may use the energy statistics to calculate signal quality metric values for a plurality of frequency bands of the isolated audio data, and may select a fixed number of frequency bands (e.g., 5-10%) associated with lowest signal quality metric values. To detect a specific keyword, the system determines a threshold λ(f) corresponding to an expected energy value at each frequency band. During runtime, the device determines signal quality metric values by subtracting residual music from the expected energy values. Thus, the device attenuates only a portion of the total number of frequency bands that include more energy than expected based on the energy statistics of the wakeword.
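
    The band-selection step can be sketched as below. The metric follows the abstract (expected energy minus residual energy per band); the fixed attenuation factor, the default `fraction`, and all names are hypothetical, since the abstract does not say how much the selected bands are attenuated.

```python
def attenuate_worst_bands(band_energy, residual_energy, expected_energy,
                          fraction=0.1, attenuation=0.5):
    """Attenuate the fixed fraction of bands with the lowest quality metric.

    metric[f] = expected_energy[f] - residual_energy[f], so bands that hold
    more residual energy than the keyword is expected to have score lowest.
    Returns the adjusted band energies and the sorted selected band indices.
    """
    metric = [lam - res for lam, res in zip(expected_energy, residual_energy)]
    count = max(1, round(len(metric) * fraction))
    worst = sorted(range(len(metric)), key=lambda i: metric[i])[:count]
    adjusted = list(band_energy)
    for i in worst:
        adjusted[i] *= attenuation  # assumed fixed gain, for illustration
    return adjusted, sorted(worst)
```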

    Audio watermark encoding/decoding

    Publication No.: US10978081B2

    Publication date: 2021-04-13

    Application No.: US16141578

    Filing date: 2018-09-25

    Abstract: A system may embed audio watermarks in audio data using a sign sequence. The system may detect audio watermarks in audio data despite the effects of reverberation. For example, the system may embed multiple repetitions of an audio watermark before generating output audio using loudspeaker(s). To detect the audio watermark in audio data generated by a microphone, the system may perform a self-correlation that indicates where the audio watermark is repeated. In some examples, the system may encode the audio watermark using multiple repetitions of a multi-segment Eigenvector. Additionally or alternatively, the system may encode the audio watermark using a binary sequence of positive and negative values, which may be used as a shared key for encoding/decoding the audio watermark. The audio watermark can be embedded in output audio data to enable wakeword suppression (e.g., avoid cross-talk between devices) and/or local signal transmission between devices in proximity to each other.
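
    A minimal sketch of sign-sequence embedding and self-correlation detection follows. It is an illustration under simplifying assumptions (time-domain samples, additive embedding, no loudspeaker or room response), and the names `embed`, `self_correlation_score`, and `strength` are hypothetical.

```python
def embed(host, segment, signs, strength=1.0):
    """Add one copy of `segment` per entry of `signs` to the host audio,
    flipping the polarity of each copy according to the sign sequence.
    The host must hold at least len(signs) * len(segment) samples."""
    out = list(host)
    n = len(segment)
    for k, s in enumerate(signs):
        for i, w in enumerate(segment):
            out[k * n + i] += strength * s * w
    return out


def self_correlation_score(audio, seg_len, signs):
    """Correlate each segment with the next one and weight the result by the
    expected sign product. The detector needs the sign sequence (the shared
    key) but not the watermark segment itself."""
    segs = [audio[k * seg_len:(k + 1) * seg_len] for k in range(len(signs))]
    score = 0.0
    for k in range(len(signs) - 1):
        corr = sum(a * b for a, b in zip(segs[k], segs[k + 1]))
        score += signs[k] * signs[k + 1] * corr
    return score
```

    Because the detector compares the received audio with itself, any distortion that affects every repetition in the same way leaves the sign pattern intact, which is consistent with the abstract's claim of detection despite reverberation. A high score with the correct key and a low score with a wrong key indicates the watermark is present.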

    SOUND SOURCE LOCALIZATION USING ACOUSTIC WAVE DECOMPOSITION

    Publication No.: US20250016499A1

    Publication date: 2025-01-09

    Application No.: US18889896

    Filing date: 2024-09-19

    Inventor: Mohamed Mansour

    Abstract: Disclosed are improved techniques for performing sound source localization (SSL) to determine a direction of arrival of an audible sound using a combination of timing information and amplitude information. For example, a device may decompose an observed sound field into directional components, then estimate a time-delay likelihood value and an energy-based likelihood value for each of the directional components. Using a combination of these likelihood values, the device can determine the direction of arrival corresponding to a maximum likelihood value. In some examples, the device may perform Acoustic Wave Decomposition (AWD) processing to determine the directional components. To reduce the processing cost of AWD, the device splits this process into two phases: a search phase that selects a subset of a device dictionary to reduce complexity, and a decomposition phase that solves an optimization problem using the subset of the device dictionary.

    User localization
    Granted invention patent

    Publication No.: US12143789B1

    Publication date: 2024-11-12

    Application No.: US17825613

    Filing date: 2022-05-26

    Abstract: A system configured to improve user localization used to determine a listening position and/or user orientation for a device map. Multiple devices may generate audio data representing user speech and the system may use the audio data to determine a first spatial likelihood function (SLF) based on angle measurements, determine a second SLF based on timing information, and determine a location of the user based on a combination of the two SLFs. The SLFs represent the environment using a grid comprising a plurality of grid cells, and each grid cell has a value indicating a likelihood that the grid cell corresponds to the location of the user. An individual device may generate a portion of the angle measurements based on multi-channel audio data generated using multiple microphones of the device, while the system may generate the timing information based on single-channel audio data received from each of the multiple devices.
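
    Combining the two spatial likelihood functions on a shared grid could be sketched as follows. The abstract does not specify the combination rule; the weighted sum and the `weight` parameter here are assumptions, and the function name is hypothetical.

```python
def locate_user(angle_slf, timing_slf, weight=0.5):
    """Return the (row, col) grid cell with the highest combined likelihood.

    Both inputs are 2-D lists of the same shape, one value per grid cell.
    The weighted sum is an assumed combination rule for illustration.
    """
    best_cell, best_value = None, float("-inf")
    for r, (row_a, row_t) in enumerate(zip(angle_slf, timing_slf)):
        for c, (a, t) in enumerate(zip(row_a, row_t)):
            value = weight * a + (1.0 - weight) * t
            if value > best_value:
                best_cell, best_value = (r, c), value
    return best_cell
```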
