Surface augmented ray-based acoustic modeling

    Publication No.: US11830471B1

    Publication date: 2023-11-28

    Application No.: US17007681

    Filing date: 2020-08-31

    Abstract: Disclosed are techniques for performing ray-based acoustic modeling that models scattering of acoustic waves by a surface of a device. The acoustic modeling uses two parameters, a room response representing acoustics and geometry of a room and a device response representing acoustics and geometry of the device. The room response is determined using ray-based acoustic modeling, such as ray tracing. The device response can be measured in an actual environment or simulated and represents an acoustic response of the device to individual acoustic plane waves. The device applies a superposition of the room response and the plane wave scattering from the device response to determine acoustic pressure values and generate microphone audio data. The device can estimate room impulse response (RIR) data using data from the microphones, and can use the RIR data to perform audio processing such as sound equalization, acoustic echo cancellation, audio beamforming, and/or the like.
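
The superposition described in this abstract can be illustrated for a single frequency. The following is a minimal sketch, not the patented method: it assumes the room response is a list of rays given as (amplitude, delay, arrival angle), and it uses a free-field phase term as a hypothetical stand-in for the device response. A measured or simulated device response would also capture scattering by the device surface. All names here are illustrative.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def free_field_response(mic_xy, angle_rad, freq_hz):
    """Hypothetical device response: phase of a unit plane wave at one microphone.

    This ignores scattering; a real device response would replace it.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    projection = mic_xy[0] * math.cos(angle_rad) + mic_xy[1] * math.sin(angle_rad)
    return cmath.exp(1j * k * projection)


def mic_pressure(rays, mic_xy, freq_hz, device_response=free_field_response):
    """Sum the per-ray room terms, each weighted by the device response."""
    omega = 2.0 * math.pi * freq_hz
    pressure = 0j
    for amplitude, delay_s, angle_rad in rays:
        # Room term: attenuation and propagation delay of this ray.
        room_term = amplitude * cmath.exp(-1j * omega * delay_s)
        # Device term: response of the microphone to a plane wave from this angle.
        pressure += room_term * device_response(mic_xy, angle_rad, freq_hz)
    return pressure
```

Passing a different `device_response` callable is how a measured or simulated response could be substituted in this sketch.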

    Cascade echo cancellation for asymmetric references

    Publication No.: US11222647B2

    Publication date: 2022-01-11

    Application No.: US16934668

    Filing date: 2020-07-21

    Abstract: A system configured to perform cascade echo cancellation processing to improve performance when reference signals are asymmetric (e.g., dominant reference signal(s) overshadow weak reference signal(s)). The system may perform cascade echo cancellation processing to separately adapt filter coefficients between the dominant reference signal(s) and the weak reference signal(s). For example, the system may use a dominant reference signal to process a microphone audio signal and generate a residual audio signal, using the residual audio signal to adapt first filter coefficient values corresponding to the dominant reference signal. Separately, the system may use a weak reference signal to process the residual audio signal and generate an output audio signal, using the output audio signal to adapt second filter coefficient values corresponding to the weak reference signal.
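
The two-stage structure in this abstract can be sketched with two adaptive filters in series. The abstract does not name an adaptation rule, so this sketch assumes normalized LMS (NLMS); the tap count, step size, and the `demo_signals` helper are illustrative choices, not taken from the patent.

```python
import random


def nlms_stage(desired, reference, num_taps=4, step=0.2, eps=1e-8):
    """One echo-cancellation stage: the stage's own error adapts its weights."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps
    errors = []
    for d, x in zip(desired, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        err = d - estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + step * err * h / norm for w, h in zip(weights, history)]
        errors.append(err)
    return errors, weights


def cascade_echo_cancel(mic, dominant_ref, weak_ref, num_taps=4, step=0.2):
    """Stage 1 removes the dominant echo; stage 2 works on the residual."""
    residual, w_dominant = nlms_stage(mic, dominant_ref, num_taps, step)
    output, w_weak = nlms_stage(residual, weak_ref, num_taps, step)
    return output, w_dominant, w_weak


def demo_signals(num_samples, seed=0):
    """Synthetic test case: a loud two-tap echo plus a much weaker echo."""
    rng = random.Random(seed)
    dom = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    weak = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    mic = [0.8 * dom[i] + (0.3 * dom[i - 1] if i else 0.0) + 0.05 * weak[i]
           for i in range(num_samples)]
    return mic, dom, weak
```

Because stage 1 never depends on stage 2, running the stages one after the other over a whole buffer gives the same result as cascading them sample by sample.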

    Modeling room acoustics using acoustic waves

    Publication No.: US10582299B1

    Publication date: 2020-03-03

    Application No.: US16216599

    Filing date: 2018-12-11

    Abstract: Techniques for simulating a microphone array and generating synthetic audio data to analyze the microphone array geometry. This reduces the development cost of new microphone arrays by enabling an evaluation of performance metrics (False Rejection Rate (FRR), Word Error Rate (WER), etc.) without building device hardware or collecting data. To generate the synthetic audio data, the system performs acoustic modeling to determine a room impulse response associated with a prototype device (e.g., potential microphone array) in a room. The acoustic modeling is based on two parameters—a device response (information about acoustics and geometry of the prototype device) and a room response (information about acoustics and geometry of the room). The device response can be simulated based on the microphone array geometry, and the room response can be determined using a specialized microphone and a plane wave decomposition algorithm.
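
Once a room impulse response is available, synthetic microphone audio is commonly produced by convolving clean audio with it. The abstract does not spell out this step, so the following is only a generic sketch of that convolution with a hypothetical function name; it handles one microphone channel.

```python
def apply_room_impulse_response(clean_audio, rir):
    """Direct-form convolution of clean audio with a room impulse response."""
    output = [0.0] * (len(clean_audio) + len(rir) - 1)
    for n, sample in enumerate(clean_audio):
        for k, tap in enumerate(rir):
            # Each RIR tap adds a delayed, scaled copy of the input sample.
            output[n + k] += sample * tap
    return output
```

For a microphone array, the same function would be called once per microphone with that microphone's impulse response.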

    Sound source localization using wave decomposition

    Publication No.: US11425495B1

    Publication date: 2022-08-23

    Application No.: US17234233

    Filing date: 2021-04-19

    Inventor: Mohamed Mansour

    Abstract: A system that performs sound source localization (SSL) using acoustic wave decomposition (AWD) or an approximation. When a device detects a wakeword represented in audio data, the device performs SSL processing in order to determine a position of the user relative to the device (e.g., estimate angle of the user). The device calculates noise statistics based on first audio data representing the wakeword and second audio data preceding the wakeword. Thus, upon detecting the wakeword, the device calculates the noise statistics and a signal quality metric corresponding to the wakeword. In addition, the device uses Multi-Channel Linear Prediction Coding (MCLPC) coefficients to average out the room impulse response. Using the noise statistics, the MCLPC coefficients, and the audio data, the device performs AWD processing to decompose the sound field into disjoint acoustic plane waves, enabling the device to identify the most likely direction for the line-of-sight component of speech.
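
As a rough illustration of picking the most likely plane-wave direction, the sketch below scans candidate angles and keeps the one whose expected microphone response best matches the observation at one frequency. This is a heavily simplified stand-in for AWD: it assumes a single plane wave, a free-field array in a plane, and no noise statistics or MCLPC processing. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(mics, angle_deg, freq_hz):
    """Expected per-microphone phase for a plane wave from angle_deg."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    theta = math.radians(angle_deg)
    return [cmath.exp(1j * k * (x * math.cos(theta) + y * math.sin(theta)))
            for x, y in mics]


def estimate_direction(observation, mics, freq_hz, step_deg=1):
    """Return the candidate angle (degrees) with the largest matched-filter score."""
    best_angle, best_score = None, -1.0
    for angle in range(0, 360, step_deg):
        candidate = steering_vector(mics, angle, freq_hz)
        score = abs(sum(c.conjugate() * o for c, o in zip(candidate, observation)))
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

A non-collinear array is needed for an unambiguous answer; with microphones on one line, mirrored directions produce identical scores.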

    Modeling room acoustics using acoustic waves

    Publication No.: US10986444B2

    Publication date: 2021-04-20

    Application No.: US16798706

    Filing date: 2020-02-24

    Abstract: Techniques for simulating a microphone array and generating synthetic audio data to analyze the microphone array geometry. This reduces the development cost of new microphone arrays by enabling an evaluation of performance metrics (False Rejection Rate (FRR), Word Error Rate (WER), etc.) without building device hardware or collecting data. To generate the synthetic audio data, the system performs acoustic modeling to determine a room impulse response associated with a prototype device (e.g., potential microphone array) in a room. The acoustic modeling is based on two parameters—a device response (information about acoustics and geometry of the prototype device) and a room response (information about acoustics and geometry of the room). The device response can be simulated based on the microphone array geometry, and the room response can be determined using a specialized microphone and a plane wave decomposition algorithm.

    Aligned beam merger
    Granted patent

    Publication No.: US10887709B1

    Publication date: 2021-01-05

    Application No.: US16582820

    Filing date: 2019-09-25

    Abstract: A system configured to perform aligned beam merger (ABM) processing to combine multiple beamformed signals. The system may capture audio data and perform beamforming to generate beamformed audio signals corresponding to a plurality of directions. The system may apply an ABM algorithm to select a number of the beamformed audio signals, align the selected audio signals, and merge the selected audio signals to generate a distortionless output audio signal. The system may scale the selected audio signals based on relative magnitude and apply a complex correction factor to compensate for a phase error for each of the selected audio signals.
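
The select, align, and merge steps can be sketched for one time-frequency bin. This is a simplified illustration under stated assumptions, not the patented ABM algorithm: beams are selected by magnitude, the strongest beam serves as the phase reference, and the weights are the relative magnitudes normalized to sum to one.

```python
import cmath


def aligned_beam_merge(beams, num_selected=2):
    """Merge the strongest beams after rotating them onto a common phase.

    beams: one complex value per beam direction for a single bin.
    """
    selected = sorted(beams, key=abs, reverse=True)[:num_selected]
    if not selected or abs(selected[0]) == 0.0:
        return 0j
    reference = selected[0]
    total_magnitude = sum(abs(b) for b in selected)
    merged = 0j
    for beam in selected:
        weight = abs(beam) / total_magnitude  # scale by relative magnitude
        # Complex correction factor: rotate this beam to the reference phase.
        correction = cmath.exp(1j * (cmath.phase(reference) - cmath.phase(beam)))
        merged += weight * correction * beam
    return merged
```

Because the weights sum to one and the phases are aligned before summing, identical beams pass through unchanged, which is the sense in which this sketch avoids distortion.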

    Cascade echo cancellation for asymmetric references

    Publication No.: US10811029B1

    Publication date: 2020-10-20

    Application No.: US16669980

    Filing date: 2019-10-31

    Abstract: A system configured to perform cascade echo cancellation processing to improve performance when reference signals are asymmetric (e.g., dominant reference signal(s) overshadow weak reference signal(s)). The system may perform cascade echo cancellation processing to separately adapt filter coefficients between the dominant reference signal(s) and the weak reference signal(s). For example, the system may use a dominant reference signal to process a microphone audio signal and generate a residual audio signal, using the residual audio signal to adapt first filter coefficient values corresponding to the dominant reference signal. Separately, the system may use a weak reference signal to process the residual audio signal and generate an output audio signal, using the output audio signal to adapt second filter coefficient values corresponding to the weak reference signal.

    Audio watermark encoding/decoding
    Patent application

    Publication No.: US20200098380A1

    Publication date: 2020-03-26

    Application No.: US16141578

    Filing date: 2018-09-25

    Abstract: A system may embed audio watermarks in audio data using a sign sequence. The system may detect audio watermarks in audio data despite the effects of reverberation. For example, the system may embed multiple repetitions of an audio watermark before generating output audio using loudspeaker(s). To detect the audio watermark in audio data generated by a microphone, the system may perform a self-correlation that indicates where the audio watermark is repeated. In some examples, the system may encode the audio watermark using multiple repetitions of a multi-segment Eigenvector. Additionally or alternatively, the system may encode the audio watermark using a binary sequence of positive and negative values, which may be used as a shared key for encoding/decoding the audio watermark. The audio watermark can be embedded in output audio data to enable wakeword suppression (e.g., avoid cross-talk between devices) and/or local signal transmission between devices in proximity to each other.
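
The sign-sequence idea can be sketched as follows: one base segment is repeated, each repetition multiplied by +1 or -1 from a shared key, and the detector correlates the repetitions with each other rather than with a stored template. This is an illustrative sketch only. The random base segment, the gain values, and the assumption that the detector already knows where the watermark starts are simplifications not taken from the patent.

```python
import random


def make_base_segment(length, seed=0):
    """Hypothetical base segment: a pseudo-random sequence of +1/-1 samples."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_watermark(audio, base, signs, gain=0.1):
    """Add sign-flipped repetitions of the base segment to the audio.

    audio must hold at least len(signs) * len(base) samples.
    """
    marked = list(audio)
    for rep, sign in enumerate(signs):
        start = rep * len(base)
        for i, value in enumerate(base):
            marked[start + i] += gain * sign * value
    return marked


def watermark_score(audio, seg_len, signs):
    """Sign-corrected self-correlation between repetitions, normalized by energy."""
    segments = [audio[r * seg_len:(r + 1) * seg_len] for r in range(len(signs))]
    energy = sum(v * v for seg in segments for v in seg) / len(signs)
    if energy == 0.0:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(len(signs)):
        for j in range(i + 1, len(signs)):
            dot = sum(a * b for a, b in zip(segments[i], segments[j]))
            total += signs[i] * signs[j] * dot
            pairs += 1
    return total / (pairs * energy)
```

The detector needs only the sign sequence, not the base segment. A filter that affects every repetition in the same way, as reverberation roughly does, leaves the repetitions correlated with each other.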

    System to determine direction toward user

    Publication No.: US11714157B2

    Publication date: 2023-08-01

    Application No.: US17174941

    Filing date: 2021-02-12

    CPC classification number: G01S3/8003 G01S3/7864 H04R1/406 H04R3/005

    Abstract: A device has a microphone array that acquires sound data and a camera that acquires image data. A portion of the device may be moveable by one or more actuators. Responsive to the user, the portion of the device is moved toward an estimated direction of the user. The estimated direction is based on sensor data including the sound data and the image data. First variance values for individual sound direction values are calculated. Data derived from the image data or data from other sensors may be used to modify the first variance values and determine second data comprising second variances. The second data may be processed to determine the estimated direction of the user. For example, the second data may be processed by both a forward and a backward Kalman filter, and the output combined to determine an estimated direction toward the user.
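
The forward and backward filtering step can be sketched with a scalar Kalman filter. The sketch assumes a random-walk model for the direction, per-measurement variances supplied by the caller, and a standard two-filter combination (forward filtered estimate with backward predicted estimate, weighted by inverse variance). The abstract does not specify these details, and angle wrap-around is ignored here.

```python
def kalman_pass(measurements, variances, process_var, init_mean=0.0, init_var=1e6):
    """Scalar random-walk Kalman filter.

    Returns (priors, posteriors); each is a list of (mean, variance) pairs.
    """
    mean, var = init_mean, init_var
    priors, posteriors = [], []
    for z, r in zip(measurements, variances):
        priors.append((mean, var))
        gain = var / (var + r)          # a large measurement variance gives a small gain
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        posteriors.append((mean, var))
        var = var + process_var         # predict step for the next sample
    return priors, posteriors


def forward_backward_estimate(measurements, variances, process_var=1.0):
    """Combine a forward pass with a time-reversed pass at every sample."""
    _, forward = kalman_pass(measurements, variances, process_var)
    backward, _ = kalman_pass(measurements[::-1], variances[::-1], process_var)
    backward.reverse()
    combined = []
    for (mf, vf), (mb, vb) in zip(forward, backward):
        weight = vb / (vf + vb)         # inverse-variance weighting
        combined.append((weight * mf + (1.0 - weight) * mb, vf * vb / (vf + vb)))
    return combined
```

Using the backward pass's predicted values rather than its updated values keeps each measurement from being counted twice in the combined estimate.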
