NOISE SUPPRESSION
    65.
    Patent application

    Publication No.: US20180122399A1

    Publication date: 2018-05-03

    Application No.: US15120130

    Filing date: 2015-03-02

    Abstract: A noise suppressor comprises a first (401) and a second transformer (403) for generating a first and second frequency domain signal from a frequency transform of a first and second microphone signal. A gain unit (405, 407, 409) determines time frequency tile gains in response to a difference measure for magnitude time frequency tile values of the first frequency domain signal and magnitude time frequency tile values of the second frequency domain signal. A scaler (411) generates a third frequency domain signal by scaling time frequency tile values of the first frequency domain signal by the time frequency tile gains; and the resulting signal is converted to the time domain by a third transformer (413). A designator (405, 407, 415) designates time frequency tiles of the first frequency domain signal as speech tiles or noise tiles; and the gain unit (409) determines the gains in response to the designation of the time frequency tiles as speech tiles or noise tiles.
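The gain step in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the designation rule (a tile counts as speech when the first magnitude exceeds the second), the over-subtraction factors `gamma_speech` and `gamma_noise`, and the gain `floor` are all assumptions chosen for illustration, and the frequency transforms are left out so the code works directly on magnitude tiles.

```python
def tile_gains(mag_first, mag_second, gamma_speech=1.0, gamma_noise=2.0, floor=0.1):
    """Return one gain per time-frequency tile.

    mag_first, mag_second: lists of frames, each a list of bin magnitudes.
    All parameter names and default values are hypothetical.
    """
    gains = []
    for frame_first, frame_second in zip(mag_first, mag_second):
        row = []
        for z, x in zip(frame_first, frame_second):
            # Assumed designation rule: speech tile if the first microphone
            # magnitude exceeds the second microphone magnitude.
            is_speech = z > x
            gamma = gamma_speech if is_speech else gamma_noise
            # Difference measure between the two magnitude tile values.
            diff = z - gamma * x
            # Normalise by the first magnitude and keep the gain above a floor.
            gain = max(floor, diff / z) if z > 0 else floor
            row.append(gain)
        gains.append(row)
    return gains


def apply_gains(tiles_first, gains):
    """Scale each tile of the first signal by its gain (the scaler step)."""
    return [[v * g for v, g in zip(frame, gain_row)]
            for frame, gain_row in zip(tiles_first, gains)]


if __name__ == "__main__":
    first = [[4.0, 1.0]]   # one frame, two frequency bins
    second = [[1.0, 2.0]]
    g = tile_gains(first, second)
    print(g, apply_gains(first, g))
```

Using a larger subtraction factor for noise tiles than for speech tiles is one simple way to make the gain depend on the designation; other mappings are equally plausible.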

    STATE-BASED ENDPOINT CONFERENCE INTERACTION
    67.
    Patent application

    Publication No.: US20180041639A1

    Publication date: 2018-02-08

    Application No.: US15667510

    Filing date: 2017-08-02

    Abstract: Systems and methods are described for modifying one of far-end signal playback and capture of local audio on an audio device. Frames of both a far-end audio stream and a near-end audio stream may be analyzed using a measure of voice activity, the analyzing producing voice data associated with each frame. Based on the voice data, a conference state may be determined, and one of playback of the far-end audio stream and capture of local audio on an audio device may be modified based on the determined conference state. By associating the likely intent with a predefined state, the device may further cull or remove unwanted or unlikely content from the device input and output. This may have a substantial advantage in allowing for full duplex operation in the case of more meaningful and continuing voice activity, particularly in the case where there are many connected endpoints.
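As a rough illustration of the state-based idea in the abstract above, the sketch below maps per-frame voice-activity flags for the far-end and near-end streams to a conference state, then looks up playback and capture gains for that state. The state names, the activity threshold, and the gain table are all hypothetical; the abstract does not specify them.

```python
def is_active(voice_flags, threshold=0.5):
    """True if the fraction of voiced frames exceeds the (assumed) threshold."""
    if not voice_flags:
        return False
    return sum(voice_flags) / len(voice_flags) > threshold


def conference_state(far_voice, near_voice, threshold=0.5):
    """Map per-frame voice data for both streams to a hypothetical state name."""
    far = is_active(far_voice, threshold)
    near = is_active(near_voice, threshold)
    if far and near:
        return "double_talk"
    if far:
        return "far_end_talking"
    if near:
        return "near_end_talking"
    return "idle"


# Hypothetical (playback_gain, capture_gain) per state. Both paths stay fully
# open during double talk, mirroring the full-duplex case in the abstract.
STATE_GAINS = {
    "double_talk": (1.0, 1.0),
    "far_end_talking": (1.0, 0.25),
    "near_end_talking": (0.25, 1.0),
    "idle": (0.5, 0.5),
}


def playback_capture_gains(state):
    """Return the gains to apply to far-end playback and local capture."""
    return STATE_GAINS[state]


if __name__ == "__main__":
    state = conference_state([1, 1, 0, 1], [0, 0, 0, 1])
    print(state, playback_capture_gains(state))
```

A real device would likely smooth the voice data over time and add hysteresis between states; this sketch classifies a single window of frames only.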

    PERMUTATION INVARIANT TRAINING FOR TALKER-INDEPENDENT MULTI-TALKER SPEECH SEPARATION

    Publication No.: US20170337924A1

    Publication date: 2017-11-23

    Application No.: US15226527

    Filing date: 2016-08-02

    Inventor: Dong Yu

    Abstract: The techniques described herein improve methods to equip a computing device to conduct automatic speech recognition (“ASR”) in talker-independent multi-talker scenarios. In some examples, permutation invariant training of deep learning models can be used for talker-independent multi-talker scenarios. In some examples, the techniques can determine a permutation-considered assignment between a model's estimate of a source signal and the source signal. In some examples, the techniques can include training the model generating the estimate to minimize a deviation of the permutation-considered assignment. These techniques can be implemented into a neural network's structure itself, solving the label permutation problem that prevented making progress on deep learning based techniques for speech separation. The techniques discussed herein can also include source tracing to trace streams originating from a same source through the frames of a mixed signal.
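The permutation-considered assignment described in the abstract above can be sketched as a loss that tries every pairing of model estimates with source signals and keeps the pairing with the smallest error. This is a minimal illustration in plain Python, assuming mean squared error as the deviation measure and an exhaustive search over permutations; the function names `mse` and `pit_loss` are hypothetical, and no neural network or training loop is shown.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, sources):
    """Return (loss, assignment) for the best estimate-to-source pairing.

    assignment[i] is the index of the source paired with estimates[i].
    The search is exhaustive, so it is only practical for a few talkers.
    """
    n = len(sources)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average error when estimate i is compared with source perm[i].
        loss = sum(mse(estimates[i], sources[perm[i]]) for i in range(n)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


if __name__ == "__main__":
    # The model emitted the two talkers in swapped order; the loss is still
    # zero because the swapped assignment is selected.
    est = [[0.0, 1.0], [1.0, 0.0]]
    src = [[1.0, 0.0], [0.0, 1.0]]
    print(pit_loss(est, src))
```

Because the minimum is taken over all output orderings, the order in which the model emits the talkers does not affect the loss, which is the label permutation issue the abstract refers to.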