MACHINE LEARNING ASSISTED SPATIAL NOISE ESTIMATION AND SUPPRESSION

    Publication No.: US20230410829A1

    Publication date: 2023-12-21

    Application No.: US18251876

    Filing date: 2021-11-04

    CPC classification number: G10L21/0232 G10L25/84 G10L25/30 G10L2021/02166

    Abstract: In an embodiment, a method comprises: receiving bands of power spectra of an input audio signal and a microphone covariance, and for each band: estimating, using a classifier, respective probabilities of speech and noise; estimating, using a directionality model, a set of means for speech and noise, or a set of means and covariances for speech and noise, based on the microphone covariance for the band and the probabilities; estimating, using a level model, a mean and covariance of noise power based on the probabilities and the power spectra; determining a first noise suppression gain based on the directionality model; determining a second noise suppression gain based on the level model; selecting the first or second noise suppression gain or their sum based on a signal-to-noise ratio of the input audio signal; and scaling a time-frequency representation of the input signal by the selected noise suppression gain.
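The final two steps of this abstract (selecting a gain based on signal-to-noise ratio, then scaling the time-frequency representation) can be illustrated with a minimal sketch. The abstract does not say which SNR range maps to which gain, so the mapping, the thresholds `low_snr_db` and `high_snr_db`, the use of decibel gains, and the function names below are all assumptions made for illustration, not the patented method.

```python
def select_gain_db(gain_dir_db, gain_level_db, snr_db,
                   low_snr_db=5.0, high_snr_db=15.0):
    """Choose per-band suppression gains (in dB) from two candidates.

    ASSUMPTION: the abstract only says the first gain, the second gain,
    or their sum is selected based on SNR. The rule used here (low SNR ->
    directionality gain, high SNR -> level gain, otherwise the sum) and
    both thresholds are hypothetical.
    """
    if len(gain_dir_db) != len(gain_level_db):
        raise ValueError("gain vectors must have one entry per band")
    if snr_db < low_snr_db:
        return list(gain_dir_db)
    if snr_db > high_snr_db:
        return list(gain_level_db)
    # Summing dB values corresponds to multiplying the linear gains.
    return [d + l for d, l in zip(gain_dir_db, gain_level_db)]


def scale_frame(tf_frame, gain_db):
    """Scale one time-frequency frame (one complex bin per band) by dB gains."""
    return [x * 10.0 ** (g / 20.0) for x, g in zip(tf_frame, gain_db)]


# Example: two bands, mid-range SNR, so the two gains are summed.
gains = select_gain_db([-6.0, -3.0], [-2.0, -1.0], snr_db=10.0)
suppressed = scale_frame([1 + 0j, 2 + 0j], gains)
```

Expressing the gains in decibels makes the "sum" option equivalent to applying both suppression stages in cascade, which is one plausible reading of the abstract.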

    CONFERENCE SEGMENTATION BASED ON CONVERSATIONAL DYNAMICS

    Publication No.: US20180336902A1

    Publication date: 2018-11-22

    Application No.: US15546109

    Filing date: 2016-02-03

    Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve analyzing conversational dynamics of the conference recording. Some examples may involve searching the conference recording to determine instances of segment classifications. The segment classifications may be based, at least in part, on conversational dynamics data. Some implementations may involve segmenting the conference recording into a plurality of segments, each of the segments corresponding with a time interval and at least one of the segment classifications. Some implementations allow a listener to scan through a conference recording quickly according to segments, words, topics and/or talkers of interest.

    SELECTIVE CONFERENCE DIGEST
    Document type: Patent application

    Publication No.: US20180191912A1

    Publication date: 2018-07-05

    Application No.: US15548265

    Filing date: 2016-02-03

    Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving audio data corresponding to a recording of at least one conference involving a plurality of conference participants. In some examples, only a portion of the received audio data will be selected as playback audio data. The selection process may involve a topic selection process, a talkspurt filtering process and/or an acoustic feature selection process. Some examples involve receiving an indication of a target playback time duration. Selecting the portion of audio data may involve making a time duration of the playback audio data within a threshold time difference of the target playback time duration.
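The duration constraint described in this abstract can be sketched as a simple greedy selection. The abstract does not specify how portions are ranked or chosen, so the `Talkspurt` type, its `score` field, and the greedy strategy below are hypothetical; they only illustrate keeping the total playback time within a threshold of a target duration.

```python
from dataclasses import dataclass


@dataclass
class Talkspurt:
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    score: float   # hypothetical relevance score (e.g. topic match)

    @property
    def duration(self) -> float:
        return self.end - self.start


def select_digest(talkspurts, target_s, threshold_s):
    """Greedily pick high-scoring talkspurts without exceeding target + threshold.

    Returns (chosen talkspurts in time order, total duration, whether the
    total lies within threshold_s of target_s). The greedy rule is an
    assumption; it does not guarantee that the target can be met.
    """
    chosen, total = [], 0.0
    for ts in sorted(talkspurts, key=lambda t: t.score, reverse=True):
        if total + ts.duration <= target_s + threshold_s:
            chosen.append(ts)
            total += ts.duration
    chosen.sort(key=lambda t: t.start)  # play back in original order
    return chosen, total, abs(total - target_s) <= threshold_s


spurts = [
    Talkspurt(0.0, 10.0, 0.9),
    Talkspurt(10.0, 40.0, 0.5),
    Talkspurt(40.0, 45.0, 0.8),
    Talkspurt(45.0, 60.0, 0.2),
]
digest, total_s, ok = select_digest(spurts, target_s=30.0, threshold_s=5.0)
```

A greedy pass is used here only because it is short to read; an exact fit to the target would call for a knapsack-style search instead.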

    ADAPTIVE AUDIO CONSTRUCTION
    Document type: Patent application

    Publication No.: US20180014139A1

    Publication date: 2018-01-11

    Application No.: US15547043

    Filing date: 2016-02-02

    Abstract: Described herein is a method for creating an object-based audio signal from an audio input, the audio input including one or more audio channels that are recorded to collectively define an audio scene. The one or more audio channels are captured from a respective one or more spatially separated microphones disposed in a stable spatial configuration. The method includes the steps of: a) receiving the audio input; b) performing spatial analysis on the one or more audio channels to identify one or more audio objects within the audio scene; c) determining contextual information relating to the one or more audio objects; d) defining respective audio streams including audio data relating to at least one of the identified one or more audio objects; and e) outputting an object-based audio signal including the audio streams and the contextual information.

    POST-CONFERENCE PLAYBACK SYSTEM HAVING HIGHER PERCEIVED QUALITY THAN ORIGINALLY HEARD IN THE CONFERENCE

    Publication No.: US20180006837A1

    Publication date: 2018-01-04

    Application No.: US15546925

    Filing date: 2016-02-03

    Abstract: Some aspects of the present disclosure involve the recording, processing and playback of audio data corresponding to conferences, such as teleconferences. In some teleconference implementations, the audio experience heard when a recording of the conference is played back may be substantially different from the audio experience of an individual conference participant during the original teleconference. In some implementations, the recorded audio data may include at least some audio data that was not available during the teleconference. In some examples, the spatial characteristics of the played-back audio data may be different from that of the audio heard by participants of the teleconference.

    METHOD OF RENDERING ONE OR MORE CAPTURED AUDIO SOUNDFIELDS TO A LISTENER

    Publication No.: US20220030370A1

    Publication date: 2022-01-27

    Application No.: US17397887

    Filing date: 2021-08-09

    Abstract: A computer implemented system for rendering captured audio soundfields to a listener comprises apparatus to deliver the audio soundfields to the listener. The delivery apparatus delivers the audio soundfields to the listener with first and second audio elements perceived by the listener as emanating from first and second virtual source locations, respectively, and with the first audio element and/or the second audio element delivered to the listener from a third virtual source location. The first virtual source location and the second virtual source location are perceived by the listener as being located to the front of the listener, and the third virtual source location is located to the rear or the side of the listener.

    METHOD OF RENDERING ONE OR MORE CAPTURED AUDIO SOUNDFIELDS TO A LISTENER

    Publication No.: US20200021935A1

    Publication date: 2020-01-16

    Application No.: US16518666

    Filing date: 2019-07-22

    Abstract: A computer implemented system for rendering captured audio soundfields to a listener comprises apparatus to deliver the audio soundfields to the listener. The delivery apparatus delivers the audio soundfields to the listener with first and second audio elements perceived by the listener as emanating from first and second virtual source locations, respectively, and with the first audio element and/or the second audio element delivered to the listener from a third virtual source location. The first virtual source location and the second virtual source location are perceived by the listener as being located to the front of the listener, and the third virtual source location is located to the rear or the side of the listener.
