-
Publication number: US20240395267A1
Publication date: 2024-11-28
Application number: US18674555
Application date: 2024-05-24
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jia Dai, Kai Li, Richard J. Cartwright
Abstract: The present disclosure relates to the field of audio enhancement, and in particular to methods, devices and software for supervised training of a machine learning model, MLM, the MLM trained to enhance a degraded audio signal by calculating gains to be applied to frequency bands of the degraded audio signal. The present disclosure further relates to methods, devices and software for use of such a trained MLM.
-
Publication number: US20240290341A1
Publication date: 2024-08-29
Application number: US18571963
Application date: 2022-06-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Kai Li, Jia Dai, Xiaoyu Liu
IPC: G10L21/0232, G10L21/0208, G10L25/21, G10L25/30
CPC classification number: G10L21/0232, G10L25/21, G10L25/30, G10L2021/02082, G10L2021/02087
Abstract: A system for mitigating over-suppression of speech and other non-noise signals is disclosed. In some embodiments, a system is programmed to train a first machine learning model for speech detection or enhancement using a non-linear, asymmetric loss function that penalizes speech over-suppression more than speech under-suppression. The first machine learning model is configured to receive an audio signal and generate a mask indicating an amount of speech present in the audio signal. The mask can be adjusted to remedy sharp voice decay resulting from speech over-suppression. The system is also programmed to train a second machine learning model for laughter or applause detection. The system is further programmed to improve the quality of a new audio signal by applying an adjusted mask to the new audio signal except for the portions of the audio signal that have been identified as corresponding to laughter or applause.
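The asymmetric loss mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the formula, so the function name `asymmetric_loss`, the weight `over_weight`, and the exponent `power` are all assumptions chosen for illustration, not the patented loss. The sketch only shows the general idea: an error where the predicted mask falls below the target (speech over-suppressed) costs more than an error of the same size in the other direction.

```python
def asymmetric_loss(predicted, target, over_weight=4.0, power=2.0):
    """Mean weighted error between a predicted and a target speech mask.

    Hypothetical sketch: errors where the predicted mask is lower than the
    target (speech over-suppression) are multiplied by `over_weight`;
    errors in the other direction (under-suppression) keep weight 1.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("masks must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(predicted, target):
        diff = t - p  # positive when the predicted mask is too low
        weight = over_weight if diff > 0 else 1.0
        # Raising the error to `power` makes the penalty non-linear.
        total += weight * abs(diff) ** power
    return total / len(predicted)


over = asymmetric_loss([0.5], [1.0])   # mask too low: over-suppression
under = asymmetric_loss([1.0], [0.5])  # mask too high: under-suppression
print(over > under)
```

With these assumed defaults, an over-suppression error is penalized four times as heavily as an under-suppression error of equal magnitude.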
-
Publication number: US11632318B2
Publication date: 2023-04-18
Application number: US16988571
Application date: 2020-08-07
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Kai Li, Xuejing Sun, Gary Spittle
IPC: H04L43/0852, H04L65/80, H04J3/06, H04L65/00, G10L15/08, G10L25/93, H04L43/16, H04L65/403, G10L25/48, G10L25/78
Abstract: Some implementations involve analyzing audio packets received during a time interval that corresponds with a conversation analysis segment to determine network jitter dynamics data and conversational interactivity data. The network jitter dynamics data may provide an indication of jitter in a network that relays the audio data packets. The conversational interactivity data may provide an indication of interactivity between participants of a conversation represented by the audio data. A jitter buffer size may be controlled according to the network jitter dynamics data and the conversational interactivity data. The time interval may include a plurality of talkspurts.
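As a rough illustration of controlling a jitter buffer from both inputs named in the abstract above, the following Python sketch may help. The abstract does not say how the two quantities are measured or combined, so everything here is an assumption: the function name `jitter_buffer_ms`, the use of a 95th-percentile delay variation as the jitter measure, an interactivity score in [0, 1], and the linear scaling rule.

```python
def jitter_buffer_ms(delay_variations_ms, interactivity,
                     min_ms=20.0, max_ms=200.0):
    """Pick a jitter buffer size for one conversation analysis segment.

    Hypothetical sketch: `delay_variations_ms` holds per-packet delay
    variation for the segment, and `interactivity` is a score in [0, 1]
    (1 = rapid turn-taking). Higher interactivity shrinks the buffer to
    keep latency low; lower interactivity lets it cover more jitter.
    """
    if not delay_variations_ms:
        raise ValueError("need at least one delay measurement")
    if not 0.0 <= interactivity <= 1.0:
        raise ValueError("interactivity must be in [0, 1]")
    ordered = sorted(delay_variations_ms)
    # Approximate 95th percentile of the observed delay variation.
    index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    jitter = ordered[index]
    # Assumed rule: fully interactive talk covers only half the jitter tail.
    scale = 1.0 - 0.5 * interactivity
    return max(min_ms, min(max_ms, jitter * scale))


print(jitter_buffer_ms([10, 20, 30, 40, 100], interactivity=0.0))
print(jitter_buffer_ms([10, 20, 30, 40, 100], interactivity=1.0))
```

The trade-off the sketch tries to capture is that a larger buffer hides more network jitter but adds delay, which matters most when participants are exchanging short turns.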
-
Publication number: US20220383889A1
Publication date: 2022-12-01
Application number: US17627116
Application date: 2020-07-16
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Yuanxing Ma, Kai Li, Qianqian Fang
IPC: G10L21/0216, G10L25/18
Abstract: A method is disclosed herein for adapting parameters of a sibilance detector. Time-frequency features are extracted from an audio signal being received. Based on those time-frequency features, a determination is made of whether the audio signal includes a short-term feature or a long-term feature. In accordance with determining that the audio signal includes the short-term feature or the long-term feature, one or more parameters of a sibilance detector for detecting sibilance in the audio signal are adapted. Sibilance in the audio signal is detected using the sibilance detector with the one or more adapted parameters.
-
Publication number: US10522151B2
Publication date: 2019-12-31
Application number: US15546109
Application date: 2016-02-03
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Richard J. Cartwright, Kai Li, Xuejing Sun
IPC: G10L17/00, G06N20/00, G06F16/61, G06F16/68, H04M3/42, H04M3/56, G10L25/48, G06F17/27, G10L17/02, G10L25/78, G10L15/26
Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve analyzing conversational dynamics of the conference recording. Some examples may involve searching the conference recording to determine instances of segment classifications. The segment classifications may be based, at least in part, on conversational dynamics data. Some implementations may involve segmenting the conference recording into a plurality of segments, each of the segments corresponding with a time interval and at least one of the segment classifications. Some implementations allow a listener to scan through a conference recording quickly according to segments, words, topics and/or talkers of interest.
-
Publication number: US09871912B2
Publication date: 2018-01-16
Application number: US15109511
Application date: 2015-01-06
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Kai Li, Glenn N. Dickins, Xuejing Sun
IPC: H04M1/24, H04M3/08, H04M3/22, H04M3/00, H04M5/00, H04M3/56, G10L25/63, G10L25/60, H04L12/24
CPC classification number: H04M3/2227, G10L25/60, G10L25/63, H04L41/5009, H04M3/2236, H04M3/568, H04M3/569, H04M2203/2038
Abstract: In a conference call having a plurality of participants interacting in a conference exchange of information in a digital transmission environment, the interaction being across a variable network transmission resource, a method of allocating the level of transmission resource, the method including the steps of: (a) monitoring predetermined aspects of the participants' behavior during the conference call; (b) determining a divergence of the participants' behavior from normative values; (c) utilizing any divergence as an indicator of aberrant operation of the participants; and (d) allocating the resource determinative on the divergence of the participants' behavior from normative values.
-
Publication number: US09653092B2
Publication date: 2017-05-16
Application number: US14651564
Application date: 2013-11-25
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Xuejing Sun, Dong Shi, Kai Li
IPC: G10L21/00, H04M9/08, G10L21/0208, G10L21/0264, H04B3/23, H04L12/66, H04J3/10, H04L12/16, H04B3/20, H04B1/38, H04L27/00, G06F17/00, H04L12/861
CPC classification number: G10L21/0208, G10L21/0264, G10L2021/02082, H04B3/23, H04L49/90, H04M9/082
Abstract: A method for controlling acoustic echo cancellation and an audio processing apparatus are described. In one embodiment, the audio processing apparatus includes an acoustic echo canceller for suppressing acoustic echo in a microphone signal, a jitter buffer for reducing delay jitter of a received signal, and a joint controller for controlling the acoustic echo canceller by referring to at least one future frame in the jitter buffer.
-
Publication number: US11996108B2
Publication date: 2024-05-28
Application number: US17632220
Application date: 2020-07-30
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jia Dai, Kai Li, Richard J. Cartwright
CPC classification number: G10L19/0208, G06N20/00, G10L19/005, G10L25/18, G10L25/21, H04M3/568
Abstract: The present disclosure relates to the field of audio enhancement, and in particular to methods, devices and software for supervised training of a machine learning model, MLM, the MLM trained to enhance a degraded audio signal by calculating gains to be applied to frequency bands of the degraded audio signal. The present disclosure further relates to methods, devices and software for use of such a trained MLM.
-
Publication number: US20220319526A1
Publication date: 2022-10-06
Application number: US17639286
Application date: 2020-08-27
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Yanmeng Guo, Kai Li
IPC: G10L19/008
Abstract: A method for channel identification of a multi-channel audio signal comprising X>1 channels is provided. The method comprises the steps of: identifying, among the X channels, any empty channels, thus resulting in a subset of Y≤X non-empty channels; determining whether a low frequency effect (LFE) channel is present among the Y channels, and upon determining that an LFE channel is present, identifying the determined channel among the Y channels as the LFE channel; dividing the remaining channels among the Y channels not being identified as the LFE channel into any number of pairs of channels by matching symmetrical channels; and identifying any remaining unpaired channel among the Y channels not being identified as the LFE channel or divided into pairs as a center channel.
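The four steps in the abstract above (find empty channels, find an LFE channel, pair symmetrical channels, label what is left as center) can be sketched in Python as follows. The abstract does not state how an LFE channel is detected or how symmetry is measured, so the sketch takes both as caller-supplied functions: `is_lfe` and `similarity` are hypothetical hooks, and `eps`, `pair_threshold`, and the greedy pairing strategy are likewise assumptions made for illustration.

```python
def identify_channels(channels, is_lfe, similarity,
                      eps=1e-6, pair_threshold=0.8):
    """Label each channel as 'empty', 'lfe', 'pairN', or 'center'.

    Hypothetical sketch: `channels` is a list of sample lists,
    `is_lfe(samples)` returns True for a low-frequency-effects channel,
    and `similarity(a, b)` returns a score where higher means the two
    channels look more like a symmetrical (e.g. left/right) pair.
    """
    labels = {}
    # Step 1: channels with no signal above eps are treated as empty.
    non_empty = []
    for i, samples in enumerate(channels):
        if all(abs(s) < eps for s in samples):
            labels[i] = "empty"
        else:
            non_empty.append(i)
    # Step 2: mark the first non-empty channel that tests as LFE.
    remaining = list(non_empty)
    for i in non_empty:
        if is_lfe(channels[i]):
            labels[i] = "lfe"
            remaining.remove(i)
            break
    # Step 3: greedily pair each channel with its most similar partner.
    pair_id = 0
    unpaired = []
    while remaining:
        i = remaining.pop(0)
        best, best_score = None, pair_threshold
        for j in remaining:
            score = similarity(channels[i], channels[j])
            if score >= best_score:
                best, best_score = j, score
        if best is None:
            unpaired.append(i)
        else:
            remaining.remove(best)
            labels[i] = labels[best] = f"pair{pair_id}"
            pair_id += 1
    # Step 4: any channel left without a partner is labeled center.
    for i in unpaired:
        labels[i] = "center"
    return labels
```

A caller would supply its own detectors, for example a spectral test for `is_lfe` and a correlation measure for `similarity`; the sketch only fixes the order of the steps.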
-
Publication number: US20220270625A1
Publication date: 2022-08-25
Application number: US17632220
Application date: 2020-07-30
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jia Dai, Kai Li, Richard J. Cartwright
IPC: G10L19/02, G10L25/18, G10L25/21, G10L19/005, G06N20/00
Abstract: The present disclosure relates to the field of audio enhancement, and in particular to methods, devices and software for supervised training of a machine learning model, MLM, the MLM trained to enhance a degraded audio signal by calculating gains to be applied to frequency bands of the degraded audio signal. The present disclosure further relates to methods, devices and software for use of such a trained MLM.