-
Publication No.: US20180082695A1
Publication date: 2018-03-22
Application No.: US15564125
Filing date: 2016-04-13
Inventor: Xuejing SUN , Dong SHI , Janusz KLEJSA
CPC classification number: G10L19/0017 , G10L19/0204 , G10L19/20 , G10L25/18 , G10L25/21 , G10L25/51 , G10L25/78 , H03M7/4037 , H03M7/6011 , H03M7/6017
Abstract: Disclosed are a system and a computer program product for encoding audio content, and a corresponding method. The method includes determining a characteristic of the audio content, the characteristic including at least one of a type or a property of the audio content. The method also includes classifying the audio content based on the characteristic and determining probabilities for multiple predefined audio coding symbols associated with the audio content by calculating a probability for each of the audio coding symbols based on the result of the classification, the probability for an audio coding symbol indicating a frequency at which the audio coding symbol occurs in the audio content. Further, the method encodes the audio content based on the audio coding symbols and the corresponding probabilities to obtain a code value, the code value representing a compression coding format of the audio content.
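The classify-then-code flow in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the class labels, the zero-crossing feature and its threshold, the three-symbol alphabet, and the probability tables are hypothetical, and exact arithmetic coding is used as one plausible way to turn symbol probabilities into a code value; the abstract does not name a specific coder.

```python
from fractions import Fraction
from math import log2

# Hypothetical per-class symbol probabilities (not taken from the patent).
PROB_TABLES = {
    "speech": {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)},
    "music": {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)},
}

def classify(zero_crossing_rate):
    """Toy classifier: an assumed threshold on one hypothetical feature."""
    return "speech" if zero_crossing_rate > 0.1 else "music"

def arithmetic_interval(symbols, probs):
    """Narrow the interval [low, low + width) once per symbol.

    Any number inside the final interval identifies the symbol sequence.
    """
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        cum = Fraction(0)
        for sym, p in probs.items():  # dict order fixes the cumulative layout
            if sym == s:
                break
            cum += p
        low += width * cum
        width *= probs[s]
    return low, width

def ideal_bits(symbols, probs):
    """Information content of the sequence under the chosen table, in bits."""
    return sum(-log2(probs[s]) for s in symbols)

table = PROB_TABLES[classify(zero_crossing_rate=0.2)]
low, width = arithmetic_interval("aab", table)
print(low, width, ideal_bits("aab", table))
```

A probability table that matches the content class yields a wider final interval, and therefore a shorter code, than a mismatched table for the same symbols.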
-
Publication No.: US20170125022A1
Publication date: 2017-05-04
Application No.: US15369768
Filing date: 2016-12-05
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Shen HUANG , Xuejing SUN
IPC: G10L19/005
CPC classification number: G10L19/005 , G10L19/0017
Abstract: The present document relates to audio signal processing in general, and to the concealment of artifacts that result from loss of audio packets during audio transmission over a packet-switched network, in particular. A method (200) for concealing one or more consecutive lost packets (412, 413) is described. A lost packet (412) is a packet which is deemed to be lost by a transform-based audio decoder. Each of the one or more lost packets (412, 413) comprises a set of transform coefficients (313). A set of transform coefficients (313) is used by the transform-based audio decoder to generate a corresponding frame (412, 413) of a time domain audio signal. The method (200) comprises determining (205) for a current lost packet (412) of the one or more lost packets (412, 413) a number of preceding lost packets from the one or more lost packets (313); wherein the determined number is referred to as a loss position. Furthermore, the method comprises determining a packet loss concealment, referred to as PLC, scheme based on the loss position of the current packet; and determining (204, 207, 208) an estimate of a current frame (422) of the audio signal using the determined PLC scheme (204, 207, 208); wherein the current frame (422) corresponds to the current lost packet (412).
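A minimal sketch of selecting a concealment scheme from the loss position might look as follows. The scheme names ("repeat", "attenuate", "mute"), the threshold of three packets, and the halving gain are assumptions chosen for illustration; the abstract only states that the scheme depends on the loss position.

```python
def loss_positions(received):
    """For each packet flag (True = received), return the number of
    consecutive lost packets immediately preceding it, or None if the
    packet itself was received."""
    positions, run = [], 0
    for ok in received:
        if ok:
            positions.append(None)
            run = 0
        else:
            positions.append(run)
            run += 1
    return positions

def select_plc_scheme(loss_position):
    """Hypothetical mapping from loss position to a concealment scheme."""
    if loss_position == 0:
        return "repeat"      # first loss in a burst: reuse last good data
    if loss_position < 3:
        return "attenuate"   # short burst: repeat with decaying gain
    return "mute"            # long burst: fade to silence

def conceal(last_good_coeffs, loss_position):
    """Estimate transform coefficients for the current lost packet."""
    scheme = select_plc_scheme(loss_position)
    if scheme == "repeat":
        gain = 1.0
    elif scheme == "attenuate":
        gain = 0.5 ** loss_position  # assumed decay per consecutive loss
    else:
        gain = 0.0
    return [gain * c for c in last_good_coeffs]

print(loss_positions([True, False, False, True, False]))
print(conceal([1.0, -2.0], loss_position=1))
```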
-
Publication No.: US20170118142A1
Publication date: 2017-04-27
Application No.: US15397990
Filing date: 2017-01-04
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Glenn N. DICKINS , Xuejing SUN , Brendon COSTA
IPC: H04L12/861 , G10L25/78 , G10L19/16 , H04M3/56 , H04L29/06
CPC classification number: H04L49/90 , G10L19/167 , G10L25/78 , H04L65/1066 , H04M3/569
Abstract: A voice communication method and apparatus, and a method and apparatus for operating a jitter buffer, are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence immediately precedes the present audio block. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
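The cache-then-flush behaviour on voice onset can be sketched with a bounded queue. The class name `OnsetTransmitter`, the `preroll_len` parameter, and the `(block, reprocessed)` tuple format are hypothetical; the voice/non-voice decision is assumed to come from an external detector.

```python
from collections import deque

class OnsetTransmitter:
    """Caches recent non-voice blocks and releases them on voice onset."""

    def __init__(self, preroll_len=3):
        # Holds at most preroll_len of the most recent non-voice blocks.
        self.cache = deque(maxlen=preroll_len)
        self.in_voice = False

    def push(self, block, is_voice):
        """Return the list of (block, reprocessed) pairs to transmit now."""
        if not is_voice:
            self.cache.append(block)  # non-voice: cache, send nothing
            self.in_voice = False
            return []
        out = []
        if not self.in_voice:  # voice onset: flush the cached pre-roll
            out.extend((b, True) for b in self.cache)
            self.cache.clear()
        self.in_voice = True
        out.append((block, False))
        return out

tx = OnsetTransmitter(preroll_len=2)
for name, voiced in [("n1", False), ("n2", False), ("n3", False), ("v1", True)]:
    print(name, tx.push(name, voiced))
```

Early in a stream the cache may hold fewer blocks than the configured length; this sketch simply sends whatever is available.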
-
Publication No.: US20150350099A1
Publication date: 2015-12-03
Application No.: US14654346
Filing date: 2013-12-19
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Xuejing SUN , Zhiwei SHUANG
IPC: H04L12/875 , H04L12/841
CPC classification number: H04L47/56 , H04L47/2416 , H04L47/283 , H04L65/80
Abstract: Apparatus and methods for controlling a jitter buffer are described. In one embodiment, the apparatus for controlling a jitter buffer includes an inter-talkspurt delay jitter estimator for estimating an offset value of the delay of a first frame in the current talkspurt with respect to the delay of a latest anchor frame in a previous talkspurt, and a jitter buffer controller for adjusting a length of the jitter buffer based on a long term length of the jitter buffer for each frame and the offset value.
-
Publication No.: US20220328060A1
Publication date: 2022-10-13
Application No.: US17723317
Filing date: 2022-04-18
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Xuejing SUN , Glenn N. DICKINS
IPC: G10L21/0364 , G10L21/0316 , G10K11/16 , H03G3/32 , G10L21/0224 , G10L21/034 , G10L25/78 , H03G3/30
Abstract: A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with the smoothing factor possibly dependent on the gain delta.
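Delta gain smoothing as described can be sketched as a first-order recursive filter whose smoothing factor is a function of the gain delta. The linear mapping `alpha = min(1, k * delta)` and the parameters `k` and `initial` are assumptions for illustration; the abstract states only that the factor depends on the gain delta, not how.

```python
def delta_gain_smooth(raw_gains, k=0.5, initial=1.0):
    """Smooth a sequence of per-frame raw gains.

    The gain delta is |raw gain of the current frame - post-processed gain
    of the previous frame|. Here a larger delta gives a larger smoothing
    factor (faster tracking), which is an assumed design choice.
    """
    smoothed, prev = [], initial
    for g in raw_gains:
        delta = abs(g - prev)
        alpha = min(1.0, k * delta)       # hypothetical delta-to-factor mapping
        prev = prev + alpha * (g - prev)  # first-order smoothing filter
        smoothed.append(prev)
    return smoothed

print(delta_gain_smooth([1.0, 1.0], k=0.5, initial=0.0))
```

With this mapping, small frame-to-frame fluctuations are strongly smoothed while large gain changes pass through more quickly.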
-
Publication No.: US20220263423A9
Publication date: 2022-08-18
Application No.: US16718964
Filing date: 2019-12-18
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Xuejing SUN , Zhiwei SHUANG
Abstract: Apparatus and methods for controlling a jitter buffer are described. In one embodiment, the apparatus for controlling a jitter buffer includes an inter-talkspurt delay jitter estimator for estimating an offset value of the delay of a first frame in the current talkspurt with respect to the delay of a latest anchor frame in a previous talkspurt, and a jitter buffer controller for adjusting a length of the jitter buffer based on a long term length of the jitter buffer for each frame and the offset value.
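A rough sketch of the two pieces named in this abstract, the inter-talkspurt offset estimate and the length adjustment, is given below. The frame representation as `(send_ms, arrival_ms)` pairs, the rule of adding only positive offsets to the long-term length, and the clamping bounds are all assumptions; the abstract does not specify how the two quantities are combined.

```python
def frame_delay(send_ms, arrival_ms):
    """One-way delay of a frame, ignoring clock offset between endpoints."""
    return arrival_ms - send_ms

def inter_talkspurt_offset(first_frame, anchor_frame):
    """Delay of the first frame of the current talkspurt relative to the
    delay of the latest anchor frame of the previous talkspurt.

    Each frame is a (send_ms, arrival_ms) pair (hypothetical layout)."""
    return frame_delay(*first_frame) - frame_delay(*anchor_frame)

def adjust_buffer_length(long_term_ms, offset_ms, min_ms=20, max_ms=400):
    """Assumed rule: extend the long-term length by any positive offset,
    then clamp to illustrative bounds."""
    target = long_term_ms + max(0, offset_ms)
    return max(min_ms, min(max_ms, target))

offset = inter_talkspurt_offset(first_frame=(1000, 1090), anchor_frame=(400, 460))
print(offset, adjust_buffer_length(long_term_ms=60, offset_ms=offset))
```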
-
Publication No.: US20210029009A1
Publication date: 2021-01-28
Application No.: US16988571
Filing date: 2020-08-07
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Kai LI , Xuejing SUN , Gary SPITTLE
Abstract: Some implementations involve analyzing audio packets received during a time interval that corresponds with a conversation analysis segment to determine network jitter dynamics data and conversational interactivity data. The network jitter dynamics data may provide an indication of jitter in a network that relays the audio data packets. The conversational interactivity data may provide an indication of interactivity between participants of a conversation represented by the audio data. A jitter buffer size may be controlled according to the network jitter dynamics data and the conversational interactivity data. The time interval may include a plurality of talkspurts.
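One way to picture a buffer size driven by both inputs is the sketch below. The specific metrics are assumptions: jitter is taken as the worst deviation of packet inter-arrival gaps from the nominal frame interval, interactivity as the fraction of consecutive talkspurts that change speaker, and the sizing rule gives less headroom to highly interactive conversations. None of these formulas comes from the abstract.

```python
def jitter_estimate(arrivals_ms, frame_ms=20):
    """Worst deviation of inter-arrival gaps from the nominal frame interval."""
    gaps = [b - a for a, b in zip(arrivals_ms, arrivals_ms[1:])]
    return max((abs(g - frame_ms) for g in gaps), default=0)

def interactivity(talkspurt_speakers):
    """Fraction of consecutive talkspurts where the speaker changes (0..1)."""
    pairs = list(zip(talkspurt_speakers, talkspurt_speakers[1:]))
    changes = sum(a != b for a, b in pairs)
    return changes / max(1, len(pairs))

def control_buffer_size(jitter_ms, interact, min_ms=20, max_ms=200):
    """Hypothetical rule: headroom shrinks from 2x jitter (monologue)
    to 1x jitter (rapid turn-taking), then clamp."""
    size = jitter_ms * (2.0 - interact)
    return max(min_ms, min(max_ms, size))

j = jitter_estimate([0, 20, 50, 60])
i = interactivity(["A", "B", "A", "B"])
print(j, i, control_buffer_size(40, i))
```

The trade-off being illustrated is that a larger buffer absorbs more jitter but adds latency, which is more noticeable when participants alternate quickly.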
-
Publication No.: US20200344287A1
Publication date: 2020-10-29
Application No.: US16927785
Filing date: 2020-07-13
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Shen HUANG , Doh-Suk KIM , Xuejing SUN
Abstract: A service request for communication services for communication clients is received. In response, a communication service network is set up to support the communication services. Routing metadata is generated for each of the communication clients. The routing metadata is to be used by each of the communication clients for sharing service quality information with a respective peer communication client over a light-weight peer-to-peer (P2P) network. The routing metadata is downloaded to each of the communication clients. A communication client may exchange service signaling packets or service data packets over the communication service network. When the communication client determines that there is a problematic region in a bitstream received from the communication server, the communication client can request, from a peer communication client, a service quality information portion related to the problematic region.
-
Publication No.: US20190387316A1
Publication date: 2019-12-19
Application No.: US16554654
Filing date: 2019-08-29
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Dong SHI , Xuejing SUN
IPC: H04R3/04 , G10L25/18 , H03G5/02 , H03G3/30 , G10L25/48 , G10L25/21 , H03G5/16 , G10L21/0232 , G10L21/02 , G06F3/16
Abstract: Example embodiments disclosed herein relate to separated audio analysis and processing. A system for processing an audio signal is disclosed. The system includes an audio analysis module configured to analyze an input audio signal to determine a processing parameter for the input audio signal, the input audio signal being represented in the time domain. The system also includes an audio processing module configured to process the input audio signal in parallel with the audio analysis module. The audio processing module includes a time domain filter configured to filter the input audio signal to obtain an output audio signal in the time domain, and a filter controller configured to control a filter coefficient of the time domain filter based on the processing parameter determined by the audio analysis module. A corresponding method and computer program product for processing an audio signal are also disclosed.
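The separation of analysis from processing can be sketched with a one-coefficient time-domain filter whose coefficient is updated from an analysis result. The names `analyze`, `GainFilter`, and `run`, the peak-normalisation target of 0.5, and the gain cap of 4.0 are hypothetical. True parallelism is approximated here by letting each block be filtered with the coefficient derived from the previous block, so the processing path never waits for the analysis path.

```python
def analyze(block):
    """Analysis path: derive a processing parameter from the input block.

    Hypothetical parameter: the gain that would bring the block peak to 0.5,
    capped at 4.0."""
    peak = max((abs(x) for x in block), default=0.0)
    return 1.0 if peak == 0 else min(4.0, 0.5 / peak)

class GainFilter:
    """Time-domain filter with a single controllable coefficient."""

    def __init__(self):
        self.coeff = 1.0

    def process(self, block):
        return [self.coeff * x for x in block]

def run(blocks):
    """Filter each block with the coefficient set by earlier analysis,
    then let the controller update the coefficient for the next block."""
    filt, out = GainFilter(), []
    for block in blocks:
        out.append(filt.process(block))  # processing path
        filt.coeff = analyze(block)      # analysis path + filter controller
    return out

print(run([[1.0, -1.0], [1.0, 0.5]]))
```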
-
Publication No.: US20180336902A1
Publication date: 2018-11-22
Application No.: US15546109
Filing date: 2016-02-03
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Richard J. CARTWRIGHT , Kai LI , Xuejing SUN
Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve analyzing conversational dynamics of the conference recording. Some examples may involve searching the conference recording to determine instances of segment classifications. The segment classifications may be based, at least in part, on conversational dynamics data. Some implementations may involve segmenting the conference recording into a plurality of segments, each of the segments corresponding with a time interval and at least one of the segment classifications. Some implementations allow a listener to scan through a conference recording quickly according to segments, words, topics and/or talkers of interest.