-
Publication number: US20170164133A1
Publication date: 2017-06-08
Application number: US15323724
Application date: 2015-07-01
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: David GUNAWAN , Glenn N. DICKINS , Richard J. CARTWRIGHT
CPC classification number: H04S7/30 , H04R2420/01 , H04R2430/25 , H04S3/002 , H04S2400/11 , H04S2400/15 , H04S2420/05 , H04S2420/11
Abstract: A method for altering an audio signal of interest in a multi-channel soundfield representation of an audio environment, the method including the steps of: (a) extracting the signal of interest from the soundfield representation; (b) determining a residual soundfield signal; (c) inputting a further associated audio signal, which is associated with the signal of interest; (d) transforming the associated audio signal into a corresponding associated soundfield signal compatible with the residual soundfield; and (e) combining the residual soundfield signal with the associated soundfield signal to produce an output soundfield signal.
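Steps (a) to (e) can be illustrated with a minimal sketch. It is not the claimed method: it assumes the signal of interest arrives from a single known direction described by a hypothetical per-channel `steering` vector, extracts it by least-squares projection, and re-pans the associated signal with the same vector. The function and parameter names are illustrative only.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def alter_soundfield(frames, steering, associated):
    """Replace the signal of interest in a multi-channel soundfield.

    frames:     list of per-sample channel vectors (the input soundfield)
    steering:   assumed per-channel gains of the signal of interest
    associated: mono samples of the associated (replacement) signal
    """
    norm = dot(steering, steering)
    output = []
    for frame, new_sample in zip(frames, associated):
        # (a) extract the signal of interest by projecting onto the steering vector
        extracted = dot(steering, frame) / norm
        # (b) residual soundfield: what remains after removing the extracted signal
        residual = [x - g * extracted for x, g in zip(frame, steering)]
        # (c)+(d) transform the associated signal into a compatible soundfield
        associated_field = [g * new_sample for g in steering]
        # (e) combine the residual with the associated soundfield
        output.append([r + a for r, a in zip(residual, associated_field)])
    return output
```

In this sketch, any ambient content orthogonal to `steering` passes through unchanged, while the component along `steering` is swapped for the associated signal.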
-
Publication number: US20240363131A1
Publication date: 2024-10-31
Application number: US18577597
Application date: 2022-07-12
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jia DAI , Kai LI , Xiaoyu LIU , Richard J. CARTWRIGHT , Shaofan YANG
IPC: G10L21/0208 , G10L25/27
CPC classification number: G10L21/0208 , G10L25/27 , G10L2021/02082
Abstract: A method for dereverberating audio signals is provided. In some implementations, the method involves obtaining a real acoustic impulse response (AIR); identifying a first portion of the real AIR corresponding to early reflections of a direct sound and a second portion of the real AIR corresponding to late reflections of the direct sound; generating one or more synthesized AIRs by modifying the first portion of the real AIR and/or the second portion of the real AIR; and using the real AIR and the one or more synthesized AIRs to generate a plurality of training samples, each training sample comprising an input audio signal and a reverberated audio signal, wherein the reverberated audio signal is generated based on the input audio signal and one of the real AIR or one of the one or more synthesized AIRs, which plurality of training samples are used to train a machine learning model.
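The data-generation idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the early/late boundary is placed a fixed `early_ms` after the strongest tap (an assumed heuristic), the "modification" is a simple gain on each portion, and all names are hypothetical.

```python
def split_air(air, sample_rate, early_ms=50.0):
    """Split an AIR into direct sound, early reflections, and late reflections."""
    peak = max(range(len(air)), key=lambda i: abs(air[i]))
    early_len = int(sample_rate * early_ms / 1000.0)
    boundary = min(len(air), peak + 1 + early_len)
    return air[:peak + 1], air[peak + 1:boundary], air[boundary:]


def synthesize_air(air, sample_rate, early_gain, late_gain, early_ms=50.0):
    """Build a synthesized AIR by rescaling the early and/or late portion."""
    direct, early, late = split_air(air, sample_rate, early_ms)
    return direct + [early_gain * x for x in early] + [late_gain * x for x in late]


def convolve(signal, air):
    """Direct-form convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(air) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(air):
            out[i + j] += s * h
    return out


def make_training_samples(clean_signals, real_air, sample_rate, gains, early_ms=50.0):
    """Pair each clean signal with a reverberated copy for every real or synthesized AIR."""
    airs = [real_air] + [
        synthesize_air(real_air, sample_rate, eg, lg, early_ms) for eg, lg in gains
    ]
    return [(clean, convolve(clean, air)) for clean in clean_signals for air in airs]
```

Each returned pair is (input audio signal, reverberated audio signal); how those pairs are fed to a machine learning model is outside the scope of this sketch.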
-
Publication number: US20240267469A1
Publication date: 2024-08-08
Application number: US18638588
Application date: 2024-04-17
Inventor: Glenn N. DICKINS , Christopher Graham HINES , David GUNAWAN , Richard J. CARTWRIGHT , Alan J. SEEFELDT , Daniel ARTEAGA , Mark R.P. THOMAS , Joshua B. LANDO
CPC classification number: H04M9/082 , G10L15/22 , G10L2015/223
Abstract: An audio processing method may involve receiving output signals from each microphone of a plurality of microphones in an audio environment, the output signals corresponding to a current utterance of a person and determining, based on the output signals, one or more aspects of context information relating to the person, including an estimated current proximity of the person to one or more microphone locations. The method may involve selecting two or more loudspeaker-equipped audio devices based, at least in part, on the one or more aspects of the context information, determining one or more types of audio processing changes to apply to audio data being rendered to loudspeaker feed signals for the audio devices and causing one or more types of audio processing changes to be applied. In some examples, the audio processing changes have the effect of increasing a speech to echo ratio at one or more microphones.
-
Publication number: US20240177726A1
Publication date: 2024-05-30
Application number: US18577586
Application date: 2022-07-12
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Jia DAI , Kai LI , Xiaoyu LIU , Richard J. CARTWRIGHT
IPC: G10L21/0208 , G06N3/08 , G10L21/0232
CPC classification number: G10L21/0208 , G06N3/08 , G10L21/0232 , G10L2021/02082
Abstract: A method for enhancing audio signals is provided. In some implementations, the method involves (a) obtaining a training set comprising a plurality of training samples, each training sample comprising a distorted audio signal and a clean audio signal. In some implementations, the method involves (b), for a training sample of the plurality of training samples: obtaining a frequency-domain representation of the distorted audio signal; providing the frequency-domain representation to a convolutional neural network (CNN) comprising a plurality of convolutional layers and to a recurrent element, wherein an output of the recurrent element is provided to a subset of the plurality of convolutional layers; generating a predicted enhancement mask, wherein the CNN generates the predicted enhancement mask; generating a predicted enhanced audio signal based on the predicted enhancement mask; and updating weights associated with the CNN and the recurrent element based on the predicted enhanced audio signal.
-
Publication number: US20220335937A1
Publication date: 2022-10-20
Application number: US17630895
Application date: 2020-07-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Mark R. P. THOMAS , Richard J. CARTWRIGHT
Abstract: A method for estimating a user's location in an environment may involve receiving output signals from each microphone of a plurality of microphones in the environment. At least two microphones of the plurality of microphones may be included in separate devices at separate locations in the environment and the output signals may correspond to a current utterance of a user. The method may involve determining multiple current acoustic features from the output signals of each microphone and applying a classifier to the multiple current acoustic features. Applying the classifier may involve applying a model trained on previously-determined acoustic features derived from a plurality of previous utterances made by the user in a plurality of user zones in the environment. The method may involve determining, based at least in part on output from the classifier, an estimate of the user zone in which the user is currently located.
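The train-then-classify flow described here can be sketched with a stand-in classifier. The abstract does not say which model is used; the nearest-centroid rule below is a swapped-in, deliberately simple technique, and the feature vectors (for example, one normalized level per microphone) and all names are assumptions for illustration.

```python
def train_centroids(examples):
    """Average the feature vectors of previous utterances per user zone.

    examples: list of (zone_label, feature_vector) pairs.
    """
    sums, counts = {}, {}
    for zone, features in examples:
        acc = sums.setdefault(zone, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[zone] = counts.get(zone, 0) + 1
    return {zone: [v / counts[zone] for v in acc] for zone, acc in sums.items()}


def estimate_zone(centroids, features):
    """Return the zone whose centroid is closest to the current features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda zone: squared_distance(centroids[zone]))
```

A usage example would call `train_centroids` once on labeled utterances, then `estimate_zone` on the features of each new utterance.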
-
Publication number: US20180295240A1
Publication date: 2018-10-11
Application number: US15578386
Application date: 2016-06-15
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Glenn N. DICKINS , Richard J. CARTWRIGHT
IPC: H04M3/56 , G10L21/0232 , G10L21/0316
CPC classification number: H04M3/568 , G10L21/0232 , G10L21/0316 , G10L2021/02082 , H04M3/2281 , H04M3/42221
Abstract: Teleconference audio data including a plurality of individual uplink data packet streams may be received during a teleconference. Each uplink data packet stream may correspond to a telephone endpoint used by one or more teleconference participants. The teleconference audio data may be analyzed to determine a plurality of suppressive gain coefficients, which may be applied to first instances of the teleconference audio data during the teleconference, to produce first gain-suppressed audio data provided to the telephone endpoints during the teleconference. Second instances of the teleconference audio data, as well as gain coefficient data corresponding to the plurality of suppressive gain coefficients, may be sent to a memory system as individual uplink data packet streams. The second instances of the teleconference audio data may be less gain-suppressed than the first gain-suppressed audio data.
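The split between live, gain-suppressed audio and stored, less-suppressed audio can be sketched as below. The gain rule (attenuate blocks whose RMS level falls under a threshold) is a hypothetical placeholder for whatever analysis the method actually uses, and the record layout and names are assumptions.

```python
import math


def estimate_gain(block, threshold=0.01, suppressed_gain=0.1):
    """Hypothetical analysis: attenuate low-level blocks, pass the rest."""
    rms = math.sqrt(sum(x * x for x in block) / len(block)) if block else 0.0
    return suppressed_gain if rms < threshold else 1.0


def process_uplink_blocks(blocks_by_endpoint):
    """Return (live, stored) for one block from each uplink stream.

    live:   gain-suppressed audio sent on to the endpoints during the call
    stored: unsuppressed audio plus the gain coefficient, kept per stream
    """
    live, stored = {}, {}
    for endpoint, block in blocks_by_endpoint.items():
        gain = estimate_gain(block)
        live[endpoint] = [gain * x for x in block]          # first instance
        stored[endpoint] = {"audio": list(block), "gain": gain}  # second instance
    return live, stored
```

Keeping the gain next to the unsuppressed audio means the live result can be reproduced later, or different processing can be applied to the recording.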
-
Publication number: US20180279063A1
Publication date: 2018-09-27
Application number: US15547441
Application date: 2016-02-03
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Xuejing SUN , Richard J. CARTWRIGHT , Michael P. HOLLIER , Michael ECKERT
IPC: H04S7/00 , G10L21/043
CPC classification number: H04S7/302 , G10L21/043 , H04L12/1831 , H04M3/42221 , H04M3/565 , H04M3/568 , H04M2203/305 , H04R27/00 , H04R2227/003 , H04S7/30 , H04S2400/11 , H04S2420/01
Abstract: A method for processing audio data, the method comprising: receiving audio data corresponding to a plurality of instances of audio, including at least one of: (a) audio data from multiple endpoints, recorded separately or (b) audio data from a single endpoint corresponding to multiple talkers and including spatial information for each of the multiple talkers; rendering the audio data in a virtual acoustic space such that each of the instances of audio has a respective different virtual position in the virtual acoustic space; and scheduling the instances of audio to be played back with a playback overlap between at least two of the instances of audio, wherein the scheduling is performed, at least in part, according to a set of perceptually-motivated rules.
-
Publication number: US20170272375A1
Publication date: 2017-09-21
Application number: US15460490
Application date: 2017-03-16
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes MUESCH , Richard J. CARTWRIGHT
IPC: H04L12/875 , H04L12/26 , H04L12/841
CPC classification number: H04L47/56 , H04L43/087 , H04L47/283 , H04L47/50 , H04L65/604 , H04L65/80
Abstract: Disclosed is a method and apparatus operative to process packets of media received from a network, including a receiver unit, a jitter buffer data structure, a playback head defining a point in the jitter buffer data structure from which the ordered queue of packets is to be played back, and at least one prototype head. Each prototype head has a predetermined latency assigned thereto and defines a point in the jitter buffer data structure from which the ordered queue of packets is played back with said latency. A processor is operable to determine a measure of conversational quality associated with the ordered queue of packets being played back by each prototype head. Also described is a head selector operable to compare the measures of conversational quality associated with the ordered queue of packets being played back by each prototype head to select the prototype head with the highest measure of conversational quality, and a playback unit coupled to the playback head.
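The head-selection step can be sketched as below. The abstract does not define the conversational quality measure, so the score used here (a penalty for packets arriving later than the head's latency plus a penalty proportional to that latency) is purely an assumed stand-in, as are the weights and names.

```python
def conversational_quality(delays_ms, latency_ms, late_weight=1.0, latency_weight=0.005):
    """Hypothetical quality score for a prototype head with a fixed latency.

    A packet counts as late when its network delay exceeds the head's latency.
    """
    if delays_ms:
        late_fraction = sum(1 for d in delays_ms if d > latency_ms) / len(delays_ms)
    else:
        late_fraction = 0.0
    return 1.0 - late_weight * late_fraction - latency_weight * latency_ms


def select_head(delays_ms, prototype_latencies_ms):
    """Pick the prototype latency with the highest quality score."""
    return max(
        prototype_latencies_ms,
        key=lambda latency: conversational_quality(delays_ms, latency),
    )
```

With these assumed weights, a short latency loses to a longer one when many packets would arrive late, and a very long latency loses because of its own delay penalty.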
-
Publication number: US20250086674A1
Publication date: 2025-03-13
Application number: US18883257
Application date: 2024-09-12
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Dylan James HARPER-HARRIS , Richard J. CARTWRIGHT , Richard Walter MARTY , Siqi PAN , Jianbo MA , Ron GELLER , Daniel Steven TEMPLETON , Cedric JOGUET-RECCORDON , Yin-Lee HO , Benjamin SOUTHWELL
IPC: G06Q30/0242 , G06Q30/0282
Abstract: An apparatus may include an interface system and a first local control system. The first local control system may be configured to: receive first sensor data from a first preview environment while a content stream is being presented in the first preview environment; generate, based at least in part on the first sensor data, first user engagement data corresponding to one or more people in the first preview environment, the first user engagement data indicating estimated engagement with presented content of the content stream; output, via the interface system, either the first user engagement data, the first sensor data, or both, to a data aggregation device; and determine, based at least in part on user preference data, whether to provide at least some of the first user engagement data, at least some of the first sensor data, or both, to one or more machine learning (ML) models.
-
Publication number: US20220270601A1
Publication date: 2022-08-25
Application number: US17626617
Application date: 2020-07-30
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Christopher Graham HINES , Rowan James KATEKAR , Glenn N. DICKINS , Richard J. CARTWRIGHT , Jeremiha Emile DOUGLAS , Mark R.P. THOMAS
Abstract: A method may involve receiving output signals from each microphone of a plurality of microphones in the environment, each of the plurality of microphones residing in a microphone location of the environment, the output signals corresponding to an utterance of a person. The method may involve determining, based at least in part on the output signals, a zone within the environment that has at least a threshold probability of including the person's location and generating a plurality of spatially-varying attentiveness signals within the zone. Each attentiveness signal may be generated by a device located within the zone. Each attentiveness signal may indicate that a corresponding device is in an operating mode in which the corresponding device is awaiting a command and may indicate a relevance metric of the corresponding device.