-
Publication number: US20240196156A1
Publication date: 2024-06-13
Application number: US18436010
Filing date: 2024-02-07
Inventor: Rishabh TYAGI , Stefan BRUHN , Juan Felix TORRES
IPC: H04S7/00 , G10L19/008 , H04S3/00
CPC classification number: H04S7/304 , G10L19/008 , H04S3/008 , H04S2400/01 , H04S2400/03 , H04S2400/11
Abstract: An aspect of the present disclosure relates to processing audio comprising decoding a first bitstream (b1) to obtain decoded immersive audio content (A), decoding a second bitstream (bp) to obtain pose information (P, V, V′) associated with a user of a lightweight processing device, determining a first head-pose (P′) based on the pose information, providing a downmix representation (Dmx) of the immersive audio content (A) corresponding to the first head pose (P′), rendering a set of binaural representations (BINn) of the immersive audio content (A), wherein the binaural representations correspond to a second set of head poses (Pn), computing reconstruction metadata (M) to enable reconstruction of the set of binaural representations from the downmix representation (Dmx), the metadata (M) including the first head pose (P′), and encoding the downmix representation (Dmx) and the reconstruction metadata (M) in a third bitstream (b2).
-
Publication number: US20240105192A1
Publication date: 2024-03-28
Application number: US18255506
Filing date: 2021-12-01
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Rishabh TYAGI , Michael ECKERT
IPC: G10L19/03 , G10L19/008 , G10L21/0216
CPC classification number: G10L19/03 , G10L19/008 , G10L21/0216
Abstract: Embodiments are disclosed for spatial noise filling in multi-channel codecs. In an embodiment, a method of regenerating background noise ambience in a multi-channel codec by generating spatial hole filling noise comprises: computing noise estimates based on a primary downmix channel generated from an input audio signal representing a spatial audio scene with background noise ambience; computing spectral shaping filter coefficients based on the noise estimates; spectrally shaping the multi-channel noise signal using the spectral shaping filter coefficients and a noise distribution, the spectral shaping resulting in a diffused, multi-channel noise signal with uncorrelated channels; spatially shaping the diffused, uncorrelated multi-channel noise signal with uncorrelated channels based on a noise ambience of the spatial audio scene; and adding the spatially and spectrally shaped multi-channel noise to a multi-channel codec output to synthesize the background noise ambience of the spatial audio scene.
-
Publication number: US20240304196A1
Publication date: 2024-09-12
Application number: US18551134
Filing date: 2022-04-01
Inventor: Rishabh TYAGI , Heiko PURNHAGEN
IPC: G10L19/02 , G10L19/008
CPC classification number: G10L19/0204 , G10L19/008
Abstract: A method for multi-band ducking of audio signals is provided. In some implementations, the method involves receiving, at a decoder, an input audio signal, wherein the input audio signal is a downmixed audio signal. In some implementations, the method involves separating the input audio signal into a first set of frequency bands. In some implementations, the method involves determining a set of ducking gains, a ducking gain corresponding to a frequency band of the first set of frequency bands. In some implementations, the method involves generating at least one broadband decorrelated audio signal, wherein ducking gains of the set of ducking gains are applied to at least one of: 1) a second set of frequency bands prior to generating the at least one broadband decorrelated audio signal; or 2) a third set of frequency bands that separates the at least one broadband decorrelated audio signal.
-
Publication number: US20220406318A1
Publication date: 2022-12-22
Application number: US17772497
Filing date: 2020-10-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Rishabh TYAGI , Juan Felix TORRES , Stefanie BROWN
IPC: G10L19/032 , G10L19/16 , G10L19/008
Abstract: Embodiments are disclosed for bitrate distribution in immersive voice and audio services. In an embodiment, a method of encoding an IVAS bitstream comprises: receiving an input audio signal; downmixing the input audio signal into one or more downmix channels and spatial metadata; reading a set of one or more bitrates for the downmix channels and a set of quantization levels for the spatial metadata from a bitrate distribution control table; determining a combination of the one or more bitrates for the downmix channels; determining a metadata quantization level from the set of metadata quantization levels using a bitrate distribution process; quantizing and coding the spatial metadata using the metadata quantization level; generating, using the combination of one or more bitrates, a downmix bitstream for the one or more downmix channels; combining the downmix bitstream, the quantized and coded spatial metadata and the set of quantization levels into the IVAS bitstream.
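Illustrative sketch (not taken from the application): one way the table-driven selection of a metadata quantization level described in the abstract above could look. The `TableRow` layout, the cost model of `ceil(log2(level))` bits per parameter, the 50 frames-per-second rate, and every numeric value are assumptions made for illustration only; the actual bitrate distribution process in the application may differ substantially.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRow:
    """One hypothetical row of a bitrate distribution control table."""
    total_bps: int      # total bitrate available for the frame stream
    downmix_bps: tuple  # target bitrate for each downmix channel
    md_levels: tuple    # metadata quantization levels, finest first


def choose_md_level(row, num_params, frame_rate=50):
    """Return the finest metadata quantization level that fits the budget.

    Assumed cost model: each parameter costs ceil(log2(level)) bits per frame.
    """
    budget_bits = row.total_bps // frame_rate
    downmix_bits = sum(row.downmix_bps) // frame_rate
    for level in row.md_levels:
        md_bits = num_params * math.ceil(math.log2(level))
        if downmix_bits + md_bits <= budget_bits:
            return level
    raise ValueError("no metadata quantization level fits the bit budget")


# Hypothetical table entry: 32 kbps total, one downmix channel at 24 kbps.
row = TableRow(total_bps=32000, downmix_bps=(24000,), md_levels=(31, 15, 7))
level = choose_md_level(row, num_params=36)
```

With these made-up numbers, 160 bits per frame remain for metadata, so the 31-level quantizer (180 bits) is rejected and the 15-level quantizer (144 bits) is selected.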
-
Publication number: US20220293112A1
Publication date: 2022-09-15
Application number: US17635795
Filing date: 2020-09-01
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Rishabh TYAGI , David MCGRATH
IPC: G10L19/02 , G10L25/21 , G10L19/032 , G10L19/008 , G10L19/16 , G10L25/18
Abstract: In some implementations, a method of encoding a low-frequency effect (LFE) channel comprises: receiving a time-domain LFE channel signal; filtering, using a low-pass filter, the time-domain LFE channel signal; converting the filtered time-domain LFE channel signal into a frequency-domain representation of the LFE channel signal that includes a number of coefficients representing a frequency spectrum of the LFE channel signal; arranging coefficients into a number of subband groups corresponding to different frequency bands of the LFE channel signal; quantizing coefficients in each subband group according to a frequency response curve of the low-pass filter; encoding the quantized coefficients in each subband group using an entropy coder tuned for the subband group; generating a bitstream including the encoded quantized coefficients; and storing the bitstream on a storage device or streaming the bitstream to a downstream device.
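Illustrative sketch (not taken from the application): the filter, transform, group, and quantize steps of the abstract above, reduced to a few lines. The one-pole low-pass filter, the naive DCT-II, the subband edges, and the step sizes are all stand-ins chosen for brevity. The assumption that higher subbands get coarser steps because the low-pass filter attenuates them is one reading of "according to a frequency response curve". Entropy coding and bitstream packing are omitted.

```python
import math


def low_pass(samples, alpha=0.2):
    """One-pole low-pass filter (a stand-in for the encoder's low-pass filter)."""
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def dct2(samples):
    """Naive, unnormalized DCT-II used as the frequency-domain transform."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def encode_lfe_frame(samples, edges, steps):
    """Filter, transform, split into subband groups, and quantize each group.

    edges: coefficient index boundaries of the subband groups (hypothetical).
    steps: one quantizer step size per subband group (hypothetical).
    """
    coeffs = dct2(low_pass(samples))
    groups = [coeffs[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]
    return [[round(c / step) for c in group] for group, step in zip(groups, steps)]


# Three subband groups with progressively coarser quantization.
frame = [math.sin(2 * math.pi * 2 * i / 16) for i in range(16)]
quantized = encode_lfe_frame(frame, edges=[0, 4, 8, 16], steps=[0.05, 0.1, 0.4])
```

In a fuller encoder, each quantized group would then be passed to an entropy coder tuned to that group's statistics before being written to the bitstream.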
-
Publication number: US20250095660A1
Publication date: 2025-03-20
Application number: US18729248
Filing date: 2023-01-09
Inventor: Stefanie BROWN , Stefan BRUHN , Rishabh TYAGI
IPC: G10L19/008 , G10L19/002 , G10L19/02 , G10L19/025 , G10L19/032 , G10L19/06
Abstract: Described herein is a method of encoding Higher Order Ambisonics, HOA, audio, the method including: receiving an input HOA audio signal having more than four Ambisonics channels; encoding the HOA audio signal using a SPAR coding framework and a core audio encoder; and providing the encoded HOA audio signal to a downstream device, the encoded HOA audio signal including core encoded SPAR downmix channels and encoded SPAR metadata. Further described are a method of decoding Higher Order Ambisonics, HOA, audio, respective apparatuses and computer program products.
-
Publication number: US20230343346A1
Publication date: 2023-10-26
Application number: US18008445
Filing date: 2021-06-10
Applicant: Dolby Laboratories Licensing Corporation
Inventor: David S. MCGRATH , Rishabh TYAGI , Stefanie BROWN , Juan Felix TORRES
IPC: G10L19/032 , G10L19/008
CPC classification number: G10L19/032 , G10L19/008
Abstract: Described is a method of frame-wise encoding metadata for an input signal, the metadata comprising a plurality of at least partially interrelated parameters calculable from the input signal. The method comprises, for each frame: iteratively performing, by using a looping process, steps of: determining a processing strategy from a plurality of processing strategies for calculating and quantizing the parameters; calculating and quantizing the parameters based on the determined processing strategy to obtain quantized parameters; and encoding the quantized parameters. In particular, each of the plurality of processing strategies comprises a respective first indication indicative of an ordering related to the calculation and quantization of individual parameters; and the processing strategy is determined based on at least one bitrate threshold.
-
Publication number: US20230215445A1
Publication date: 2023-07-06
Application number: US18000862
Filing date: 2021-06-10
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Michael ECKERT , Rishabh TYAGI
IPC: G10L19/008 , G10L25/78 , G10L21/0224
CPC classification number: G10L19/008 , G10L25/78 , G10L21/0224
Abstract: The present document describes a method (600) for encoding a multi-channel input signal (101) which comprises N different channels. The method (600) comprises, for a current frame of a sequence of frames, determining (601) whether the current frame is an active frame or an inactive frame using a signal and/or a voice activity detector, and determining (602) a downmix signal (103) based on the multi-channel input signal (101), wherein the downmix signal (103) comprises N channels or less. In addition, the method (600) comprises determining (603) upmixing metadata (105) comprising a set of parameters for generating, based on the downmix signal (103), a reconstructed multi-channel signal (111) comprising N channels, wherein the upmixing metadata (105) is determined in dependence of whether the current frame is an active frame or an inactive frame. The method (600) further comprises encoding (604) the upmixing metadata (105) into a bitstream.
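Illustrative sketch (not taken from the application): the per-frame flow of the abstract above, with an energy threshold standing in for the signal/voice activity detector, a channel average standing in for the downmix, and per-channel least-squares gains standing in for the upmixing metadata. The threshold value, the dictionary layout, and the choice to send a single averaged gain for inactive frames are all assumptions made for illustration.

```python
def frame_energy(frame):
    """Mean squared sample value over all channels of a frame."""
    count = sum(len(ch) for ch in frame)
    return sum(x * x for ch in frame for x in ch) / count


def is_active(frame, threshold=1e-4):
    """Energy-based stand-in for a signal/voice activity detector."""
    return frame_energy(frame) > threshold


def downmix(frame):
    """Single-channel downmix: the average of the N input channels."""
    return [sum(samples) / len(frame) for samples in zip(*frame)]


def upmix_metadata(frame, dmx, active):
    """Per-channel gains predicting each channel from the downmix.

    Active frames keep one gain per channel; inactive frames keep only the
    average gain (a hypothetical, cheaper parameter set).
    """
    den = sum(d * d for d in dmx) or 1.0
    gains = [sum(c * d for c, d in zip(ch, dmx)) / den for ch in frame]
    if active:
        return {"active": True, "gains": gains}
    return {"active": False, "gains": [sum(gains) / len(gains)]}


def encode_frame(frame):
    """Return (downmix, metadata) for one frame of N channels."""
    active = is_active(frame)
    dmx = downmix(frame)
    return dmx, upmix_metadata(frame, dmx, active)


# Two-channel example frame with four samples per channel.
dmx, meta = encode_frame([[0.5, -0.5, 0.5, -0.5], [0.25, -0.25, 0.25, -0.25]])
```

The point of the sketch is only the dependency shown in `upmix_metadata`: the parameter set written to the bitstream changes with the active/inactive decision.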