BINAURAL RENDERING
    Invention publication
    Status: under examination (published)

    Publication number: US20240196156A1

    Publication date: 2024-06-13

    Application number: US18436010

    Filing date: 2024-02-07

    Abstract: An aspect of the present disclosure relates to processing audio comprising decoding a first bitstream (b1) to obtain decoded immersive audio content (A), decoding a second bitstream (bp) to obtain pose information (P, V, V′) associated with a user of a lightweight processing device, determining a first head pose (P′) based on the pose information, providing a downmix representation (Dmx) of the immersive audio content (A) corresponding to the first head pose (P′), rendering a set of binaural representations (BINn) of the immersive audio content (A), wherein the binaural representations correspond to a second set of head poses (Pn), computing reconstruction metadata (M) to enable reconstruction of the set of binaural representations from the downmix representation (Dmx), the metadata (M) including the first head pose (P′), and encoding the downmix representation (Dmx) and the reconstruction metadata (M) in a third bitstream (b2).

    SPATIAL NOISE FILLING IN MULTI-CHANNEL CODEC

    Publication number: US20240105192A1

    Publication date: 2024-03-28

    Application number: US18255506

    Filing date: 2021-12-01

    CPC classification number: G10L19/03 G10L19/008 G10L21/0216

    Abstract: Embodiments are disclosed for spatial noise filling in multi-channel codecs. In an embodiment, a method of regenerating background noise ambience in a multi-channel codec by generating spatial hole filling noise comprises: computing noise estimates based on a primary downmix channel generated from an input audio signal representing a spatial audio scene with background noise ambience; computing spectral shaping filter coefficients based on the noise estimates; spectrally shaping the multi-channel noise signal using the spectral shaping filter coefficients and a noise distribution, the spectral shaping resulting in a diffused, multi-channel noise signal with uncorrelated channels; spatially shaping the diffused, uncorrelated multi-channel noise signal with uncorrelated channels based on a noise ambience of the spatial audio scene; and adding the spatially and spectrally shaped multi-channel noise to a multi-channel codec output to synthesize the background noise ambience of the spatial audio scene.

    MULTI-BAND DUCKING OF AUDIO SIGNALS
    Invention publication

    Publication number: US20240304196A1

    Publication date: 2024-09-12

    Application number: US18551134

    Filing date: 2022-04-01

    CPC classification number: G10L19/0204 G10L19/008

    Abstract: A method for multi-band ducking of audio signals is provided. In some implementations, the method involves receiving, at a decoder, an input audio signal, wherein the input audio signal is a downmixed audio signal. In some implementations, the method involves separating the input audio signal into a first set of frequency bands. In some implementations, the method involves determining a set of ducking gains, a ducking gain corresponding to a frequency band of the first set of frequency bands. In some implementations, the method involves generating a broadband decorrelated audio signal, wherein ducking gains of the set of ducking gains are applied to at least one of: 1) a second set of frequency bands prior to generating the at least one broadband decorrelated audio signal; or 2) a third set of frequency bands that separates the at least one broadband decorrelated audio signal.
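The per-band gain step can be illustrated with a minimal sketch. The abstract does not say how the ducking gains are determined, so the rule used here (attenuate a band whenever the decorrelated signal carries more energy than the input in that band) is an assumption made for illustration only. The function names `ducking_gains` and `apply_gains` are hypothetical, and band splitting and decorrelation are left out.

```python
import math


def ducking_gains(input_energy, decorr_energy, eps=1e-12):
    """Return one gain per band, capped at 1.0 (never amplifies).

    ASSUMPTION: the gain is the square root of the input-to-decorrelated
    energy ratio. The source does not specify this rule.
    """
    gains = []
    for e_in, e_dec in zip(input_energy, decorr_energy):
        gains.append(min(1.0, math.sqrt(e_in / (e_dec + eps))))
    return gains


def apply_gains(bands, gains):
    """Scale every sample of each band by that band's ducking gain."""
    return [[g * s for s in band] for band, g in zip(bands, gains)]


# Band 0: decorrelated energy is 4x the input energy, so it is ducked.
# Band 1: energies match, so the gain stays at (approximately) 1.0.
gains = ducking_gains([1.0, 0.25], [4.0, 0.25])
ducked = apply_gains([[2.0, -2.0], [1.0]], [0.5, 1.0])
```

Whether the gains are applied before or after the decorrelator corresponds to options 1) and 2) in the abstract; the sketch only shows the gain application itself.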

    BITRATE DISTRIBUTION IN IMMERSIVE VOICE AND AUDIO SERVICES

    Publication number: US20220406318A1

    Publication date: 2022-12-22

    Application number: US17772497

    Filing date: 2020-10-28

    Abstract: Embodiments are disclosed for bitrate distribution in immersive voice and audio services. In an embodiment, a method of encoding an IVAS bitstream comprises: receiving an input audio signal; downmixing the input audio signal into one or more downmix channels and spatial metadata; reading a set of one or more bitrates for the downmix channels and a set of quantization levels for the spatial metadata from a bitrate distribution control table; determining a combination of the one or more bitrates for the downmix channels; determining a metadata quantization level from the set of metadata quantization levels using a bitrate distribution process; quantizing and coding the spatial metadata using the metadata quantization level; generating, using the combination of one or more bitrates, a downmix bitstream for the one or more downmix channels; combining the downmix bitstream, the quantized and coded spatial metadata and the set of quantization levels into the IVAS bitstream.
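A table-driven allocation of this kind might look roughly like the sketch below. Every number in `BITRATE_TABLE`, the 20 ms frame length, the cost model in `metadata_bits`, and the rule "take the finest metadata level that fits the leftover budget" are hypothetical; the abstract names a bitrate distribution control table and a bitrate distribution process but does not give their contents. For simplicity each row holds a single fixed combination of downmix bitrates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRow:
    downmix_bps: tuple  # one bitrate per downmix channel, in bits per second
    md_levels: tuple    # metadata quantization levels, finest first


# HYPOTHETICAL control table keyed by total bitrate (bits per second).
BITRATE_TABLE = {
    32000: TableRow(downmix_bps=(24400,), md_levels=(3, 2, 1)),
    64000: TableRow(downmix_bps=(32000, 24400), md_levels=(4, 3, 2)),
}


def metadata_bits(level, n_params):
    """ASSUMED cost model: each parameter costs `level` bits per frame."""
    return level * n_params


def distribute(total_bps, n_params, frame_ms=20):
    """Pick downmix bitrates and the finest metadata level that fits."""
    row = BITRATE_TABLE[total_bps]
    frame_bits = total_bps * frame_ms // 1000
    downmix_bits = sum(bps * frame_ms // 1000 for bps in row.downmix_bps)
    budget = frame_bits - downmix_bits  # bits left for spatial metadata
    for level in row.md_levels:
        if metadata_bits(level, n_params) <= budget:
            return row.downmix_bps, level
    raise ValueError("spatial metadata does not fit at any listed level")
```

With these made-up numbers, a 32 kbps frame has 640 bits, of which 488 go to the downmix channel, leaving 152 bits for metadata. A frame with 48 parameters fits at level 3, while 60 parameters forces a fallback to level 2.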

    LOW-LATENCY, LOW-FREQUENCY EFFECTS CODEC

    Publication number: US20220293112A1

    Publication date: 2022-09-15

    Application number: US17635795

    Filing date: 2020-09-01

    Abstract: In some implementations, a method of encoding a low-frequency effects (LFE) channel comprises: receiving a time-domain LFE channel signal; filtering, using a low-pass filter, the time-domain LFE channel signal; converting the filtered time-domain LFE channel signal into a frequency-domain representation of the LFE channel signal that includes a number of coefficients representing a frequency spectrum of the LFE channel signal; arranging coefficients into a number of subband groups corresponding to different frequency bands of the LFE channel signal; quantizing coefficients in each subband group according to a frequency response curve of the low-pass filter; encoding the quantized coefficients in each subband group using an entropy coder tuned for the subband group; generating a bitstream including the encoded quantized coefficients; and storing the bitstream on a storage device or streaming the bitstream to a downstream device.
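The grouping and quantization steps can be sketched as follows. The abstract states only that quantization follows the frequency response curve of the low-pass filter; mapping that curve to coarser step sizes in the bands the filter attenuates more is an assumption here, and the group sizes, step sizes, and coefficient values are invented for illustration. Filtering, the time-to-frequency transform, and the per-group entropy coders are omitted.

```python
def group_coefficients(coeffs, group_sizes):
    """Split a flat list of spectral coefficients into subband groups."""
    groups, start = [], 0
    for size in group_sizes:
        groups.append(coeffs[start:start + size])
        start += size
    return groups


def quantize_groups(groups, steps):
    """Uniformly quantize each group with its own step size.

    ASSUMPTION: a larger step is used for groups where the low-pass
    filter attenuates more, since less precision is needed there.
    """
    return [[round(c / step) for c in group]
            for group, step in zip(groups, steps)]


# HYPOTHETICAL spectrum, ordered from lowest to highest frequency.
coeffs = [0.9, -0.42, 0.31, 0.12, -0.08, 0.02]
groups = group_coefficients(coeffs, group_sizes=(2, 2, 2))
indices = quantize_groups(groups, steps=(0.1, 0.2, 0.4))
```

Keeping the groups separate is what would allow each one to be passed to an entropy coder tuned to its own statistics, as the abstract describes.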

    QUANTIZATION AND ENTROPY CODING OF PARAMETERS FOR A LOW LATENCY AUDIO CODEC

    Publication number: US20230343346A1

    Publication date: 2023-10-26

    Application number: US18008445

    Filing date: 2021-06-10

    CPC classification number: G10L19/032 G10L19/008

    Abstract: Described is a method of frame-wise encoding metadata for an input signal, the metadata comprising a plurality of at least partially interrelated parameters calculable from the input signal. The method comprises, for each frame: iteratively performing, by using a looping process, steps of: determining a processing strategy from a plurality of processing strategies for calculating and quantizing the parameters; calculating and quantizing the parameters based on the determined processing strategy to obtain quantized parameters; and encoding the quantized parameters. In particular, each of the plurality of processing strategies comprises a respective first indication indicative of an ordering related to the calculation and quantization of individual parameters; and the processing strategy is determined based on at least one bitrate threshold.
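One way to picture the per-frame loop is the sketch below, which tries strategies in turn until the coded frame fits under a bit threshold. The strategy fields (`name`, `order`, `step`), the parameter names, and the `bit_cost` model standing in for an entropy coder are all hypothetical. The sketch also does not model the interdependence between parameters; it only records the order in which they are processed.

```python
def bit_cost(index):
    """ASSUMED stand-in for an entropy coder: larger magnitudes cost more."""
    return 2 * abs(index).bit_length() + 1


def encode_frame(params, strategies, max_bits):
    """Try each strategy in order; return the first that fits max_bits."""
    for strategy in strategies:
        quantized = {}
        # The strategy's ordering decides which parameter is handled first.
        for name in strategy["order"]:
            quantized[name] = round(params[name] / strategy["step"])
        bits = sum(bit_cost(q) for q in quantized.values())
        if bits <= max_bits:
            return strategy["name"], quantized, bits
    raise ValueError("no strategy meets the bitrate threshold")


# HYPOTHETICAL parameters and strategies, finest quantization first.
params = {"pred": 0.8, "cross": -0.3, "decorr": 0.5}
strategies = [
    {"name": "fine", "order": ("pred", "cross", "decorr"), "step": 0.05},
    {"name": "coarse", "order": ("pred", "decorr", "cross"), "step": 0.25},
]
name, quantized, bits = encode_frame(params, strategies, max_bits=20)
```

With these values the fine strategy needs 27 bits and is rejected, so the loop falls through to the coarse strategy, which needs 13 bits.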

    METHODS AND DEVICES FOR ENCODING AND/OR DECODING SPATIAL BACKGROUND NOISE WITHIN A MULTI-CHANNEL INPUT SIGNAL

    Publication number: US20230215445A1

    Publication date: 2023-07-06

    Application number: US18000862

    Filing date: 2021-06-10

    CPC classification number: G10L19/008 G10L25/78 G10L21/0224

    Abstract: The present document describes a method (600) for encoding a multi-channel input signal (101) which comprises N different channels. The method (600) comprises, for a current frame of a sequence of frames, determining (601) whether the current frame is an active frame or an inactive frame using a signal and/or a voice activity detector, and determining (602) a downmix signal (103) based on the multi-channel input signal (101), wherein the downmix signal (103) comprises N channels or less. In addition, the method (600) comprises determining (603) upmixing metadata (105) comprising a set of parameters for generating, based on the downmix signal (103), a reconstructed multi-channel signal (111) comprising N channels, wherein the upmixing metadata (105) is determined in dependence of whether the current frame is an active frame or an inactive frame. The method (600) further comprises encoding (604) the upmixing metadata (105) into a bitstream.
