-
Publication No.: US10861475B2
Publication date: 2020-12-08
Application No.: US15775000
Filing date: 2016-10-27
Applicant: DOLBY INTERNATIONAL AB
Inventor: Arijit Biswas
IPC: G10L19/20 , G10L19/26 , G10L21/034 , H03G7/00 , G10L25/51
Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A method of processing an audio signal comprises receiving an audio signal, classifying the audio signal as one of pure sinusoidal, hybrid, or pure transient signal using two defined threshold values, and selectively applying a companding operation by switching between a companding off mode, a companding on mode, and an average companding mode, comprising selecting between the companding on mode and the average companding mode for a classified hybrid signal using a companding rule that uses a temporal sharpness measure in a frequency domain.
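The three-way mode switch described in this abstract can be illustrated with a minimal sketch. Everything beyond the abstract is an assumption made for illustration: that the classification itself is driven by a single temporal sharpness value, that pure sinusoidal signals map to companding off and pure transient signals to companding on, and the names `select_mode`, `low_thr`, `high_thr`, and `hybrid_thr`.

```python
from enum import Enum


class CompandingMode(Enum):
    OFF = "off"
    ON = "on"
    AVERAGE = "average"


def select_mode(sharpness, low_thr, high_thr, hybrid_thr):
    """Choose a companding mode from one temporal sharpness value.

    Hypothetical rule, not taken from the patent text:
      sharpness < low_thr   -> treated as pure sinusoidal -> OFF
      sharpness > high_thr  -> treated as pure transient  -> ON
      otherwise (hybrid)    -> ON if sharpness >= hybrid_thr, else AVERAGE
    """
    if low_thr > high_thr:
        raise ValueError("low_thr must not exceed high_thr")
    if sharpness < low_thr:
        return CompandingMode.OFF
    if sharpness > high_thr:
        return CompandingMode.ON
    # Hybrid signal: the companding rule decides between ON and AVERAGE.
    if sharpness >= hybrid_thr:
        return CompandingMode.ON
    return CompandingMode.AVERAGE
```

The two classification thresholds bound the hybrid region, and a separate rule inside that region picks between the on and average modes, which mirrors the structure of the claim without reproducing its actual measure or threshold values.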
-
Publication No.: US10424305B2
Publication date: 2019-09-24
Application No.: US15533625
Filing date: 2015-12-08
Applicant: DOLBY INTERNATIONAL AB
Inventor: Arijit Biswas , Tobias Friedrich , Klaus Peichl
IPC: G10L19/005 , G10L19/02
Abstract: An error-concealing audio decoding method comprises: receiving a packet comprising a set of MDCT coefficients encoding a frame of time-domain samples of an audio signal; identifying the received packet as erroneous; generating estimated MDCT coefficients to replace the set of MDCT coefficients of the erroneous packet, based on corresponding MDCT coefficients associated with a received packet directly preceding the erroneous packet; assigning signs of a first subset of MDCT coefficients of the estimated MDCT coefficients, wherein the first subset comprises such MDCT coefficients that are associated with tonal-like spectral bins, to coincide with signs of corresponding MDCT coefficients of said preceding packet; randomly assigning signs of a second subset of MDCT coefficients of the estimated MDCT coefficients, wherein the second subset comprises MDCT coefficients associated with noise-like spectral bins; replacing the erroneous packet by a concealment packet containing the estimated MDCT coefficients and the signs assigned.
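The sign-assignment step in this abstract can be sketched as follows. This is a simplified illustration, not the patented method: it assumes the estimated magnitudes are simply copied from the preceding packet, and it takes the tonal/noise classification as an input (`tonal_bins`) rather than computing it. The function name `conceal_mdct` is hypothetical.

```python
import random


def conceal_mdct(prev_coeffs, tonal_bins, rng=None):
    """Build replacement MDCT coefficients for an erroneous packet.

    prev_coeffs: MDCT coefficients of the directly preceding good packet.
    tonal_bins:  set of bin indices classified as tonal-like (assumed given).
    rng:         optional random.Random instance, for reproducible output.
    """
    if rng is None:
        rng = random.Random()
    estimated = []
    for k, coeff in enumerate(prev_coeffs):
        magnitude = abs(coeff)  # assumption: reuse the previous magnitude as-is
        if k in tonal_bins:
            # Tonal-like bin: keep the sign of the preceding packet.
            sign = -1.0 if coeff < 0 else 1.0
        else:
            # Noise-like bin: assign the sign at random.
            sign = rng.choice((-1.0, 1.0))
        estimated.append(sign * magnitude)
    return estimated
```

Keeping the signs of tonal bins preserves the continuity of stationary components across the gap, while randomized signs in noise-like bins avoid repeating the previous frame verbatim.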
-
Publication No.: US20190057694A1
Publication date: 2019-02-21
Application No.: US15998796
Filing date: 2018-08-16
Applicant: Dolby International AB
Inventor: Arijit Biswas
IPC: G10L15/22 , G10L25/78 , G10L15/24 , G10L21/0364 , G06F3/01
Abstract: The present disclosure relates to methods for processing a decoded audio signal and for selectively applying speech/dialog enhancement to the decoded audio signal. The present disclosure also relates to a method of operating a headset for computer-mediated reality. A method of processing a decoded audio signal comprises obtaining a measure of a cognitive load of a listener that listens to a rendering of the audio signal, determining whether speech/dialog enhancement shall be applied based on the obtained measure of the cognitive load, and performing speech/dialog enhancement based on the determination. A method of operating a headset for computer-mediated reality comprises obtaining eye-tracking data of a wearer of the headset, determining a measure of a cognitive load of the wearer of the headset based on the eye-tracking data, and outputting an indication of the cognitive load of the wearer of the headset. The present disclosure further relates to corresponding apparatus and systems, and to methods of operating such apparatus and systems.
-
Publication No.: US09852722B2
Publication date: 2017-12-26
Application No.: US15118044
Filing date: 2015-02-18
Applicant: DOLBY INTERNATIONAL AB
Inventor: Arijit Biswas
IPC: G10H1/40 , G10H7/00 , G10H1/00 , G10L19/008 , G10L19/16 , G10L25/03 , G10L19/022
CPC classification number: G10H1/0008 , G10H1/40 , G10H2210/076 , G10L19/008 , G10L19/022 , G10L19/167 , G10L25/03
Abstract: The invention relates to estimating tempo information directly from a bitstream encoding audio information, preferably music. Said tempo information is derived from at least one periodicity derived from a detection of at least two onsets included in the audio information. Such onsets are detected via a detection of long-to-short block transitions (in the bitstream) and/or via a detection of a changing bit allocation (change of cost) regarding encoding/transmitting the exponents of transform coefficients encoded in the bitstream.
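A rough sketch of the onset-to-tempo idea, covering only the long-to-short block transition cue. The per-frame block-type labels, the fixed frame duration, and the use of the median inter-onset interval as the periodicity are all assumptions chosen for illustration; the function names are hypothetical.

```python
from statistics import median


def onsets_from_block_types(block_types):
    """Return frame indices where a long block is followed by a short block."""
    return [
        i
        for i in range(1, len(block_types))
        if block_types[i - 1] == "long" and block_types[i] == "short"
    ]


def tempo_bpm(onset_frames, frame_seconds):
    """Estimate tempo from the median interval between consecutive onsets."""
    if len(onset_frames) < 2:
        raise ValueError("at least two onsets are needed to derive a periodicity")
    intervals = [b - a for a, b in zip(onset_frames, onset_frames[1:])]
    period_seconds = median(intervals) * frame_seconds
    return 60.0 / period_seconds


# Example: a short block every 25 frames, with frames assumed to last 20 ms.
block_types = (["short"] + ["long"] * 24) * 3
onsets = onsets_from_block_types(block_types)
estimated_tempo = tempo_bpm(onsets, frame_seconds=0.02)
```

Because the block-type flags are read from the bitstream, a scheme like this needs no decoding of the audio itself, which is the point the abstract makes.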
-
Publication No.: US09697840B2
Publication date: 2017-07-04
Application No.: US14359697
Filing date: 2012-11-28
Applicant: DOLBY INTERNATIONAL AB
Inventor: Arijit Biswas , Marco Fink , Michael Schug
IPC: G10L19/02 , G10L19/038 , G10L25/54 , G10H1/00 , G10H1/38 , G10L19/022 , G10L21/0388
CPC classification number: G10L19/02 , G10H1/0008 , G10H1/383 , G10H2210/066 , G10H2250/225 , G10L19/022 , G10L19/038 , G10L21/0388 , G10L25/54
Abstract: The present document relates to methods and systems for music information retrieval (MIR). In particular, the present document relates to methods and systems for extracting a chroma vector from an audio signal. A method (900) for determining a chroma vector (100) for a block of samples of an audio signal (301) is described. The method (900) comprises receiving (901) a corresponding block of frequency coefficients derived from the block of samples of the audio signal (301) from a core encoder (412) of a spectral band replication based audio encoder (410) adapted to generate an encoded bitstream (305) of the audio signal (301) from the block of frequency coefficients; and determining (904) the chroma vector (100) for the block of samples of the audio signal (301) based on the received block of frequency coefficients.
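To make the chroma step concrete, here is a minimal sketch that folds a block of frequency coefficients into 12 pitch classes. It assumes MDCT-style bin centres at `(k + 0.5) * sample_rate / (2 * N)`, energy accumulation per pitch class, and an analysis range of 55 Hz to 5 kHz; none of these details come from the abstract, and `chroma_from_coeffs` is a hypothetical name.

```python
import math


def chroma_from_coeffs(coeffs, sample_rate, f_min=55.0, f_max=5000.0):
    """Fold frequency coefficients into a normalized 12-bin chroma vector.

    Index 0 corresponds to pitch class C, index 9 to A (440 Hz reference).
    sample_rate is the rate at which the coefficients were computed.
    """
    n = len(coeffs)
    chroma = [0.0] * 12
    for k, c in enumerate(coeffs):
        freq = (k + 0.5) * sample_rate / (2.0 * n)  # assumed bin centre
        if not f_min <= freq <= f_max:
            continue
        # MIDI note number of the bin centre; 69 is A4 = 440 Hz.
        midi = 69.0 + 12.0 * math.log2(freq / 440.0)
        chroma[int(round(midi)) % 12] += c * c  # accumulate energy
    total = sum(chroma)
    if total > 0.0:
        chroma = [v / total for v in chroma]
    return chroma
```

In the setting the abstract describes, `coeffs` would be the block the core encoder already computes, so the chroma vector comes at little extra cost on top of encoding.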
-
Publication No.: US20250069616A1
Publication date: 2025-02-27
Application No.: US18948274
Filing date: 2024-11-14
Inventor: Per Hedelin , Arijit Biswas , Michael Schug , Vinay Melkote
IPC: G10L21/0232 , G10L19/008 , G10L19/02 , G10L19/032 , G10L21/034 , G10L25/18 , G10L25/45 , H03G7/00 , H04B1/66
Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy-based average of frequency-domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range by an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency-domain representation.
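The compress/expand round trip can be sketched per time slot of QMF-domain samples. This is an illustrative model only: the mean absolute value stands in for the "non-energy based average", the exponent `gamma = 0.65` is a made-up value, unit level is assumed as the pivot between "low" and "high" intensity, and the expander is assumed to derive its gain from the compressed signal itself. Windowing and the QMF analysis are omitted.

```python
def mean_abs(slot):
    """Non-energy (1-norm) average of the complex samples in one time slot."""
    return sum(abs(x) for x in slot) / len(slot)


def compress(slots, gamma=0.65, eps=1e-12):
    """Apply one wideband gain per slot: quiet slots up, loud slots down."""
    compressed, gains = [], []
    for slot in slots:
        gain = (mean_abs(slot) + eps) ** (gamma - 1.0)
        gains.append(gain)
        compressed.append([gain * x for x in slot])
    return compressed, gains


def expand(slots, gamma=0.65, eps=1e-12):
    """Invert compress() using only the compressed slots.

    A compressed slot has level L**gamma, so the inverse gain L**(1 - gamma)
    equals (compressed level) ** ((1 - gamma) / gamma).
    """
    expanded = []
    for slot in slots:
        gain = (mean_abs(slot) + eps) ** ((1.0 - gamma) / gamma)
        expanded.append([gain * x for x in slot])
    return expanded


# Example: one quiet slot and one loud slot, four subband samples each.
slots = [[0.01 + 0j] * 4, [4.0 + 0j] * 4]
compressed, gains = compress(slots)
restored = expand(compressed)
```

With `gamma` below 1, the gain exceeds 1 for slots whose average level is below unity and falls below 1 for louder slots, and the expander restores the original levels up to rounding.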
-
Publication No.: US11929085B2
Publication date: 2024-03-12
Application No.: US17270053
Filing date: 2019-08-29
Inventor: Arijit Biswas , Jia Dai , Aaron Steven Master
IPC: G10L19/24
CPC classification number: G10L19/24
Abstract: Described herein is a method of low-bitrate coding of audio data and generating enhancement metadata for controlling audio enhancement of the low-bitrate coded audio data at a decoder side, including the steps of: (a) core encoding original audio data at a low bitrate to obtain encoded audio data; (b) generating enhancement metadata to be used for controlling a type and/or amount of audio enhancement at the decoder side after core decoding the encoded audio data; and (c) outputting the encoded audio data and the enhancement metadata. Described is further an encoder configured to perform said method. Described is moreover a method for generating enhanced audio data from low-bitrate coded audio data based on enhancement metadata and a decoder configured to perform said method.
-
Publication No.: US20220392458A1
Publication date: 2022-12-08
Application No.: US17770035
Filing date: 2020-10-16
Inventor: Janusz Klejsa , Arijit Biswas , Lars Villemoes , Roy M. Fejgin , Cong Zhou
IPC: G10L19/00
Abstract: Described herein is a method of waveform decoding, the method including the steps of: (a) receiving, by a waveform decoder, a bitstream including a finite bitrate representation of a source signal; (b) waveform decoding the finite bitrate representation of the source signal to obtain a waveform approximation of the source signal; (c) providing the waveform approximation of the source signal to a generative model that implements a probability density function, to obtain a probability distribution for a reconstructed signal of the source signal; and (d) generating the reconstructed signal of the source signal based on the probability distribution. Described are further a method and system for waveform coding and a method of training a generative model.
-
Publication No.: US11423923B2
Publication date: 2022-08-23
Application No.: US16892180
Filing date: 2020-06-03
Inventor: Per Hedelin , Arijit Biswas , Michael Schug , Vinay Melkote
IPC: G10L19/00 , G10L21/00 , G10L21/0232 , H04B1/66 , G10L25/18 , G10L21/034 , G10L19/008 , G10L19/02 , G10L19/032 , H03G7/00 , G10L25/45
Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy-based average of frequency-domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range by an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency-domain representation.
-
Publication No.: US10679639B2
Publication date: 2020-06-09
Application No.: US16457726
Filing date: 2019-06-28
Inventor: Per Hedelin , Arijit Biswas , Michael Schug , Vinay Melkote
IPC: G10L19/00 , G10L21/0232 , H04B1/66 , G10L25/18 , G10L21/034 , G10L19/008 , G10L19/02 , G10L19/032 , H03G7/00 , G10L25/45
Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy-based average of frequency-domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range by an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency-domain representation.