-
Publication No.: US20240420714A1
Publication date: 2024-12-19
Application No.: US18819733
Application date: 2024-08-29
Inventor: Nikolaus RETTELBACH , Bernhard GRILL , Guillaume FUCHS , Stefan GEYERSBERGER , Markus MULTRUS , Harald POPP , Juergen HERRE , Stefan WABNIK , Gerald SCHULLER , Jens HIRSCHFELD
IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032 , G10L25/18
Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error.
A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
-
Publication No.: US12165663B2
Publication date: 2024-12-10
Application No.: US17986477
Application date: 2022-11-14
Applicant: Google LLC
Inventor: Beat Gfeller , Dominik Roblek , Félix de Chaumont Quitry , Marco Tagliasacchi
IPC: G10L19/035 , G06N20/00 , G10L19/038 , G10L25/18
Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
-
Publication No.: US12094477B2
Publication date: 2024-09-17
Application No.: US18318443
Application date: 2023-05-16
Applicant: DOLBY INTERNATIONAL AB
Inventor: Lars Villemoes , Heiko Purnhagen , Per Ekstrand
IPC: G10L19/16 , G10L19/035 , G10L19/24 , G10L21/038
CPC classification number: G10L19/167 , G10L19/035 , G10L19/24 , G10L21/038
Abstract: Embodiments relate to an audio processing unit that includes a buffer, bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided.
-
Publication No.: US12080305B2
Publication date: 2024-09-03
Application No.: US18522732
Application date: 2023-11-29
Inventor: Nikolaus Rettelbach , Bernhard Grill , Guillaume Fuchs , Stefan Geyersberger , Markus Multrus , Harald Popp , Juergen Herre , Stefan Wabnik , Gerald Schuller , Jens Hirschfeld
IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032 , G10L25/18
CPC classification number: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/0204 , G10L19/028 , G10L19/032 , G10L25/18
Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error. A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
-
Publication No.: US12057133B2
Publication date: 2024-08-06
Application No.: US18312853
Application date: 2023-05-05
Inventor: Jan Buethe , Conrad Benndorf , Manfred Lutzky , Markus Schnell , Maximilian Schlegel
IPC: H03M13/00 , G10L19/022 , G10L19/035 , G10L21/0324 , H03M13/07 , H03M13/15 , H04B17/309 , H04L1/00
CPC classification number: G10L19/035 , G10L19/022 , G10L21/0324 , H03M13/07 , H03M13/1515 , H04B17/309 , H04L1/0009 , H04L1/0032 , H04L1/0042 , H04L1/0045 , H04L1/0046 , H04L1/0084
Abstract: A channel encoder for encoding a frame includes a multi-mode redundancy encoder for redundancy encoding the frame in accordance with a certain coding mode from a set of different coding modes, wherein the coding modes are different from each other with respect to an amount of redundancy added to the frame, wherein the multi-mode redundancy encoder is configured to output a coded frame including at least one code word; and a colorator for applying a coloration sequence to the at least one code word; wherein the coloration sequence is such that at least one bit of the code word is changed by the application of the coloration sequence, wherein the specific coloration sequence is selected in accordance with the certain coding mode.
-
Publication No.: US12009002B2
Publication date: 2024-06-11
Application No.: US17400422
Application date: 2021-08-12
Inventor: Adrian Tomasek , Ralph Sperschneider , Jan Büthe , Alexander Tschekalinskij , Manfred Lutzky
IPC: G10L19/00 , G10L19/022 , G10L19/035 , G10L21/0324 , H03M13/07 , H03M13/15 , H04B17/309 , H04L1/00
CPC classification number: G10L19/035 , G10L19/022 , G10L21/0324 , H03M13/07 , H03M13/1515 , H04B17/309 , H04L1/0009 , H04L1/0032 , H04L1/0042 , H04L1/0045 , H04L1/0046 , H04L1/0084
Abstract: An audio transmitter processor for generating an error protected frame using encoded audio data of an audio frame, the encoded audio data for the audio frame having a first amount of information units and a second amount of information units, has: a frame builder for building a codeword frame having a codeword raster, wherein the frame builder is configured to determine a border between a first amount of information units and a second amount of information units so that a starting information unit of the second amount of information units coincides with a codeword border; and an error protection coder to obtain a plurality of processed codewords representing the error protected frame.
-
Publication No.: US11961513B2
Publication date: 2024-04-16
Application No.: US17388845
Application date: 2021-07-29
Applicant: Massachusetts Institute of Technology
Inventor: Michael R. Price , James R. Glass , Anantha P. Chandrakasan
IPC: G10L15/16 , G06F1/3228 , G06N3/063 , G10L15/06 , G10L15/14 , G10L15/28 , G10L19/035 , G10L25/90
CPC classification number: G10L15/16 , G06F1/3228 , G06N3/063 , G10L15/063 , G10L15/14 , G10L15/142 , G10L15/285 , G10L19/035 , G10L25/90 , G10L2015/0633
Abstract: A decoder includes a feature extraction circuit for calculating one or more feature vectors. An acoustic model circuit is coupled to receive one or more feature vectors from the feature extraction circuit and assign one or more likelihood values to the one or more feature vectors. A memory architecture that utilizes on-chip state lattices and an off-chip memory for storing states of transition of the decoder is used to reduce reading and writing to the off-chip memory. The on-chip state lattice is populated with at least one of the states of transition stored in the off-chip memory. An on-chip word lattice is generated from a snapshot from the on-chip state lattice. The on-chip state lattice and the on-chip word lattice act as an on-chip cache to reduce reading and writing to the off-chip memory.
-
Publication No.: US20240096337A1
Publication date: 2024-03-21
Application No.: US18522732
Application date: 2023-11-29
Inventor: Nikolaus RETTELBACH , Bernhard GRILL , Guillaume FUCHS , Stefan GEYERSBERGER , Markus MULTRUS , Harald POPP , Juergen HERRE , Stefan WABNIK , Gerald SCHULLER , Jens HIRSCHFELD
IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032
CPC classification number: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/0204 , G10L19/028 , G10L19/032 , G10L25/18
Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error.
A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
-
Publication No.: US11790925B2
Publication date: 2023-10-17
Application No.: US17255191
Application date: 2019-06-20
Applicant: Sony Corporation
Inventor: Mitsuyuki Hatanaka , Toru Chinen , Minoru Tsuji , Hiroyuki Honma , Yuki Yamamoto
IPC: G10L19/035 , G10L21/00
CPC classification number: G10L19/035 , G10L21/00
Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount.
The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
-
Publication No.: US20230253000A1
Publication date: 2023-08-10
Application No.: US18013217
Application date: 2021-06-25
Applicant: Sony Group Corporation
Inventor: Akifumi Kono , Toru Chinen , Hiroyuki Honma , Minoru Tsuji , Yoshiaki Oikawa
IPC: G10L19/035 , G10L19/008
CPC classification number: G10L19/035 , G10L19/008
Abstract: The present technology relates to a signal processing device, a signal processing method, and a program which are capable of improving encoding efficiency.
The signal processing device includes a correction unit configured to correct an audio signal of an audio object based on a gain value included in metadata of the audio object, and a quantization unit configured to calculate auditory psychological parameters based on a signal obtained by the correction and to quantize the audio signal. The present technology can be applied to an encoding device.