Patent search ipc:"G10L19/035" Page 1

1.

Invention patent application
AUDIO ENCODER, AUDIO DECODER, METHODS FOR ENCODING AND DECODING AN AUDIO SIGNAL, AUDIO STREAM AND A COMPUTER PROGRAM (In force)

Publication (announcement) No.: US20240420714A1

Publication (announcement) date: 2024-12-19

Application No.: US18819733

Application date: 2024-08-29

Applicant: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.

Inventor: Nikolaus RETTELBACH , Bernhard GRILL , Guillaume FUCHS , Stefan GEYERSBERGER , Markus MULTRUS , Harald POPP , Juergen HERRE , Stefan WABNIK , Gerald SCHULLER , Jens HIRSCHFELD

IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032 , G10L25/18

Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error.
A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.
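The decoder-side noise filling described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes that noise is introduced only into spectral bins quantized to zero, that the noise is a random sign times the shared intensity value, and that each band's own gain is applied afterwards. The function name `fill_noise` and all parameter names are hypothetical.

```python
import random


def fill_noise(quantized, band_edges, band_gains, noise_level, seed=0):
    """Fill zero-quantized bins with noise, then apply each band's own gain.

    Assumption: a single noise_level is shared by all bands (standing in for
    the "common multi-band noise intensity value"), while band_gains holds
    one separate gain per band.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = []
    for (start, stop), gain in zip(band_edges, band_gains):
        for q in quantized[start:stop]:
            if q != 0:
                value = float(q)  # keep transmitted spectral values
            else:
                # Random sign times the shared intensity (illustrative choice)
                value = rng.choice((-1.0, 1.0)) * noise_level
            out.append(gain * value)
    return out


# Two bands of four bins each, with separate gains 2.0 and 0.5
spectrum = [3, 0, 0, -2, 0, 0, 0, 1]
decoded = fill_noise(spectrum, [(0, 4), (4, 8)], [2.0, 0.5], noise_level=0.25)
```

Because the noise is added before the band gain is applied, one transmitted intensity value yields a different absolute noise level in each band.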

2.

Granted invention patent
Self-supervised audio representation learning for mobile devices (In force)

Publication (announcement) No.: US12165663B2

Publication (announcement) date: 2024-12-10

Application No.: US17986477

Application date: 2022-11-14

Applicant: Google LLC

Inventor: Beat Gfeller , Dominik Roblek , Félix de Chaumont Quitry , Marco Tagliasacchi

IPC: G10L19/035 , G06N20/00 , G10L19/038 , G10L25/18

Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.

3.

Granted invention patent
Decoding audio bitstreams with enhanced spectral band replication metadata in at least one fill element (In force)

Publication (announcement) No.: US12094477B2

Publication (announcement) date: 2024-09-17

Application No.: US18318443

Application date: 2023-05-16

Applicant: DOLBY INTERNATIONAL AB

Inventor: Lars Villemoes , Heiko Purnhagen , Per Ekstrand

IPC: G10L19/16 , G10L19/035 , G10L19/24 , G10L21/038

CPC classification number: G10L19/167 , G10L19/035 , G10L19/24 , G10L21/038

Abstract: Embodiments relate to an audio processing unit that includes a buffer, bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided.

4.

Granted invention patent
Audio encoder, audio decoder, methods for encoding and decoding an audio signal, audio stream and a computer program (In force)

Publication (announcement) No.: US12080305B2

Publication (announcement) date: 2024-09-03

Application No.: US18522732

Application date: 2023-11-29

Applicant: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.

Inventor: Nikolaus Rettelbach , Bernhard Grill , Guillaume Fuchs , Stefan Geyersberger , Markus Multrus , Harald Popp , Juergen Herre , Stefan Wabnik , Gerald Schuller , Jens Hirschfeld

IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032 , G10L25/18

CPC classification number: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/0204 , G10L19/028 , G10L19/032 , G10L25/18

Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error.
A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.

5.

Granted invention patent
Multi-mode channel coding (In force)

Publication (announcement) No.: US12057133B2

Publication (announcement) date: 2024-08-06

Application No.: US18312853

Application date: 2023-05-05

Applicant: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventor: Jan Buethe , Conrad Benndorf , Manfred Lutzky , Markus Schnell , Maximilian Schlegel

IPC: H03M13/00 , G10L19/022 , G10L19/035 , G10L21/0324 , H03M13/07 , H03M13/15 , H04B17/309 , H04L1/00

CPC classification number: G10L19/035 , G10L19/022 , G10L21/0324 , H03M13/07 , H03M13/1515 , H04B17/309 , H04L1/0009 , H04L1/0032 , H04L1/0042 , H04L1/0045 , H04L1/0046 , H04L1/0084

Abstract: A channel encoder for encoding a frame includes a multi-mode redundancy encoder for redundancy encoding the frame in accordance with a certain coding mode from a set of different coding modes, wherein the coding modes differ from each other with respect to an amount of redundancy added to the frame, and wherein the multi-mode redundancy encoder is configured to output a coded frame including at least one code word; and a colorator for applying a coloration sequence to the at least one code word, wherein the coloration sequence is such that at least one bit of the code word is changed by its application, and wherein the specific coloration sequence is selected in accordance with the certain coding mode.
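The coloration step in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the abstract does not say how the sequence is applied, so a bitwise XOR is assumed here, and the table `COLORATION_SEQUENCES`, its values, and the function `colorate` are all hypothetical.

```python
# Hypothetical mode-to-sequence table; the byte values are illustrative only.
# Each sequence is non-zero, so applying it changes at least one bit.
COLORATION_SEQUENCES = {
    1: bytes([0x0F, 0x00, 0x00, 0x00]),
    2: bytes([0x00, 0xF0, 0x00, 0x00]),
    3: bytes([0x00, 0x00, 0x3C, 0x00]),
}


def colorate(code_word: bytes, mode: int) -> bytes:
    """Apply the coloration sequence selected by the coding mode.

    Assumption: the sequence is applied by bitwise XOR.
    """
    sequence = COLORATION_SEQUENCES[mode]
    if len(sequence) != len(code_word):
        raise ValueError("sequence and code word must have equal length")
    return bytes(c ^ s for c, s in zip(code_word, sequence))


code_word = bytes([0xA5, 0x5A, 0xFF, 0x00])
colored = colorate(code_word, mode=2)
restored = colorate(colored, mode=2)  # XOR with the same sequence undoes it
```

Under the XOR assumption the operation is its own inverse, so a receiver that applies the sequence of the matching mode recovers the original code word, while the sequence of any other mode leaves bit differences.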

6.

Granted invention patent
Audio transmitter processor, audio receiver processor and related methods and computer programs (In force)

Publication (announcement) No.: US12009002B2

Publication (announcement) date: 2024-06-11

Application No.: US17400422

Application date: 2021-08-12

Applicant: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventor: Adrian Tomasek , Ralph Sperschneider , Jan Büthe , Alexander Tschekalinskij , Manfred Lutzky

IPC: G10L19/00 , G10L19/022 , G10L19/035 , G10L21/0324 , H03M13/07 , H03M13/15 , H04B17/309 , H04L1/00

CPC classification number: G10L19/035 , G10L19/022 , G10L21/0324 , H03M13/07 , H03M13/1515 , H04B17/309 , H04L1/0009 , H04L1/0032 , H04L1/0042 , H04L1/0045 , H04L1/0046 , H04L1/0084

Abstract: An audio transmitter processor for generating an error protected frame using encoded audio data of an audio frame, the encoded audio data for the audio frame having a first amount of information units and a second amount of information units, has: a frame builder for building a codeword frame having a codeword raster, wherein the frame builder is configured to determine a border between a first amount of information units and a second amount of information units so that a starting information unit of the second amount of information units coincides with a codeword border; and an error protection coder to obtain a plurality of processed codewords representing the error protected frame.

7.

Granted invention patent
Low-power automatic speech recognition device (In force)

Publication (announcement) No.: US11961513B2

Publication (announcement) date: 2024-04-16

Application No.: US17388845

Application date: 2021-07-29

Applicant: Massachusetts Institute of Technology

Inventor: Michael R. Price , James R. Glass , Anantha P. Chandrakasan

IPC: G10L15/16 , G06F1/3228 , G06N3/063 , G10L15/06 , G10L15/14 , G10L15/28 , G10L19/035 , G10L25/90

CPC classification number: G10L15/16 , G06F1/3228 , G06N3/063 , G10L15/063 , G10L15/14 , G10L15/142 , G10L15/285 , G10L19/035 , G10L25/90 , G10L2015/0633

Abstract: A decoder includes a feature extraction circuit for calculating one or more feature vectors. An acoustic model circuit is coupled to receive the one or more feature vectors from the feature extraction circuit and to assign one or more likelihood values to the one or more feature vectors. A memory architecture that utilizes on-chip state lattices and an off-chip memory for storing states of transition of the decoder is used to reduce reading and writing to the off-chip memory. The on-chip state lattice is populated with at least one of the states of transition stored in the off-chip memory. An on-chip word lattice is generated from a snapshot of the on-chip state lattice. The on-chip state lattice and the on-chip word lattice act as an on-chip cache to reduce reading and writing to the off-chip memory.

8.

Published invention application
AUDIO ENCODER, AUDIO DECODER, METHODS FOR ENCODING AND DECODING AN AUDIO SIGNAL, AUDIO STREAM AND A COMPUTER PROGRAM (Under examination, published)

Publication (announcement) No.: US20240096337A1

Publication (announcement) date: 2024-03-21

Application No.: US18522732

Application date: 2023-11-29

Applicant: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.

Inventor: Nikolaus RETTELBACH , Bernhard GRILL , Guillaume FUCHS , Stefan GEYERSBERGER , Markus MULTRUS , Harald POPP , Juergen HERRE , Stefan WABNIK , Gerald SCHULLER , Jens HIRSCHFELD

IPC: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/028 , G10L19/032

CPC classification number: G10L19/035 , G10L19/008 , G10L19/02 , G10L19/0204 , G10L19/028 , G10L19/032 , G10L25/18

Abstract: An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error.
A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.

9.

Granted invention patent
Information processing device and method, and program (In force)

Publication (announcement) No.: US11790925B2

Publication (announcement) date: 2023-10-17

Application No.: US17255191

Application date: 2019-06-20

Applicant: Sony Corporation

Inventor: Mitsuyuki Hatanaka , Toru Chinen , Minoru Tsuji , Hiroyuki Honma , Yuki Yamamoto

IPC: G10L19/035 , G10L21/00

CPC classification number: G10L19/035 , G10L21/00

Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount.
The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
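The calculation unit's mapping from child-space coordinates to parent-space coordinates can be sketched as below. The abstract does not give the formula, so this assumes an axis-aligned child space described by an origin and a size in parent coordinates, with the object's position inside the child space normalized to the range 0 to 1 on each axis. The names `ChildSpace` and `to_parent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildSpace:
    """Space information: where the child space sits inside the parent."""
    origin: tuple  # (x, y, z) of the child space corner, in parent coordinates
    size: tuple    # (width, height, depth) of the child space, in parent units


def to_parent(space, local):
    """Map a normalized child-space position to parent-space coordinates.

    Assumption: each component of `local` lies in [0, 1] and scales linearly
    with the child space size.
    """
    return tuple(o + p * s for o, p, s in zip(space.origin, local, space.size))


space = ChildSpace(origin=(2.0, 0.0, -1.0), size=(4.0, 2.0, 2.0))
position = to_parent(space, (0.5, 0.25, 1.0))  # -> (4.0, 0.5, 1.0)
```

Under this assumption, quantizing the normalized position over a small child space rather than over the whole parent space gives finer resolution for the same number of bits, which is one plausible reading of how the code amount is reduced.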

10.

Published invention application
SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND PROGRAM (Under examination, published)

Publication (announcement) No.: US20230253000A1

Publication (announcement) date: 2023-08-10

Application No.: US18013217

Application date: 2021-06-25

Applicant: Sony Group Corporation

Inventor: Akifumi Kono , Toru Chinen , Hiroyuki Honma , Minoru Tsuji , Yoshiaki Oikawa

IPC: G10L19/035 , G10L19/008

CPC classification number: G10L19/035 , G10L19/008

Abstract: The present technology relates to a signal processing device, a signal processing method, and a program which are capable of improving encoding efficiency.
The signal processing device includes a correction unit configured to correct an audio signal of an audio object based on a gain value included in metadata of the audio object, and a quantization unit configured to calculate auditory psychological parameters based on a signal obtained by the correction and to quantize the audio signal. The present technology can be applied to an encoding device.
