Patent search: ipc:"G10L19/04", page 1

1.

Published application
TRUNCATEABLE PREDICTIVE CODING [Pending, published]

Publication No.: US20240347066A1

Publication date: 2024-10-17

Application No.: US18643227

Filing date: 2024-04-23

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Erik NORVELL, Fredrik JANSSON

IPC classification: G10L19/012, G10L19/00, G10L19/008, G10L19/032, G10L19/04, G10L19/06, H04W76/28

CPC classification: G10L19/012, G10L19/0017, G10L19/008, G10L19/032, G10L19/04, G10L19/06, H04W76/28

Abstract: A method, system, and computer program to encode and decode a channel coherence parameter applied on a frequency band basis, where the coherence parameters of each frequency band form a coherence vector. The coherence vector is encoded and decoded using a predictive scheme followed by a variable bit rate entropy coding.
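The "predictive scheme followed by a variable bit rate entropy coding" named in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the patent: the previous-band predictor, the 16-level uniform quantizer, the zigzag mapping, the unary code standing in for the entropy coder, and the function names encode_coherence and decode_coherence.

```python
def encode_coherence(coh, levels=16):
    """Encode a per-band coherence vector (values in [0, 1]) to a bit string.

    Assumed scheme: quantize each band, predict it from the previously
    quantized band, and unary-code the prediction residual.
    """
    bits = []
    prev = 0  # predictor state: quantized value of the previous band
    for c in coh:
        q = min(levels - 1, max(0, round(c * (levels - 1))))
        r = q - prev
        # Zigzag mapping: signed residual -> non-negative integer.
        u = 2 * r if r >= 0 else -2 * r - 1
        # Unary code: small residuals cost few bits, so the rate is variable.
        bits.append("1" * u + "0")
        prev = q
    return "".join(bits)


def decode_coherence(bitstream, n_bands, levels=16):
    """Invert encode_coherence, returning the quantized coherence values."""
    out = []
    prev = 0
    pos = 0
    for _ in range(n_bands):
        u = 0
        while bitstream[pos] == "1":
            u += 1
            pos += 1
        pos += 1  # skip the terminating "0"
        r = u // 2 if u % 2 == 0 else -((u + 1) // 2)
        q = prev + r
        out.append(q / (levels - 1))
        prev = q
    return out
```

In this sketch a smooth coherence vector yields small residuals and therefore a shorter bit string than a vector that jumps between bands, which is the property a predictive stage is meant to give the entropy coder.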

2.

Published application
APPARATUS AND METHOD FOR ENCODING AN AUDIO SIGNAL USING A COMPENSATION VALUE [Pending, published]

Publication No.: US20240221765A1

Publication date: 2024-07-04

Application No.: US18604374

Filing date: 2024-03-13

Applicant: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Sascha DISCH, Franz REUTELHUBER, Jan BÜTHE, Markus MULTRUS, Bernd EDLER

IPC classification: G10L19/02, G10L19/008, G10L19/04, G10L19/16, G10L21/0232, G10L21/038, H04L65/70

CPC classification: G10L19/02, G10L21/0232, G10L21/038, H04L65/70, G10L19/008, G10L19/04, G10L19/16

Abstract: An apparatus for encoding an audio signal includes: a core encoder for core encoding first audio data in a first spectral band; a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder includes: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value.

3.

Granted patent
Audio encoder for encoding an audio signal, method for encoding an audio signal and computer program under consideration of a detected peak spectral region in an upper frequency band [In force]

Publication No.: US12014747B2

Publication date: 2024-06-18

Application No.: US18308293

Filing date: 2023-04-27

Applicant: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Markus Multrus, Christian Neukam, Markus Schnell, Benjamin Schubert

IPC classification: G10L19/26, G10L19/02, G10L19/028, G10L19/03, G10L19/032, G10L19/04, G10L19/12, G10L19/16, G10L21/007, G10L21/02, G10L21/0208, G10L21/0324, G10L21/038, G10L25/15, G10L25/18

CPC classification: G10L19/265, G10L19/0204, G10L19/03, G10L19/032, G10L19/12, G10L19/16, G10L19/26, G10L21/007, G10L21/02, G10L21/0208, G10L21/0324, G10L25/15, G10L25/18, G10L19/02, G10L19/028, G10L19/04, G10L21/038

Abstract: An audio encoder for encoding an audio signal having a lower frequency band and an upper frequency band includes: a detector for detecting a peak spectral region in the upper frequency band of the audio signal; a shaper for shaping the lower frequency band using shaping information for the lower band and for shaping the upper frequency band using at least a portion of the shaping information for the lower band, wherein the shaper is configured to additionally attenuate spectral values in the detected peak spectral region in the upper frequency band; and a quantizer and coder stage for quantizing a shaped lower frequency band and a shaped upper frequency band and for entropy coding quantized spectral values from the shaped lower frequency band and the shaped upper frequency band.

4.

Granted patent
Trained generative model speech coding [In force]

Publication No.: US11978464B2

Publication date: 2024-05-07

Application No.: US17757122

Filing date: 2021-01-22

Applicant: GOOGLE LLC

Inventors: Willem Bastiaan Kleijn, Andrew Storus

IPC classification: G10L19/00, G10L19/038, G10L19/04, G10L21/02, G06N3/02

CPC classification: G10L19/038, G10L19/04, G10L21/02, G06N3/02, G10L19/00

Abstract: A method includes receiving sampled audio data corresponding to utterances and training a machine learning (ML) model, using the sampled audio data, to generate a high-fidelity audio stream from a low bitrate input bitstream. The training of the ML model includes de-emphasizing the influence of low-probability distortion events in the sampled audio data on the trained ML model, where the de-emphasizing of the distortion events is achieved by the inclusion of a term in an objective function of the ML model, which term encourages low-variance predictive distributions of a next sample in the sampled audio data, based on previous samples of the audio data.

5.

Granted patent
Truncateable predictive coding [In force]

Publication No.: US11978460B2

Publication date: 2024-05-07

Application No.: US17817251

Filing date: 2022-08-03

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Erik Norvell, Fredrik Jansson

IPC classification: G10L19/012, G10L19/00, G10L19/008, G10L19/032, G10L19/04, G10L19/06, H04W76/28

CPC classification: G10L19/012, G10L19/0017, G10L19/008, G10L19/032, G10L19/04, G10L19/06, H04W76/28

Abstract: A method, system, and computer program to encode and decode a channel coherence parameter applied on a frequency band basis, where the coherence parameters of each frequency band form a coherence vector. The coherence vector is encoded and decoded using a predictive scheme followed by a variable bit rate entropy coding.

6.

Granted patent
Audio encoder and decoder using a frequency domain processor with full-band gap filling and a time domain processor [In force]

Publication No.: US11929084B2

Publication date: 2024-03-12

Application No.: US18158035

Filing date: 2023-01-23

Applicant: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill

IPC classification: G10L19/18, G10L19/02, G10L19/028, G10L19/032, G10L19/04, G10L19/06, G10L19/24, G10L19/26, G10L21/038, G10L19/20

CPC classification: G10L19/18, G10L19/028, G10L19/032, G10L19/06, G10L19/265, G10L19/02, G10L19/04, G10L19/20, G10L19/24, G10L21/038

Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal having a first encoded signal portion for the first audio signal portion and a second encoded signal portion for the second portion.

7.

Published application
SUPPORT FOR GENERATION OF COMFORT NOISE [Pending, published]

Publication No.: US20240055008A1

Publication date: 2024-02-15

Application No.: US18383953

Filing date: 2023-10-26

Applicant: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Fredrik Jansson, Erik Norvell, Tomas Jansson Toftgård

IPC classification: G10L19/012, G10L19/008, G10L19/032, G10L19/04, G10L19/06, G10L19/00

CPC classification: G10L19/012, G10L19/008, G10L19/032, G10L19/04, G10L19/06, G10L19/0017, H04W76/28

Abstract: A method and a transmitting node for supporting generation of comfort noise for at least two audio channels at a receiving node. The method is performed by a transmitting node. The method comprises determining spectral characteristics of audio signals on at least two input audio channels and determining a spatial coherence between the audio signals. The spatial coherence is associated with perceptual importance measures. A compressed representation of the spatial coherence is determined per frequency band by weighting the spatial coherence within each frequency band according to the perceptual importance measures. Information about the spectral characteristics and the compressed representation of the spatial coherence per frequency band is signaled to the receiving node for enabling the generation of the comfort noise at the receiving node.
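The per-band compression step in this abstract (weighting the spatial coherence within each band by perceptual importance measures) could look roughly like the sketch below. It assumes the weighting is a normalized weighted mean over the frequency bins of each band, which the abstract does not state; the function name compress_coherence and the band_edges layout are hypothetical.

```python
def compress_coherence(coherence, weights, band_edges):
    """Reduce per-bin coherence to one value per frequency band.

    coherence  -- coherence value for each frequency bin
    weights    -- perceptual importance measure for each bin (assumed >= 0)
    band_edges -- bin indices delimiting the bands, e.g. [0, 2, 4]
    Assumption: the band value is the importance-weighted mean of its bins.
    """
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        c = coherence[lo:hi]
        w = weights[lo:hi]
        total = sum(w)
        if total == 0:
            # No bin is marked as important: fall back to a plain mean.
            bands.append(sum(c) / len(c))
        else:
            bands.append(sum(wi * ci for wi, ci in zip(w, c)) / total)
    return bands
```

With this choice, bins that carry more perceptual weight pull the band value toward their own coherence, so one number per band can be signaled instead of one per bin.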

8.

Published application
ADAPTIVE BLOCK SWITCHING WITH DEEP NEURAL NETWORKS [Pending, published]

Publication No.: US20230386486A1

Publication date: 2023-11-30

Application No.: US18248294

Filing date: 2021-10-15

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Cong ZHOU, Grant A. DAVIDSON, Mark S. VINTON

IPC classification: G10L19/022, G10L25/30, G10L19/032, G10L19/04

CPC classification: G10L19/022, G10L19/04, G10L19/032, G10L25/30

Abstract: The present invention relates to a method for predicting transform coefficients representing frequency content of an adaptive block length media signal, by receiving a frame and receiving block length information indicating a number of quantized transform coefficients for each block in the frame, the number of quantized transform coefficients being one of a first or second number, wherein the first number is greater than the second number, determining a first block has the second number of quantized transform coefficients, converting the first block into a converted block having the first number of quantized transform coefficients, conditioning a main neural network trained to predict at least one output variable given at least one conditioning variable, the at least one conditioning variable being based on information regarding the converted block and block length information for the first block, and providing at least one predicted transform coefficient from an output stage of the main neural network.

9.

Published application
METHOD FOR ENCODING AND DECODING AUDIO SIGNAL USING NORMALIZING FLOW, AND TRAINING METHOD THEREOF [Pending, published]

Publication No.: US20230298603A1

Publication date: 2023-09-21

Application No.: US18150126

Filing date: 2023-01-04

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: In Seon JANG, Seung Kwon BEACK, Jong Mo SUNG, Tae Jin LEE, Woo Taek LIM, Byeong Ho CHO

IPC classification: G10L19/032, G10L25/30, G10L19/04, G06N7/01

CPC classification: G10L19/032, G10L25/30, G10L19/04, G06N7/01

Abstract: A method for encoding an input signal using N flow blocks (N is a natural number greater than or equal to 2) and (N−1) split block(s), which is performed by a processor, may comprise: transmitting, by a k-th flow block (k is a natural number greater than or equal to 1 and less than or equal to N−1) among the N flow blocks, a k-th transformation signal obtained by transforming a received signal into a latent representation to a k-th split block among the (N−1) split block(s); splitting, by the k-th split block, the k-th transformation signal by a predetermined ratio, into a first split signal and a second split signal; transmitting, by the k-th split block, the first split signal to a (k+1)-th flow block; and quantizing a signal transformed by an N-th flow block and the second split signals using a quantization block.
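The data flow in this abstract (N flow blocks interleaved with N−1 split blocks, then quantization of the last flow output and of every second split signal) can be traced with a small sketch. The flow blocks are passed in as plain Python callables standing in for trained normalizing-flow transforms, and the uniform quantizer, the step parameter, and the name encode_with_flows are assumptions made for illustration only.

```python
def encode_with_flows(signal, flows, ratio=0.5, step=0.25):
    """Run a signal through N flow blocks with a split block after each
    of the first N-1 flows, then quantize what remains.

    flows -- list of N callables (stand-ins for trained flow blocks)
    ratio -- fraction of each transformed signal forwarded to the next flow
    step  -- step size of the assumed uniform quantizer
    Returns (quantized output of the N-th flow, quantized second split signals).
    """
    second_splits = []
    x = list(signal)
    for k, flow in enumerate(flows):
        x = flow(x)  # k-th transformation signal
        if k < len(flows) - 1:  # a split block follows every flow but the last
            cut = int(len(x) * ratio)
            first, second = x[:cut], x[cut:]
            second_splits.append(second)  # held back for quantization
            x = first  # forwarded to the (k+1)-th flow block

    def quantize(values):
        return [round(v / step) for v in values]

    return quantize(x), [quantize(s) for s in second_splits]
```

For example, with two toy flows (doubling, then adding one) and the input [0.5, 1.0, 1.5, 2.0], the first flow yields [1, 2, 3, 4], the split block holds back [3, 4], and the second flow turns [1, 2] into [2, 3] before quantization.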

10.

Granted patent
Apparatus for encoding and decoding of integrated speech and audio [In force]

Publication No.: US11705137B2

Publication date: 2023-07-18

Application No.: US16925946

Filing date: 2020-07-10

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Kwangwoon University Industry-Academic Collaboration Foundation

Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-Cheol Park

IPC classification: G10L19/02, G10L19/008, G10L19/04, G10L19/20, G10L19/12, G10L19/00

CPC classification: G10L19/008, G10L19/02, G10L19/04, G10L19/12, G10L19/20, G10L19/00

Abstract: Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal, which may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to down-mix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristic signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
