-
Publication No.: US12112762B2
Publication date: 2024-10-08
Application No.: US18456670
Filing date: 2023-08-28
Inventors: Guillaume Fuchs, Jürgen Herre, Fabian Küch, Stefan Döhla, Markus Multrus, Oliver Thiergart, Oliver Wübbolt, Florin Ghido, Stefan Bayer, Wolfgang Jaegers
IPC classification: G10L19/008, G10L19/02, G10L19/032, G10L19/038, G10L19/16, G10L19/26, H03M7/30
CPC classification: G10L19/008, G10L19/0204, G10L19/032, G10L19/038, G10L19/167, G10L19/26, H03M7/3082, H03M7/6005, H03M7/6011
Abstract: An apparatus for encoding directional audio coding parameters including diffuseness parameters and direction parameters includes: a parameter calculator for calculating the diffuseness parameters with a first time or frequency resolution and for calculating the direction parameters with a second time or frequency resolution; and a quantizer and encoder processor for generating a quantized and encoded representation of the diffuseness parameters and the direction parameters.
-
Publication No.: US20240304198A1
Publication date: 2024-09-12
Application No.: US18570904
Filing date: 2022-07-05
Applicant: ORANGE
Inventors: Stéphane Ragot, Mohamed Yaoumi
IPC classification: G10L19/035, G10L19/008, G10L19/038
CPC classification: G10L19/035, G10L19/008, G10L19/038
Abstract: A method for encoding an input point on an n-dimensional sphere by encoding n-1 spherical coordinates of said input point. The method includes sequential scalar quantization of the n-1 spherical coordinates in order to obtain at most 2n-2 candidates at the end of the sequential scalar quantization of the n-1 coordinates, and subsequently selecting the best candidate which minimizes a distance between the input point and the at most 2n-2 candidates, and determining the separate quantization indices resulting from the sequential scalar quantization of the spherical coordinates of the best candidate and sequentially encoding the separate quantization indices of the best candidate. A corresponding decoding method, an encoding device and a decoding device are also provided.
-
Publication No.: US12051429B2
Publication date: 2024-07-30
Application No.: US18138684
Filing date: 2023-04-24
IPC classification: G10L19/008, G10L19/002, G10L19/038, H04R5/00
CPC classification: G10L19/038, G10L19/002, H04R5/00, G10L19/008, H04R2430/21, H04S2420/11
Abstract: A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are configured to apply one adaptive network, based on a constraint that includes preservation of a spatial direction of one or more audio sources in the soundfield at the different time segments, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a modified soundfield at the different time segments, that was modified based on the constraint. The one or more processors are also configured to apply an additional adaptive network.
-
Publication No.: US11978464B2
Publication date: 2024-05-07
Application No.: US17757122
Filing date: 2021-01-22
Applicant: GOOGLE LLC
IPC classification: G10L19/00, G10L19/038, G10L19/04, G10L21/02, G06N3/02
CPC classification: G10L19/038, G10L19/04, G10L21/02, G06N3/02, G10L19/00
Abstract: A method includes receiving sampled audio data corresponding to utterances and training a machine learning (ML) model, using the sampled audio data, to generate a high-fidelity audio stream from a low bitrate input bitstream. The training of the ML model includes de-emphasizing the influence of low-probability distortion events in the sampled audio data on the trained ML model, where the de-emphasizing of the distortion events is achieved by the inclusion of a term in an objective function of the ML model, which term encourages low-variance predictive distributions of a next sample in the sampled audio data, based on previous samples of the audio data.
-
Publication No.: US20240144943A1
Publication date: 2024-05-02
Application No.: US18473791
Filing date: 2023-09-25
Inventors: Woo-taek LIM, Seung Kwon BEACK, Inseon JANG, Jongmo SUNG, Tae Jin LEE, Byeongho CHO, Minje KIM, Darius Petermann
IPC classification: G10L19/038, G10L25/18
CPC classification: G10L19/038, G10L25/18
Abstract: An audio signal encoding/decoding method and an apparatus for performing the same are disclosed. The audio signal encoding method includes obtaining a full-band input signal, extracting a first feature vector corresponding to a first sub-band signal and a second feature vector corresponding to a second sub-band signal using an encoder neural network including a plurality of encoding layers, generating a first code vector corresponding to the first feature vector and a second code vector corresponding to the second feature vector by compressing the first feature vector and the second feature vector, and generating a bitstream by quantizing the first code vector and the second code vector.
-
Publication No.: US20240079020A1
Publication date: 2024-03-07
Application No.: US18464986
Filing date: 2023-09-11
Inventors: Florin GHIDO, Andreas NIEDERMEIER
IPC classification: G10L19/06, G10L19/00, G10L19/02, G10L19/028, G10L19/032, G10L19/038, G10L21/038
CPC classification: G10L19/06, G10L19/00, G10L19/02, G10L19/0204, G10L19/028, G10L19/032, G10L19/038, G10L21/038
Abstract: An improved concept for coding sample values of a spectral envelope is obtained by combining spectrotemporal prediction on the one hand and context-based entropy coding the residuals, on the other hand, while particularly determining the context for a current sample value dependent on a measure of a deviation between a pair of already coded/decoded sample values of the spectral envelope in a spectrotemporal neighborhood of the current sample value. The combination of the spectrotemporal prediction on the one hand and the context-based entropy coding of the prediction residuals with selecting the context depending on the deviation measure on the other hand harmonizes with the nature of spectral envelopes.
-
Publication No.: US11922966B2
Publication date: 2024-03-05
Application No.: US17276256
Filing date: 2019-10-01
Inventors: Hiroshi Sawada
IPC classification: G10L21/028, G06F17/16, G10L21/0308, G10L19/038
CPC classification: G10L21/0308, G06F17/16, G10L19/038, G10L21/028
Abstract: A signal separation device for acquiring a source signal from a mixed signal observed by a plurality of sensors includes: a database that stores feature information of a clean signal; separation matrix calculation means for repeatedly performing processes of, based on a separated signal obtained by multiplication of a mixed signal converted into a time-frequency representation by a separation matrix and on the feature information stored in the database, calculating a parameter to be used for an objective function for optimizing the separation matrix, and calculating a separation matrix for minimizing the objective function using the parameter; and output means for outputting a separated signal calculated using the optimized separation matrix obtained by the separation matrix calculation means.
-
Publication No.: US20240013796A1
Publication date: 2024-01-11
Application No.: US18474997
Filing date: 2023-09-26
Inventors: Woo-taek LIM, Seung Kwon BEACK, Inseon JANG, Jongmo SUNG, Tae Jin LEE, Byeongho CHO, Minje KIM, Haici YANG
IPC classification: G10L19/038
CPC classification: G10L19/038
Abstract: A method of encoding a speech signal includes predicting a feature vector of each of a plurality of frames included in the speech signal based on a ground-truth feature vector of a previous frame of each of the plurality of frames, calculating a residual signal corresponding to each of the plurality of frames based on a ground-truth feature vector of each of the plurality of frames and a predicted feature vector of each of the plurality of frames, and generating a bitstring corresponding to each of the plurality of frames by quantizing the residual signal.
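Note: the predict, residual, quantize sequence in the abstract above can be illustrated with a minimal sketch. Everything below is an assumption for illustration and is not taken from the patent: the function name `encode_frames`, the trivial predictor that simply repeats the previous ground-truth frame, and the uniform scalar quantizer with step size `step`. Packing the integer indices into a bitstring is left out.

```python
import numpy as np

def encode_frames(features, step=0.5):
    """Return one row of integer quantization indices per frame.

    Hypothetical sketch: each frame is predicted from the previous
    ground-truth frame, and the prediction residual is quantized.
    """
    features = np.asarray(features, dtype=float)
    prev = np.zeros_like(features[0])  # assumed all-zero history before frame 0
    indices = []
    for frame in features:
        predicted = prev               # assumed predictor: repeat previous ground-truth frame
        residual = frame - predicted   # residual between ground truth and prediction
        indices.append(np.round(residual / step).astype(int))  # uniform scalar quantizer
        prev = frame                   # the next prediction uses the ground-truth frame
    return np.stack(indices)

frames = [[1.0, 2.0], [1.6, 2.0], [1.5, 1.0]]
print(encode_frames(frames, step=0.5))
```

Because this sketch predicts from ground-truth frames rather than reconstructed ones, a decoder that only has reconstructed frames would see quantization error accumulate over time; how the patented method handles this is not stated in the abstract.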
-
Publication No.: US20230410819A1
Publication date: 2023-12-21
Application No.: US18456670
Filing date: 2023-08-28
Inventors: Guillaume FUCHS, Jürgen HERRE, Fabian KÜCH, Stefan DÖHLA, Markus MULTRUS, Oliver THIERGART, Oliver WÜBBOLT, Florin GHIDO, Stefan BAYER, Wolfgang JAEGERS
IPC classification: G10L19/008, G10L19/038, G10L19/16, G10L19/26, G10L19/02, G10L19/032, H03M7/30
CPC classification: G10L19/008, G10L19/038, G10L19/167, G10L19/26, H03M7/6011, G10L19/032, H03M7/3082, H03M7/6005, G10L19/0204
Abstract: An apparatus for encoding directional audio coding parameters including diffuseness parameters and direction parameters includes: a parameter calculator for calculating the diffuseness parameters with a first time or frequency resolution and for calculating the direction parameters with a second time or frequency resolution; and a quantizer and encoder processor for generating a quantized and encoded representation of the diffuseness parameters and the direction parameters.
-
Publication No.: US20230395084A1
Publication date: 2023-12-07
Application No.: US18451975
Filing date: 2023-08-18
IPC classification: G10L19/008, G10L19/038, G10L19/06
CPC classification: G10L19/008, G10L19/038, G10L19/06, G10L19/07
Abstract: An encoding method includes determining an adaptive broadening factor based on a quantized line spectral frequency (LSF) vector of a first channel of a current frame of an audio signal and an LSF vector of a second channel of the current frame, and writing the quantized LSF vector and the adaptive broadening factor into a bitstream.