Method, apparatus, and computer program product for categorical spatial analysis-synthesis on spectrum of multichannel audio signals
    51.
    Granted patent (status: in force)

    Publication No.: US09420375B2

    Publication date: 2016-08-16

    Application No.: US14039357

    Application date: 2013-09-27

    Abstract: A method, apparatus, and computer program product are provided according to an example embodiment of the present invention to perform categorical analysis and synthesis of a multichannel signal, to synthesize binaural signals, and to extract, separate, and manipulate components within the audio scene of the multichannel signal that were captured through multichannel audio means. In the context of a method, a multichannel signal is received. The method may include computing the spectrum for the multichannel signal, determining tonality of bands within the spectrum, and generating a band structure for the spectrum. The method may also include performing spatial analysis of the bands, performing source filtering using the bands, performing synthesis on the filtered band components, and generating an output signal. A corresponding apparatus and a computer program product are also provided.


    CODING AND DECODING OF SPECTRAL PEAK POSITIONS
    52.
    Patent application (status: in force)

    Publication No.: US20160225378A1

    Publication date: 2016-08-04

    Application No.: US14402406

    Application date: 2014-10-10

    IPC classes: G10L19/02, G10L19/00, G10L19/22

    Abstract: A coder and decoder, and methods therein, are provided for coding and decoding of spectral peak positions in audio coding. According to a first aspect, an audio signal segment coding method is provided for coding of spectral peak positions. The method comprises determining which one of two lossless spectral peak position coding schemes requires the fewest bits to code the spectral peak positions of an audio signal segment, and selecting that scheme to code the spectral peak positions of the audio signal segment. A first one of the two lossless spectral peak position coding schemes is suitable for periodic or semi-periodic spectral peak position distributions; a second one of the two schemes is suitable for sparse spectral peak position distributions.
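
The selection step in this abstract can be illustrated with a minimal sketch. The two cost functions below are hypothetical stand-ins, not the schemes claimed in the application: `bits_gap_coding` assumes near-periodic peaks are coded as a first position, a base gap, and fixed-width deviations, while `bits_absolute_coding` assumes each peak is coded independently. The one-bit scheme flag and the `n_bins` parameter are likewise assumptions, and the number of peaks is assumed to be signaled identically in both cases, so it is left out of the count.

```python
def bits_gap_coding(positions, n_bins):
    """Hypothetical cost for near-periodic peaks: first position, base gap, deviations."""
    pos_bits = (n_bins - 1).bit_length()        # bits for one absolute bin index
    if len(positions) < 2:
        return pos_bits * len(positions)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    spread = max(gaps) - min(gaps)              # largest deviation from the smallest gap
    dev_bits = spread.bit_length()              # 0 bits when perfectly periodic
    header_bits = pos_bits.bit_length()         # tells the decoder the value of dev_bits
    return pos_bits + pos_bits + header_bits + dev_bits * len(gaps)


def bits_absolute_coding(positions, n_bins):
    """Hypothetical cost for sparse peaks: every position coded on its own."""
    return (n_bins - 1).bit_length() * len(positions)


def select_scheme(positions, n_bins):
    """Return the cheaper scheme and its total cost, including a 1-bit scheme flag."""
    costs = {
        "gap": bits_gap_coding(positions, n_bins),
        "absolute": bits_absolute_coding(positions, n_bins),
    }
    scheme = min(costs, key=costs.get)
    return scheme, costs[scheme] + 1


print(select_scheme([10, 30, 50, 70, 90, 110], 256))  # evenly spaced peaks
print(select_scheme([5, 200], 256))                   # two isolated peaks
```

Under these assumed cost models, evenly spaced peaks favor the gap-based scheme and a few isolated peaks favor absolute coding. Because both candidates are lossless, the choice affects only the bit count, never the decoded positions.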


    Pitch filter for audio signals
    55.
    Granted patent (status: in force)

    Publication No.: US09343077B2

    Publication date: 2016-05-17

    Application No.: US14936408

    Application date: 2015-11-09

    Abstract: In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode, based on control information, while operating in the coding mode.
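
The active/inactive switching described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented filter: the filter is assumed to be a simple feed-forward comb, y[n] = x[n] + gain * x[n - lag], and the names `active`, `lag`, and `gain` are hypothetical stand-ins for the control information and filtering information.

```python
def pitch_filter(signal, active, lag=1, gain=0.0):
    """Apply an assumed comb-style pitch filter, or pass the signal through unchanged."""
    if not active:
        # Inactive mode: the filter is disabled and the preliminary signal is returned as is.
        return list(signal)
    if lag < 1:
        raise ValueError("lag must be at least one sample")
    filtered = []
    for n, x in enumerate(signal):
        delayed = signal[n - lag] if n >= lag else 0.0  # samples before the start count as zero
        filtered.append(x + gain * delayed)
    return filtered


preliminary = [1.0, 0.0, 0.0, 1.0]
print(pitch_filter(preliminary, active=True, lag=2, gain=0.5))
print(pitch_filter(preliminary, active=False))
```

The point of the sketch is that the mode decision is independent of the coding mode that produced `preliminary`: the same decoded signal can be filtered or left untouched depending only on the control flag.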


    Adaptive gain reduction for encoding a speech signal
    57.
    Granted patent (status: in force)

    Publication No.: US09269365B2

    Publication date: 2016-02-23

    Application No.: US12218242

    Application date: 2008-07-11

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    Abstract: There is provided a method of encoding an input speech signal. The method comprises identifying a fixed codebook vector from a fixed codebook; identifying an adaptive codebook vector from an adaptive codebook; calculating an adaptive codebook gain; reducing the adaptive codebook gain by an amount; optimally selecting a fixed codebook gain based on the adaptive codebook gain while both the fixed codebook vector and the adaptive codebook vector remain fixed; and converting the input speech signal into encoded speech using the fixed codebook gain, the adaptive codebook gain, the fixed codebook vector, and the adaptive codebook vector. The amount by which the adaptive codebook gain is reduced may be varied.
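
The gain steps in this abstract can be sketched with a least-squares fit. This is one plausible reading, not the patent's stated procedure: "optimally selecting" is assumed to mean minimizing squared error against a target vector, the reduction is assumed to be a multiplicative factor, and the vectors are assumed to be already filtered into the target domain. The names `target` and `reduction` are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def select_gains(target, adaptive_vec, fixed_vec, reduction=0.1):
    """Return (reduced adaptive gain, fixed gain) for fixed codebook vectors."""
    # Least-squares adaptive codebook gain for the target.
    adaptive_gain = dot(target, adaptive_vec) / dot(adaptive_vec, adaptive_vec)
    # Reduce the adaptive gain by an adjustable amount (assumed multiplicative).
    reduced_gain = adaptive_gain * (1.0 - reduction)
    # With both vectors held fixed, fit the fixed gain to what the
    # reduced adaptive contribution leaves unexplained.
    residual = [t - reduced_gain * a for t, a in zip(target, adaptive_vec)]
    fixed_gain = dot(residual, fixed_vec) / dot(fixed_vec, fixed_vec)
    return reduced_gain, fixed_gain


target = [1.0, 2.0, 3.0, 4.0]
adaptive_vec = [1.0, 1.0, 1.0, 1.0]
fixed_vec = [0.0, 0.0, 0.0, 1.0]
print(select_gains(target, adaptive_vec, fixed_vec, reduction=0.5))
```

Computing the fixed gain after the reduction, rather than jointly with the unreduced adaptive gain, lets the fixed codebook absorb the energy that the reduced adaptive contribution no longer covers.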


    Method for Predicting High Frequency Band Signal, Encoding Device, and Decoding Device
    58.
    Patent application (status: in force)

    Publication No.: US20150332699A1

    Publication date: 2015-11-19

    Application No.: US14808145

    Application date: 2015-07-24

    IPC classes: G10L19/20, G10L21/038

    Abstract: A method includes obtaining a signal type of an audio signal and a low frequency band signal of the audio signal, where the audio signal includes the low frequency band signal and a high frequency band signal; obtaining a frequency envelope of the high frequency band signal according to the signal type; predicting an excitation signal of the high frequency band signal according to the low frequency band signal; and restoring the high frequency band signal according to the frequency envelope of the high frequency band signal and the excitation signal of the high frequency band signal. By using the technical solutions of the embodiments of the present invention, the error between a high frequency band signal obtained by prediction and the actual high frequency band signal can be effectively reduced, and the accuracy of the predicted high frequency band signal can be increased.
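
The restoration step can be illustrated with a minimal sketch. The assumptions here are not taken from the application: the excitation is predicted by copying low-band spectral coefficients, the envelope is given as one target RMS value per sub-band, and the signal-type-dependent choice of envelope is taken as already done. The function name and both parameters are hypothetical.

```python
def restore_high_band(low_band, envelope):
    """Shape a copied low-band excitation so each sub-band matches its target RMS."""
    n_sub = len(envelope)
    width = len(low_band) // n_sub              # coefficients per sub-band
    high_band = []
    for k, target_rms in enumerate(envelope):
        excitation = low_band[k * width:(k + 1) * width]
        rms = (sum(c * c for c in excitation) / width) ** 0.5
        scale = target_rms / rms if rms > 0 else 0.0  # a silent excitation stays silent
        high_band.extend(scale * c for c in excitation)
    return high_band


low_band = [1.0, -1.0, 2.0, 2.0]
envelope = [2.0, 1.0]                           # assumed per-sub-band RMS targets
print(restore_high_band(low_band, envelope))
```

In this reading the envelope carries the coarse spectral shape, while the low band supplies only the fine structure, so an error in the envelope directly becomes an error in the restored band.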


    CLOSED LOOP QUANTIZATION OF HIGHER ORDER AMBISONIC COEFFICIENTS
    59.
    Patent application (status: in force)

    Publication No.: US20150332681A1

    Publication date: 2015-11-19

    Application No.: US14712638

    Application date: 2015-05-14

    Abstract: In general, techniques are described for closed loop quantization of HOA coefficients that provide a three-dimensional representation of the sound field. An audio encoding device may perform closed loop quantization of an audio object based at least in part on a result of performing quantization of directional information associated with the audio object. An audio decoding device may obtain an audio object that has been closed loop quantized based at least in part on a result of performing quantization of directional information associated with the audio object, and may dequantize the audio object.


    Adaptive codebook gain control for speech coding

    Publication No.: US09190066B2

    Publication date: 2015-11-17

    Application No.: US12321934

    Application date: 2009-01-26

    Applicant: Yang Gao

    Inventor: Yang Gao

    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or a second encoding scheme based upon the detection or absence of a triggering characteristic in an interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to cases where the generally periodic component of the speech is not stationary or is less than completely periodic, and therefore requires more frequent updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.