-
Publication No.: US07286982B2
Publication Date: 2007-10-23
Application No.: US10894854
Filing Date: 2004-07-20
Applicant: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
Inventor: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
IPC Classification: G10L19/12
CPC Classification: G10L19/173, G10L19/087
Abstract: An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.
-
Publication No.: US06647366B2
Publication Date: 2003-11-11
Application No.: US10032642
Filing Date: 2001-12-28
Applicant: Tian Wang, Kazuhito Koishida, Vladimir Cuperman
Inventor: Tian Wang, Kazuhito Koishida, Vladimir Cuperman
IPC Classification: G10L19/00
CPC Classification: G10L19/22
Abstract: A method and a system are provided for controlling the coding rates of a multimode coding system with respect to a sequence of input audio signal frames. The method eliminates or minimizes the overflow and underflow of a bit-stream buffer maintained by the coding system for temporarily recording bit-stream data prior to transmission or storage.
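The buffer constraint described in this abstract can be illustrated with a minimal sketch. It assumes a fixed number of bits drained from the buffer per frame and a small set of coding modes with fixed frame sizes. The function names, the mode sizes, and the "closest to the preferred rate" selection rule are all illustrative assumptions, not the method claimed in the patent.

```python
def choose_mode_bits(level, mode_bits, drain, capacity, preferred):
    """Pick a frame size (in bits) that keeps the buffer inside [0, capacity].

    level     -- bits currently held in the bit-stream buffer
    mode_bits -- frame sizes offered by the coding modes (hypothetical values)
    drain     -- bits removed from the buffer per frame by the channel
    preferred -- frame size the encoder would pick if the buffer were ignored
    """
    def violation(bits):
        # How far the buffer would end up outside its bounds (0 if inside).
        after = level + bits - drain
        return max(0, -after, after - capacity)

    # Smallest bound violation first, then closest to the preferred size.
    return min(mode_bits, key=lambda b: (violation(b), abs(b - preferred)))


def simulate(preferred_sizes, mode_bits, drain, capacity, start=0):
    """Run the selection rule over a frame sequence and track the buffer level."""
    level, chosen, levels = start, [], []
    for preferred in preferred_sizes:
        bits = choose_mode_bits(level, mode_bits, drain, capacity, preferred)
        # Clamp only as a safety net; a feasible choice never needs it.
        level = min(max(level + bits - drain, 0), capacity)
        chosen.append(bits)
        levels.append(level)
    return chosen, levels


chosen, levels = simulate([320, 320, 320, 80, 80, 80],
                          mode_bits=[80, 160, 320], drain=160, capacity=320)
print(chosen)
print(levels)
```

In this run the third frame is forced down from the preferred 320 bits because the buffer is already full, which is the kind of overflow the abstract says the rate control is meant to prevent.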
-
Publication No.: US07315815B1
Publication Date: 2008-01-01
Application No.: US09401068
Filing Date: 1999-09-22
Applicant: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
Inventor: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
IPC Classification: G10L19/12
CPC Classification: G10L19/173, G10L19/087
Abstract: An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.
-
Publication No.: US20050075869A1
Publication Date: 2005-04-07
Application No.: US10894854
Filing Date: 2004-07-20
Applicant: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
Inventor: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida
CPC Classification: G10L19/173, G10L19/087
Abstract: An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.
-
Publication No.: US06658383B2
Publication Date: 2003-12-02
Application No.: US09892105
Filing Date: 2001-06-26
IPC Classification: G10L19/02
CPC Classification: G10L19/18, G10L19/0212, G10L19/04
Abstract: The present invention provides a transform coding method efficient for music signals that is suitable for use in a hybrid codec, whereby a common Linear Predictive (LP) synthesis filter is employed for both speech and music signals. The LP synthesis filter switches between a speech excitation generator and a transform excitation generator, in accordance with the coding of a speech or music signal, respectively. For coding speech signals, the conventional CELP technique may be used, while a novel asymmetrical overlap-add transform technique is applied for coding music signals. In performing the common LP synthesis filtering, interpolation of the LP coefficients is conducted for signals in overlap-add operation regions. The invention enables smooth transitions when the decoder switches between speech and music decoding modes.
-
Publication No.: US07904293B2
Publication Date: 2011-03-08
Application No.: US11973689
Filing Date: 2007-10-09
Applicant: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
Inventor: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
IPC Classification: G10L15/00
CPC Classification: G10L19/005, G10L19/12, G10L2019/0005
Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.
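The last example in this abstract, a per-unit field that signals whether a coded unit carries main information, redundant information, or both, can be sketched as a small flag byte. The layout below (one flag byte, a two-byte length for the main payload, then the payloads) and the names `UnitContents`, `pack_unit`, and `unpack_unit` are hypothetical and chosen only for illustration; the abstract does not specify a bit-stream format.

```python
from enum import IntFlag


class UnitContents(IntFlag):
    """Hypothetical flag field describing what a coded unit carries."""
    MAIN = 0b01        # main encoded information for a segment of the signal
    REDUNDANT = 0b10   # redundant information for decoding other main information


def pack_unit(main: bytes = b"", redundant: bytes = b"") -> bytes:
    """Build a coded unit: flag byte, 2-byte main length, then the payloads."""
    flags = UnitContents(0)
    if main:
        flags |= UnitContents.MAIN
    if redundant:
        flags |= UnitContents.REDUNDANT
    return bytes([int(flags)]) + len(main).to_bytes(2, "big") + main + redundant


def unpack_unit(unit: bytes):
    """Split a coded unit back into its flag field and two payloads."""
    flags = UnitContents(unit[0])
    main_len = int.from_bytes(unit[1:3], "big")
    main = unit[3:3 + main_len]
    redundant = unit[3 + main_len:]
    return flags, main, redundant


flags, main, redundant = unpack_unit(pack_unit(b"abc", b"xy"))
print(flags, main, redundant)
```

A decoder following this sketch would check the flag field first and consult the redundant payload only when the unit it protects is unavailable.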
-
Publication No.: US07454332B2
Publication Date: 2008-11-18
Application No.: US10869467
Filing Date: 2004-06-15
Applicant: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen
Inventor: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen
CPC Classification: G10L21/0208, G10L21/0232
Abstract: A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a “noise bin.” An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k). First, a noisy factor is computed based on a ratio of the number of noise bins to the total number of bins for the current frame, where a zero-valued noisy factor means only using constant gain for all the spectrum values and noisy factor of one means no smoothing at all. Then, this noisy factor is used to alter the gain factors, such as by cutting off the high frequency components of the gain factors in the frequency domain.
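The gain-smoothing step in this abstract can be sketched as follows. The noisy factor is the ratio of noise bins to total bins, as the abstract states. Everything else is an assumption made for illustration: the sketch transforms the per-frame gain vector G(m, k) over the index k with a plain DFT, keeps a number of low-order coefficients proportional to the noisy factor, and transforms back. The function names and the linear mapping from noisy factor to cutoff are hypothetical.

```python
import cmath


def noisy_factor(is_noise_bin):
    """Ratio of bins classified as noise to the total number of bins."""
    return sum(is_noise_bin) / len(is_noise_bin)


def smooth_gains(gains, factor):
    """Low-pass the gain vector of one frame; the cutoff scales with `factor`.

    factor = 0 keeps only the DC term, giving one constant gain for all bins.
    factor = 1 keeps every coefficient, so the gains are returned unsmoothed.
    """
    n = len(gains)
    # Forward DFT of the gain vector over the spectrum index k.
    coeffs = [sum(g * cmath.exp(-2j * cmath.pi * k * q / n)
                  for k, g in enumerate(gains)) for q in range(n)]
    # Assumed mapping: number of coefficients kept on each side of DC.
    keep = round(factor * (n // 2))
    for q in range(n):
        if min(q, n - q) > keep:   # symmetric cut keeps the result real
            coeffs[q] = 0
    # Inverse DFT back to per-index gains.
    return [(sum(c * cmath.exp(2j * cmath.pi * k * q / n)
                 for q, c in enumerate(coeffs)) / n).real for k in range(n)]


gains = [1.0, 0.2, 0.8, 0.4]
print(smooth_gains(gains, 0.0))   # every entry close to the mean gain
print(smooth_gains(gains, 1.0))   # close to the original gains
```

The two limiting cases match the behaviour the abstract describes for noisy factors of zero and one; intermediate values give partial smoothing under the assumed cutoff rule.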
-
Publication No.: US07734465B2
Publication Date: 2010-06-08
Application No.: US11973690
Filing Date: 2007-10-09
Applicant: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
Inventor: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
IPC Classification: G10L21/00
CPC Classification: G10L19/005, G10L19/12, G10L2019/0005
Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.
-
Publication No.: US07590531B2
Publication Date: 2009-09-15
Application No.: US11197792
Filing Date: 2005-08-04
Applicant: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
Inventor: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen
IPC Classification: G10L21/02
CPC Classification: G10L21/045, G10L19/005
Abstract: Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
-
Publication No.: US20080040105A1
Publication Date: 2008-02-14
Application No.: US11973689
Filing Date: 2007-10-09
Applicant: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen
Inventor: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen
IPC Classification: G10L19/12
CPC Classification: G10L19/005, G10L19/12, G10L2019/0005
Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.