Encoding of periodic speech using prototype waveforms
    1.
    Granted invention patent
    Encoding of periodic speech using prototype waveforms (in force)

    Publication No.: US06456964B2

    Publication date: 2002-09-24

    Application No.: US09217494

    Filing date: 1998-12-21

    IPC classification: G10L1904

    Abstract: A method and apparatus for coding a quasi-periodic speech signal. The speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter. The residual signal is encoded by extracting a prototype period from a current frame of the residual signal. A first set of parameters is calculated which describes how to modify a previous prototype period to approximate the current prototype period. One or more codevectors are selected which, when summed, approximate the error between the current prototype period and the modified previous prototype. A multi-stage codebook is used to encode this error signal. A second set of parameters describes these selected codevectors. The decoder synthesizes an output speech signal by reconstructing a current prototype period based on the first and second set of parameters, and the previous reconstructed prototype period. The residual signal is then interpolated over the region between the current and previous reconstructed prototype periods. The decoder synthesizes output speech based on the interpolated residual signal.
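The encode/decode loop in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes both prototypes have the same length, models the "first set of parameters" as a circular rotation plus a gain, and uses a greedy gain-shape search for the multi-stage codebook. All function names, the codebook contents, and these modelling choices are hypothetical.

```python
import math
import random


def rotate(x, shift):
    """Circularly rotate a list to the right by `shift` samples."""
    shift %= len(x)
    return x[-shift:] + x[:-shift] if shift else list(x)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def best_gain(target, basis):
    """Least-squares gain g minimising |target - g * basis|^2."""
    energy = dot(basis, basis)
    return dot(target, basis) / energy if energy > 0 else 0.0


def encode(prev_proto, cur_proto, codebooks):
    # First parameter set (assumed form): rotation and gain applied to the
    # previous prototype so that it approximates the current one.
    best = None
    for shift in range(len(prev_proto)):
        cand = rotate(prev_proto, shift)
        g = best_gain(cur_proto, cand)
        err = sum((c - g * p) ** 2 for c, p in zip(cur_proto, cand))
        if best is None or err < best[0]:
            best = (err, shift, g)
    _, shift, gain = best
    base = [gain * p for p in rotate(prev_proto, shift)]
    residual = [c - b for c, b in zip(cur_proto, base)]

    # Second parameter set: one (index, gain) pair per codebook stage,
    # each stage coding what the earlier stages left over.
    stages = []
    for book in codebooks:
        scored = []
        for idx, vec in enumerate(book):
            g = best_gain(residual, vec)
            err = sum((r - g * v) ** 2 for r, v in zip(residual, vec))
            scored.append((err, idx, g))
        _, idx, g = min(scored)
        stages.append((idx, g))
        residual = [r - g * v for r, v in zip(residual, book[idx])]
    return (shift, gain), stages


def decode(prev_proto, first, stages, codebooks):
    """Rebuild the current prototype from the previous one and both parameter sets."""
    shift, gain = first
    proto = [gain * p for p in rotate(prev_proto, shift)]
    for (idx, g), book in zip(stages, codebooks):
        proto = [x + g * v for x, v in zip(proto, book[idx])]
    return proto


if __name__ == "__main__":
    random.seed(0)
    n = 16
    prev = [math.sin(2 * math.pi * k / n) for k in range(n)]
    cur = [0.8 * math.sin(2 * math.pi * (k - 3) / n) + 0.05 * random.uniform(-1, 1)
           for k in range(n)]
    books = [[[random.uniform(-1, 1) for _ in range(n)] for _ in range(8)]
             for _ in range(2)]
    first, stages = encode(prev, cur, books)
    rebuilt = decode(prev, first, stages, books)
    print("first set:", first, "second set:", stages)
    print("squared error:", sum((c - r) ** 2 for c, r in zip(cur, rebuilt)))
```

A real coder would also handle a changing pitch period and would interpolate the residual between successive reconstructed prototypes, which this sketch leaves out.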


    Variable rate speech coding
    2.
    Granted invention patent
    Variable rate speech coding (in force)

    Publication No.: US07496505B2

    Publication date: 2009-02-24

    Application No.: US11559274

    Filing date: 2006-11-13

    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. The input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode. The apparatus dynamically switches between these modes as the properties of the speech signal vary with time. And where appropriate, regions of speech are modeled as pseudo-random noise, resulting in a significantly lower bit rate. This coding is used in a dynamic fashion whenever unvoiced speech or background noise is detected.
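The classify-then-select flow described in this abstract might look like the sketch below. The features (frame energy, energy jump, peak normalised autocorrelation), every threshold, and the mode names are assumptions made for illustration; the abstract only fixes the class hierarchy (inactive / voiced / unvoiced / transient) and the use of a noise model for unvoiced speech and background noise.

```python
import math
from enum import Enum


class Mode(Enum):
    HIGH_RATE = "high-fidelity, broadly applicable mode"
    PROTOTYPE = "lower-rate mode for periodic speech"
    NOISE = "pseudo-random noise model"


def frame_energy(frame):
    return sum(s * s for s in frame) / len(frame)


def periodicity(frame, min_lag=20, max_lag=80):
    """Peak normalised autocorrelation over candidate pitch lags (hypothetical feature)."""
    best = 0.0
    for lag in range(min_lag, min(max_lag, len(frame) // 2) + 1):
        a, b = frame[lag:], frame[:-lag]
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:
            best = max(best, sum(x * y for x, y in zip(a, b)) / den)
    return best


def classify(frame, prev_energy, energy_floor=1e-4, jump=4.0, voiced_thresh=0.7):
    """Return 'inactive', 'transient', 'voiced' or 'unvoiced' (thresholds are assumed)."""
    energy = frame_energy(frame)
    if energy < energy_floor:
        return "inactive"
    if energy > jump * max(prev_energy, energy_floor):
        return "transient"  # sudden energy rise, e.g. a speech onset
    if periodicity(frame) >= voiced_thresh:
        return "voiced"
    return "unvoiced"


def select_mode(label):
    """Pick the cheapest mode assumed acceptable for each class."""
    return {
        "inactive": Mode.NOISE,
        "unvoiced": Mode.NOISE,
        "voiced": Mode.PROTOTYPE,
        "transient": Mode.HIGH_RATE,
    }[label]


if __name__ == "__main__":
    sine = [math.sin(2 * math.pi * n / 40) for n in range(160)]
    for name, frame, prev in [("silence", [0.0] * 160, 0.0),
                              ("steady tone", sine, 0.5),
                              ("onset", sine, 0.0)]:
        label = classify(frame, prev)
        print(name, "->", label, "->", select_mode(label).name)
```

Because the mode is chosen per frame, the average bit rate falls whenever the signal spends time in classes that the cheaper modes can represent acceptably.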


    VARIABLE RATE SPEECH CODING
    3.
    Invention patent application
    VARIABLE RATE SPEECH CODING (in force)

    Publication No.: US20070179783A1

    Publication date: 2007-08-02

    Application No.: US11559274

    Filing date: 2006-11-13

    IPC classification: G10L19/00

    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. The input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode. The apparatus dynamically switches between these modes as the properties of the speech signal vary with time. And where appropriate, regions of speech are modeled as pseudo-random noise, resulting in a significantly lower bit rate. This coding is used in a dynamic fashion whenever unvoiced speech or background noise is detected.


    Variable rate speech coding
    4.
    Granted invention patent

    Publication No.: US07136812B2

    Publication date: 2006-11-14

    Application No.: US10713758

    Filing date: 2003-11-14

    IPC classification: G10L19/12 G10L21/00 G10L19/14

    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. The input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode. The apparatus dynamically switches between these modes as the properties of the speech signal vary with time. And where appropriate, regions of speech are modeled as pseudo-random noise, resulting in a significantly lower bit rate. This coding is used in a dynamic fashion whenever unvoiced speech or background noise is detected.

    Multiple mode variable rate speech coding
    5.
    Granted invention patent
    Multiple mode variable rate speech coding (in force)

    Publication No.: US06691084B2

    Publication date: 2004-02-10

    Application No.: US09217341

    Filing date: 1998-12-21

    IPC classification: G10L1912

    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by only employing high fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. The input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode. The apparatus dynamically switches between these modes as the properties of the speech signal vary with time. And where appropriate, regions of speech are modeled as pseudo-random noise, resulting in a significantly lower bit rate. This coding is used in a dynamic fashion whenever unvoiced speech or background noise is detected.


    Adaptive intra-refresh for digital video encoding
    6.
    Granted invention patent
    Adaptive intra-refresh for digital video encoding (in force)

    Publication No.: US08948266B2

    Publication date: 2015-02-03

    Application No.: US11025297

    Filing date: 2004-12-28

    Abstract: An adaptive Intra-refresh (IR) technique for digital video encoding adjusts IR rate based on video content, or a combination of video content and channel condition. The IR rate may be applied at the frame level or macroblock (MB) level. At the frame level, the IR rate specifies the percentage of MBs to be Intra-coded within the frame. At the MB level, the IR rate defines a statistical probability that a particular MB is to be Intra-coded. The IR rate is adjusted in proportion to a combined metric that weighs estimated channel loss probability, frame-to-frame variation, and texture information. The IR rate can be determined using a closed-form solution that requires relatively low implementation complexity. For example, such a closed-form solution does not require iteration or an exhaustive search. In addition, the IR rate can be determined from parameters that are available before motion estimation and compensation are performed.
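The two ways of applying an IR rate can be illustrated with a short sketch. The abstract does not give the combined metric, so the weighted sum below, its weights, the assumption that all three inputs are normalised to [0, 1], and the direction in which each input moves the rate are all hypothetical; only the frame-level (percentage of MBs) and MB-level (per-MB probability) interpretations come from the text.

```python
import random


def ir_rate(loss_prob, variation, texture, weights=(0.5, 0.3, 0.2), lo=0.0, hi=1.0):
    """Closed-form IR rate: a clamped weighted sum of three normalised inputs (assumed form)."""
    w_loss, w_var, w_tex = weights
    metric = w_loss * loss_prob + w_var * variation + w_tex * texture
    return max(lo, min(hi, metric))


def frame_level_intra_count(rate, total_mbs):
    """Frame level: the rate is the fraction of MBs in the frame to Intra-code."""
    return round(rate * total_mbs)


def mb_level_decisions(rate, total_mbs, rng):
    """MB level: each MB is Intra-coded independently with probability `rate`."""
    return [rng.random() < rate for _ in range(total_mbs)]


if __name__ == "__main__":
    rate = ir_rate(loss_prob=0.1, variation=0.4, texture=0.2)
    total = 396  # number of 16x16 MBs in a CIF frame (352x288)
    print("IR rate:", rate)
    print("frame-level intra MBs:", frame_level_intra_count(rate, total))
    flags = mb_level_decisions(rate, total, random.Random(1))
    print("MB-level intra MBs this frame:", sum(flags))
```

Nothing here needs iteration or a search, and none of the inputs depends on motion estimation, which is the low-complexity property the abstract emphasises.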


    3D video encoding
    7.
    Granted invention patent
    3D video encoding (in force)

    Publication No.: US08594180B2

    Publication date: 2013-11-26

    Application No.: US11677335

    Filing date: 2007-02-21

    IPC classification: G06F21/00

    Abstract: A stereo 3D video frame includes left and right components that are combined to produce a stereo image. For a given amount of distortion, the left and right components may have different impacts on perceptual visual quality of the stereo image due to asymmetry in the distortion response of the human eye. A 3D video encoder adjusts an allocation of coding bits between left and right components of the 3D video based on a frame-level bit budget and a weighting between the left and right components. The video encoder may generate the bit allocation in the rho (ρ) domain. The weighted bit allocation may be derived based on a quality metric that indicates overall quality produced by the left and right components. The weighted bit allocation compensates for the asymmetric distortion response to reduce overall perceptual distortion in the stereo image and thereby enhance or maintain visual quality.
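One way to picture a weighted left/right split of a frame-level bit budget is the sketch below. It does not reproduce the patent's rho-domain derivation. Instead it assumes a generic exponential distortion model D_i(R_i) = s_i * exp(-a_i * R_i) for each view and minimises the weighted sum w * D_left + (1 - w) * D_right subject to R_left + R_right = budget, which has a closed-form solution. The model, its parameters (s, a), and the function names are all hypothetical.

```python
import math


def weighted_split(budget, w_left, s_left, s_right, a_left, a_right):
    """Split `budget` bits between views; requires 0 < w_left < 1 and positive s, a."""
    w_right = 1.0 - w_left
    c_left = w_left * s_left * a_left
    c_right = w_right * s_right * a_right
    # Setting the two marginal weighted-distortion reductions equal gives:
    r_left = (a_right * budget + math.log(c_left / c_right)) / (a_left + a_right)
    r_left = max(0.0, min(budget, r_left))  # keep the allocation feasible
    return r_left, budget - r_left


def weighted_distortion(r_left, r_right, w_left, s_left, s_right, a_left, a_right):
    """Quality metric used above: weighted sum of the two modelled distortions."""
    return (w_left * s_left * math.exp(-a_left * r_left)
            + (1.0 - w_left) * s_right * math.exp(-a_right * r_right))


if __name__ == "__main__":
    params = dict(s_left=1.0, s_right=1.0, a_left=0.005, a_right=0.005)
    for w in (0.5, 0.7):
        left, right = weighted_split(1000.0, w, **params)
        print(f"w_left={w}: left={left:.1f} bits, right={right:.1f} bits, "
              f"metric={weighted_distortion(left, right, w, **params):.5f}")
```

With equal weights and identical views the budget splits evenly; raising the weight of one view shifts bits toward it, which is the compensation for asymmetric perceptual response that the abstract describes.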


    Methods of performing error concealment for digital video
    8.
    Granted invention patent
    Methods of performing error concealment for digital video (in force)

    Publication No.: US08379734B2

    Publication date: 2013-02-19

    Application No.: US11690132

    Filing date: 2007-03-23

    IPC classification: H04N7/68

    Abstract: Error concealment is used to hide the effects of errors detected within digital video information. A complex error concealment mode decision is disclosed to determine whether spatial error concealment (SEC) or temporal error concealment (TEC) should be used. The error concealment mode decision system uses different methods depending on whether the damaged frame is an intra-frame or an inter-frame. If the video frame is an intra-frame then a similarity metric is used to determine if the intra-frame represents a scene-change or not. If the video frame is an inter-frame, a complex multi-termed equation is used to determine whether SEC or TEC should be used. A novel spatial error concealment technique is disclosed for use when the error concealment mode decision determines that spatial error concealment should be used for reconstruction. The novel spatial error concealment technique divides a corrupt macroblock into four different regions, a corner region, a row adjacent to the corner region, a column adjacent to the corner region, and a remainder main region. Those regions are then reconstructed in that order and information from earlier reconstructed regions may be used in later reconstructed regions. Finally, a macroblock refreshment technique is disclosed for preventing error propagation from harming non-corrupt inter-blocks. Specifically, an inter-macroblock may be ‘refreshed’ using spatial error concealment if there has been significant error-caused damage that may cause the inter-block to propagate the errors.


    Bandwidth-adaptive quantization
    10.
    Granted invention patent
    Bandwidth-adaptive quantization (in force)

    Publication No.: US08090577B2

    Publication date: 2012-01-03

    Application No.: US10215533

    Filing date: 2002-08-08

    IPC classification: G10L19/00

    Abstract: Methods and apparatus are presented for determining the type of acoustic signal and the type of frequency spectrum exhibited by the acoustic signal in order to selectively delete parameter information before vector quantization. The bits that would otherwise be allocated to the deleted parameters can then be re-allocated to the quantization of the remaining parameters, which results in an improvement of the perceptual quality of the synthesized acoustic signal. Alternatively, the bits that would have been allocated to the deleted parameters are dropped, resulting in an overall bit-rate reduction.
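The two options in this abstract (re-allocate the freed bits, or drop them) amount to simple bookkeeping, sketched below. How the `keep` flags are derived from the signal type and spectrum is not shown and would be the substantive part of the method; the even redistribution rule and the function name are assumptions for illustration.

```python
def reallocate_bits(bit_alloc, keep, reuse=True):
    """Return a new per-parameter bit allocation after deleting some parameters.

    bit_alloc: bits originally assigned to each parameter.
    keep:      True for parameters that survive, False for deleted ones.
    reuse:     True  -> spread the freed bits evenly over the kept parameters;
               False -> drop the freed bits, lowering the total bit rate.
    """
    kept = [i for i, k in enumerate(keep) if k]
    freed = sum(b for b, k in zip(bit_alloc, keep) if not k)
    new_alloc = [b if k else 0 for b, k in zip(bit_alloc, keep)]
    if reuse and kept:
        share, extra = divmod(freed, len(kept))
        for n, i in enumerate(kept):
            # The first `extra` kept parameters absorb the remainder bits.
            new_alloc[i] += share + (1 if n < extra else 0)
    return new_alloc


if __name__ == "__main__":
    original = [6, 6, 5, 5, 4, 4]
    keep = [True, True, True, True, False, False]  # e.g. upper-band parameters deleted
    print("re-allocated:", reallocate_bits(original, keep, reuse=True))
    print("dropped:     ", reallocate_bits(original, keep, reuse=False))
```

With `reuse=True` the total stays constant and the surviving parameters are quantized more finely; with `reuse=False` the frame simply costs fewer bits.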
