ARBITRARY AVERAGE DATA RATES FOR VARIABLE RATE CODERS
    52.
    Type: Patent application
    Legal status: In force

    Publication No.: US20070171931A1

    Publication date: 2007-07-26

    Application No.: US11625788

    Filing date: 2007-01-22

    IPC classification: H04J3/17

    CPC classification: G10L19/22 G10L19/24

    Abstract: Methods and apparatus are provided for achieving an arbitrary average data rate for a variable rate coder. One method includes selecting a set (e.g., a pair) of initial composite rates surrounding the arbitrary average data rate. A reallocation fraction is then calculated based on the initial composite rates. The reallocation fraction is used to reassign a number of frames from one component rate of an initial composite rate to another in order to achieve the arbitrary average data rate. Such a method may be configured such that selecting an initial composite rate on one side of (e.g., less than) the arbitrary average data rate implicitly selects the initial composite rate on the other side of the arbitrary average data rate.
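
    The reallocation step described in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the abstract, that the target rate is reached by linear interpolation between two composite rates that bracket it. The function names, the kbps units, and the numeric values are all hypothetical.

```python
def reallocation_fraction(target, rate_low, rate_high):
    """Fraction of frames to move from the lower composite rate to the higher one.

    Assumption: the achieved average rate varies linearly with this fraction.
    """
    if not rate_low <= target <= rate_high:
        raise ValueError("target must lie between the two composite rates")
    if rate_high == rate_low:
        return 0.0
    return (target - rate_low) / (rate_high - rate_low)


def average_rate(fraction, rate_low, rate_high):
    """Average rate obtained when `fraction` of the frames use the higher rate."""
    return (1.0 - fraction) * rate_low + fraction * rate_high


def frames_to_reassign(fraction, total_frames):
    """Whole number of frames to reassign out of `total_frames`."""
    return round(fraction * total_frames)


if __name__ == "__main__":
    # Hypothetical composite rates (kbps) surrounding a hypothetical target.
    f = reallocation_fraction(4.8, 4.0, 6.0)
    print(f, average_rate(f, 4.0, 6.0), frames_to_reassign(f, 100))
```

    Because the two composite rates bracket the target, the fraction always falls between 0 and 1, which matches the abstract's remark that choosing the rate on one side implicitly fixes the rate on the other side.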

    Method and apparatus for subsampling phase spectrum information
    53.
    Type: Granted patent
    Legal status: In force

    Publication No.: US06397175B1

    Publication date: 2002-05-28

    Application No.: US09356491

    Filing date: 1999-07-19

    Applicant: Sharath Manjunath

    Inventor: Sharath Manjunath

    IPC classification: G10L11/04

    Abstract: A method and apparatus for subsampling phase spectrum information includes a speech coder for analyzing and reconstructing a prototype of a frame by using intelligent subsampling of phase spectrum information of the prototype. To analyze the prototype, the speech coder produces phase parameters of a reference prototype, generates phase parameters of a current prototype, and correlates the phase parameters of the current prototype with the phase parameters of the reference prototype in multiple frequency bands. To reconstruct the prototype using linear phase shift values, the speech coder produces phase parameters of the reference prototype, generates a set of linear phase shift values associated with the prototype, and composes a phase vector from the phase parameters and the linear phase shift values across multiple frequency bands. To reconstruct the prototype using circular rotation values, the speech coder produces a set of circular rotation values associated with the prototype, generates a set of bandpass waveforms in multiple frequency bands, the bandpass waveforms being associated with the phase parameters of the reference prototype, and modifies the bandpass waveforms based upon the circular rotation values.

    Method and apparatus for interleaving line spectral information quantization methods in a speech coder
    54.
    Type: Granted patent
    Legal status: In force

    Publication No.: US06393394B1

    Publication date: 2002-05-21

    Application No.: US09356755

    Filing date: 1999-07-19

    IPC classification: G10L21/00

    Abstract: A method and apparatus for interleaving line spectral information quantization methods in a speech coder includes quantizing line spectral information with two vector quantization techniques, the first technique being a non-moving-average prediction-based technique, and the second technique being a moving-average prediction-based technique. A line spectral information vector is vector quantized with the first technique. Equivalent moving average codevectors for the first technique are computed. A memory of a moving average codebook of codevectors is updated with the equivalent moving average codevectors for a predefined number of frames that were previously processed by the speech coder. A target quantization vector for the second technique is calculated based on the updated moving average codebook memory. The target quantization vector is vector quantized with the second technique to generate a quantized target codevector. The memory of the moving average codebook is updated with the quantized target codevector. Quantized line spectral information vectors are derived from the quantized target codevector.

    Video coding with fine granularity scalability using cycle-aligned fragments
    55.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08233544B2

    Publication date: 2012-07-31

    Application No.: US11776679

    Filing date: 2007-07-12

    IPC classification: H04N7/12 H04N11/02

    CPC classification: H04N19/34

    Abstract: The disclosure describes fine granularity scalability (FGS) video coding techniques that use cycle-aligned fragments (CAFs). The techniques may perform cycle-based coding of FGS video data block coefficients and syntax elements, and encapsulate cycles in fragments for transmission. The fragments may be cycle-aligned such that a start of a payload of each of the fragments substantially coincides with a start of one of the cycles. In this manner, cycles can be readily accessed via individual fragments. Some cycles may be controlled with a vector mode to scan to a predefined position within a block before moving to another block. In this manner, the number of cycles can be reduced, reducing the number of fragments and associated overhead. The CAFs may be entropy coded independently of one another so that each fragment may be readily accessed and decoded without waiting for decoding of other fragments. Independent entropy coding may permit parallel decoding and simultaneous processing of fragments.
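
    The cycle alignment described in this abstract can be illustrated with a minimal sketch. It assumes that each coded cycle is already available as a byte string and that fragments have a maximum payload size; the greedy packing rule, the `pack_cycles` name, and the `max_payload` parameter are hypothetical and are not taken from the disclosure.

```python
def pack_cycles(cycles, max_payload):
    """Group coded cycles into fragments without ever splitting a cycle.

    Every fragment payload therefore starts at the start of a cycle.
    A cycle larger than `max_payload` is placed alone in its own fragment.
    """
    fragments = []
    current = []
    size = 0
    for cycle in cycles:
        # Close the current fragment if this cycle would overflow it.
        if current and size + len(cycle) > max_payload:
            fragments.append(current)
            current = []
            size = 0
        current.append(cycle)
        size += len(cycle)
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    # Hypothetical coded cycles of 3, 2, 4 and 1 bytes.
    coded_cycles = [b"aaa", b"bb", b"cccc", b"d"]
    for fragment in pack_cycles(coded_cycles, max_payload=5):
        print(fragment)
```

    Since no cycle straddles a fragment boundary in this sketch, each fragment could be handed to a separate decoder, which is the property the abstract relies on for independent entropy coding.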

    Sub-sampled excitation waveform codebooks
    56.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07698132B2

    Publication date: 2010-04-13

    Application No.: US10322245

    Filing date: 2002-12-17

    IPC classification: G10L19/00 G10L19/12 G10L21/02

    Abstract: Methods and apparatus are presented for reducing the number of bits needed to represent an excitation waveform. An acoustic signal in an analysis frame is analyzed to determine whether it is a band-limited signal. A sub-sampled sparse codebook is used to generate the excitation waveform if the acoustic signal is a band-limited signal. The sub-sampled sparse codebook is generated by decimating permissible pulse locations from the codebook track in accordance with the frequency characteristic of the acoustic signal.
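
    The bit saving from decimating a codebook track can be illustrated with a minimal sketch. It assumes a track is a list of permissible pulse positions and that decimation keeps every `factor`-th position; the track layout, the decimation factor, and the function names are hypothetical, and how the factor follows from the signal's frequency characteristic is not modeled here.

```python
import math


def decimate_track(positions, factor):
    """Keep every `factor`-th permissible pulse position of a track."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return positions[::factor]


def bits_per_pulse(positions):
    """Bits needed to index one pulse position within the track."""
    if len(positions) <= 1:
        return 0
    return math.ceil(math.log2(len(positions)))


if __name__ == "__main__":
    # Hypothetical track: 8 permissible positions spaced 5 samples apart.
    track = list(range(0, 40, 5))
    sub_sampled = decimate_track(track, 2)
    print(len(track), bits_per_pulse(track))
    print(len(sub_sampled), bits_per_pulse(sub_sampled))
```

    Halving the number of permissible positions saves one bit per pulse in this sketch, which is the kind of reduction the abstract targets for band-limited signals.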

    Variable rate speech coding
    57.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07496505B2

    Publication date: 2009-02-24

    Application No.: US11559274

    Filing date: 2006-11-13

    Abstract: A method and apparatus for the variable rate coding of a speech signal. An input speech signal is classified and an appropriate coding mode is selected based on this classification. For each classification, the coding mode that achieves the lowest bit rate with an acceptable quality of speech reproduction is selected. Low average bit rates are achieved by employing high-fidelity modes (i.e., high bit rate, broadly applicable to different types of speech) only during portions of the speech where this fidelity is required for acceptable output. Lower bit rate modes are used during portions of speech where these modes produce acceptable output. The input speech signal is classified into active and inactive regions. Active regions are further classified into voiced, unvoiced, and transient regions. Various coding modes are applied to active speech, depending upon the required level of fidelity. Coding modes may be utilized according to the strengths and weaknesses of each particular mode. The apparatus dynamically switches between these modes as the properties of the speech signal vary with time. Where appropriate, regions of speech are modeled as pseudo-random noise, resulting in a significantly lower bit rate. This coding is used in a dynamic fashion whenever unvoiced speech or background noise is detected.
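
    The classify-then-select flow described in this abstract can be illustrated with a minimal sketch. The features (frame energy, zero-crossing rate, previous frame energy), the thresholds, and the mode labels are all hypothetical stand-ins; the abstract does not specify how frames are classified or which coder serves each class, and the voiced mapping in particular is an assumption.

```python
from enum import Enum


class FrameClass(Enum):
    INACTIVE = "inactive"
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    TRANSIENT = "transient"


# Hypothetical mapping from class to coding mode, ordered by intent only.
CODING_MODE = {
    FrameClass.INACTIVE: "pseudo-random noise model, lowest rate",
    FrameClass.UNVOICED: "pseudo-random noise model, low rate",
    FrameClass.VOICED: "reduced-rate mode",
    FrameClass.TRANSIENT: "high-fidelity mode, highest rate",
}


def classify_frame(energy, zero_crossing_rate, prev_energy,
                   silence_thresh=1e-4, zcr_thresh=0.3, onset_ratio=4.0):
    """Toy classifier; every threshold here is an illustrative assumption."""
    if energy < silence_thresh:
        return FrameClass.INACTIVE
    if energy > onset_ratio * prev_energy:
        return FrameClass.TRANSIENT  # sharp energy rise, e.g. a speech onset
    if zero_crossing_rate > zcr_thresh:
        return FrameClass.UNVOICED   # noise-like frame
    return FrameClass.VOICED


def select_mode(energy, zero_crossing_rate, prev_energy):
    """Pick the coding mode for one frame from its class."""
    return CODING_MODE[classify_frame(energy, zero_crossing_rate, prev_energy)]


if __name__ == "__main__":
    print(select_mode(0.5, 0.1, 0.4))
```

    Calling `select_mode` once per frame mirrors the dynamic mode switching in the abstract: the high-rate mode is used only where the classifier says it is needed, which keeps the average rate low.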

    COMPLEXITY-ADAPTIVE 2D-TO-3D VIDEO SEQUENCE CONVERSION
    59.
    Type: Patent application
    Legal status: Lapsed

    Publication No.: US20080150945A1

    Publication date: 2008-06-26

    Application No.: US11615393

    Filing date: 2006-12-22

    IPC classification: G06T15/10

    Abstract: Techniques are described for complexity-adaptive, automatic two-dimensional (2D) to three-dimensional (3D) image and video conversion, in which a frame of a 2D input is classified into one of a flat image class and a non-flat image class. A frame in the flat image class is converted directly into 3D stereo for display. A frame classified as non-flat is further processed automatically and adaptively, based on complexity, to create a depth map estimate. Thereafter, the non-flat frame is converted into a 3D stereo image using the depth map estimate or an adjusted depth map. The adjusted depth map is processed based on the complexity.
