LOW COMPLEXITY DECODER FOR COMPLEX TRANSFORM CODING OF MULTI-CHANNEL SOUND
    91.
    Patent application (status: in force)

    Publication No.: US20080319739A1

    Publication date: 2008-12-25

    Application No.: US11767457

    Filing date: 2007-06-22

    IPC classification: G10L19/00 G10L21/00 H04R5/00

    CPC classification: G10L19/008

    Abstract: A multi-channel audio decoder provides reduced-complexity processing to reconstruct multi-channel audio from an encoded bitstream in which the multi-channel audio is represented as a coded subset of the channels along with a complex channel correlation matrix parameterization. The decoder translates the complex channel correlation matrix parameterization to a real transform that satisfies the magnitude of the complex channel correlation matrix. The multi-channel audio is derived from the coded subset of channels via channel extension processing using a real-valued effect signal and real-number scaling.


    ENTROPY CODING BY ADAPTING CODING BETWEEN LEVEL AND RUN LENGTH/LEVEL MODES
    92.
    Patent application (status: in force)

    Publication No.: US20080228476A1

    Publication date: 2008-09-18

    Application No.: US12127707

    Filing date: 2008-05-27

    IPC classification: G10L19/00

    Abstract: An audio encoder performs adaptive entropy encoding of audio data. For example, an audio encoder switches between variable-dimension vector Huffman coding of direct levels of quantized audio data and run-level coding of run lengths and levels of quantized audio data. The encoder can use, for example, context-based arithmetic coding for coding run lengths and levels. The encoder can determine when to switch between coding modes by counting consecutive coefficients having a predominant value (e.g., zero). An audio decoder performs corresponding adaptive entropy decoding.
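The mode switch described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it only tokenizes coefficients and leaves out the Huffman and arithmetic coding stages entirely. The function names, the token tuples, and the `switch_after` threshold are assumptions made for illustration, and the switch is one-way (direct levels to run-level) for simplicity.

```python
def encode_adaptive(coeffs, switch_after=3):
    """Tokenize quantized coefficients, switching modes on a run of zeros.

    Hypothetical sketch: starts in direct-level mode and moves to run-level
    mode once `switch_after` consecutive zeros have been seen.
    """
    tokens = []
    zeros_seen = 0  # consecutive zeros observed while in direct mode
    i = 0
    n = len(coeffs)

    # Direct mode: one token per coefficient.
    while i < n and zeros_seen < switch_after:
        value = coeffs[i]
        tokens.append(("level", value))
        zeros_seen = zeros_seen + 1 if value == 0 else 0
        i += 1

    # Run-level mode: one token per (zero run, nonzero level) pair.
    run = 0
    while i < n:
        value = coeffs[i]
        if value == 0:
            run += 1
        else:
            tokens.append(("run_level", run, value))
            run = 0
        i += 1
    if run:
        # Trailing zeros: a level of 0 marks "run with no level after it".
        tokens.append(("run_level", run, 0))
    return tokens


def decode_adaptive(tokens):
    """Invert encode_adaptive. The tokens carry their mode explicitly here;
    a real decoder would instead track the same zero count as the encoder."""
    coeffs = []
    for token in tokens:
        if token[0] == "level":
            coeffs.append(token[1])
        else:
            _, run, level = token
            coeffs.extend([0] * run)
            if level != 0:
                coeffs.append(level)
    return coeffs
```

Run-level tokens pay off once the data is mostly zeros, which is why the count of consecutive zeros is a natural trigger for the switch.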


    Shape and scale parameters for extended-band frequency coding
    93.
    Patent application (status: in force)

    Publication No.: US20070174063A1

    Publication date: 2007-07-26

    Application No.: US11336618

    Filing date: 2006-01-20

    IPC classification: G10L19/00

    CPC classification: G10L21/038

    Abstract: An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.


    Complex-transform channel coding with extended-band frequency coding
    94.
    Patent application (status: in force)

    Publication No.: US20070174062A1

    Publication date: 2007-07-26

    Application No.: US11336606

    Filing date: 2006-01-20

    IPC classification: G10L21/00

    CPC classification: G10L21/038 G10L19/008

    Abstract: An audio encoder receives multi-channel audio data comprising a group of plural source channels and performs channel extension coding, which comprises encoding a combined channel for the group and determining plural parameters for representing individual source channels of the group as modified versions of the encoded combined channel. The encoder also performs frequency extension coding. The frequency extension coding can comprise, for example, partitioning frequency bands in the multi-channel audio data into a baseband group and an extended band group, and coding audio coefficients in the extended band group based on audio coefficients in the baseband group. The encoder also can perform other kinds of transforms. An audio decoder performs corresponding decoding and/or additional processing tasks, such as a forward complex transform.


    Complex transforms for multi-channel audio
    95.
    Patent application (status: in force)

    Publication No.: US20070172071A1

    Publication date: 2007-07-26

    Application No.: US11336403

    Filing date: 2006-01-20

    IPC classification: H04R5/00

    CPC classification: H04S3/008

    Abstract: An audio encoder encodes a combined channel (e.g., a sum channel) for a group of plural physical audio channels. The encoder determines plural parameters for representing individual physical channels of the group as modified versions of the encoded combined channel. The plural parameters comprise ratios of power in each individual channel to power in the combined channel (e.g., a ratio of the power of a right channel to the power of the combined channel, and a ratio of the power of the left channel to the power of the combined channel). The plural parameters can include a complex parameter. The combined channel and the plural parameters facilitate reconstruction at the audio decoder of source channels. An audio decoder performs a forward complex transform on the multi-channel audio data and reconstructs plural channels from the multi-channel audio data. The decoder can maintain second-order statistics for the source channels.
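The power-ratio parameters named in this abstract can be shown with a short sketch, assuming a two-channel group, a sum channel as the combined channel, and a single band of complex transform coefficients. The function names are hypothetical, and the complex (cross-channel) parameter that the abstract also mentions is omitted, so the reconstruction below matches each channel's power but not its phase relationship.

```python
def channel_power_ratios(left, right):
    """Return (combined, left_ratio, right_ratio) for one band.

    `left` and `right` are equal-length sequences of complex coefficients.
    The combined channel is assumed to be the plain sum of the two.
    """
    combined = [l + r for l, r in zip(left, right)]

    def power(channel):
        return sum(abs(c) ** 2 for c in channel)

    combined_power = power(combined)
    if combined_power == 0:
        raise ValueError("combined channel has zero power in this band")
    return combined, power(left) / combined_power, power(right) / combined_power


def reconstruct_by_scaling(combined, ratio):
    """Scale the combined channel by a real factor so that its power equals
    ratio * (combined power), i.e. the original channel's power."""
    scale = ratio ** 0.5
    return [scale * c for c in combined]
```

Scaling by the square root of the ratio preserves each channel's power (a second-order statistic), which is the property the abstract attributes to the decoder.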


    Frequency segmentation to obtain bands for efficient coding of digital media
    97.
    Patent application (status: in force)

    Publication No.: US20070016412A1

    Publication date: 2007-01-18

    Application No.: US11183087

    Filing date: 2005-07-15

    IPC classification: G10L19/02

    CPC classification: G10L19/0208 G10L19/24

    Abstract: Frequency segmentation is important to the quality of encoding spectral data. Segmentation involves breaking the spectral data into units called sub-bands or vectors. Homogeneous segmentation may be suboptimal. Various features are described for providing segmentation that depends on the intensity of the spectral data. Finer segmentation is provided for regions of greater spectral variance and coarser segmentation is provided for more homogeneous regions. Sub-bands that have similar characteristics may be merged with very little effect on quality, whereas sub-bands with highly variable data may be better represented if a sub-band is split. Various methods are described for measuring tonality, energy, or shape of a sub-band. These measurements are discussed in light of deciding when to split or merge sub-bands to provide variable frequency segmentation.
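The split-and-merge idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the application: population variance stands in for the tonality, energy, and shape measures, and the function name, the thresholds `split_var` and `merge_var`, and the halving rule are all hypothetical.

```python
from statistics import pvariance


def segment_spectrum(spectrum, start_width=4, split_var=1.0, merge_var=0.1):
    """Return sub-bands as (start, end) index pairs, end exclusive.

    Starts from uniform bands, halves any band whose variance exceeds
    `split_var`, then merges neighbours whose union stays below `merge_var`.
    """
    if not spectrum:
        return []
    n = len(spectrum)
    uniform = [(s, min(s + start_width, n)) for s in range(0, n, start_width)]

    # Split pass: a stack, pushed right half first, keeps bands in order.
    split = []
    stack = list(reversed(uniform))
    while stack:
        start, end = stack.pop()
        if end - start >= 2 and pvariance(spectrum[start:end]) > split_var:
            mid = (start + end) // 2
            stack.append((mid, end))
            stack.append((start, mid))
        else:
            split.append((start, end))

    # Merge pass: join a band to its predecessor if the union is homogeneous.
    merged = [split[0]]
    for start, end in split[1:]:
        prev_start, _ = merged[-1]
        if pvariance(spectrum[prev_start:end]) < merge_var:
            merged[-1] = (prev_start, end)
        else:
            merged.append((start, end))
    return merged
```

On a spectrum with a flat region followed by a rapidly alternating one, the flat region collapses into one wide band while the alternating region is cut down to single-coefficient bands.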


    Text detection in continuous tone image segments
    98.
    Granted patent (status: lapsed)

    Publication No.: US07085420B2

    Publication date: 2006-08-01

    Application No.: US10186887

    Filing date: 2002-06-28

    Applicant: Sanjeev Mehrotra

    Inventor: Sanjeev Mehrotra

    IPC classification: G06K9/36 G06K9/34

    CPC classification: G06T9/00

    Abstract: For encoding of mixed-mode images containing text and continuous-tone content, the pixels in the image that form the text content are detected and separated. Text detection classifies pixels as text or continuous-tone content by accumulating pixel counts for groups of contiguous, non-smooth pixels with the same color. Groups whose pixel count exceeds a threshold are classified as text. The text detection technique further reduces classification errors by testing whether a group's boundary dimensions and pixel density are characteristic of long straight lines or large borders. The text detection technique further searches the neighborhood of groups qualifying as text for pixels of the same color, so as to also detect pixels for isolated text marks like dots, accents or punctuation. The separated text and continuous-tone content can be encoded separately for efficient compression while preserving text quality, and the text is superimposed again on the continuous-tone content at decompression.
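The grouping-and-threshold step of this abstract can be sketched as a flood fill over same-colored pixels. The sketch rests on several assumptions: the smoothness test is omitted (every pixel is a candidate), 4-connectivity is used, and a single bounding-box limit, `max_extent`, stands in for the boundary-dimension and density checks. The neighborhood search for isolated marks such as dots and accents is not implemented. All names and thresholds are hypothetical.

```python
from collections import deque


def detect_text(image, min_count=3, max_extent=20):
    """Return a grid of booleans marking pixels classified as text.

    `image` is a list of rows of hashable color values. A group of connected
    same-color pixels is text if it has more than `min_count` pixels and its
    bounding box is no larger than `max_extent` in either dimension.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    is_text = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            color = image[y][x]
            group = []
            queue = deque([(y, x)])
            seen[y][x] = True
            # Breadth-first flood fill over 4-connected same-color pixels.
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))

            rows = [p[0] for p in group]
            cols = [p[1] for p in group]
            extent = max(max(rows) - min(rows), max(cols) - min(cols)) + 1
            if len(group) > min_count and extent <= max_extent:
                for gy, gx in group:
                    is_text[gy][gx] = True
    return is_text
```

Continuous-tone regions rarely contain large runs of exactly equal colors, so their groups stay below the count threshold, while a glyph drawn in one solid color forms a single large group.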
