Efficient rate control techniques for video encoding
    2.
    Granted invention patent
    Status: In force

    Publication No.: US07606427B2

    Publication date: 2009-10-20

    Application No.: US11019331

    Filing date: 2004-12-21

    IPC classification: G06K9/36 G06K9/46

    Abstract: This disclosure describes rate control techniques that can improve video encoding. In particular, the described rate control techniques exploit relationships between the number of bits encoded per frame and the number of non-zero coefficients of the video blocks after quantization. The number of non-zero coefficients of the video blocks after quantization is referred to as rho (ρ). The value of ρ is generally proportional to the number of bits used in the video encoding. This disclosure utilizes a relationship between ρ and a quantization parameter (QP) in order to achieve rate controlled video encoding. More specifically, this disclosure provides techniques for generating a lookup table (LUT) that maps values of ρ to different QPs.
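
The ρ-to-QP relationship in this abstract can be illustrated with a minimal sketch. It is not the patented method: the step-size model `toy_step`, the dead-zone rule (a coefficient survives when its magnitude is at least one step), and all function names are assumptions made for illustration. The table is built as QP → ρ and searched in the other direction to find a QP for a target ρ.

```python
def toy_step(qp):
    # Hypothetical step-size model, chosen only to keep the example readable.
    return 2 * qp

def build_rho_lut(coeffs, qps, step_for_qp=toy_step):
    """Map each QP to rho, the number of coefficients that stay non-zero
    after quantization (assumed rule: |c| >= step survives)."""
    lut = {}
    for qp in qps:
        step = step_for_qp(qp)
        lut[qp] = sum(1 for c in coeffs if abs(c) >= step)
    return lut

def target_rho_from_bits(bit_budget, bits_per_nonzero):
    # Uses the stated proportionality between rho and the number of bits.
    return int(bit_budget // bits_per_nonzero)

def pick_qp(lut, target_rho):
    """Return the smallest QP whose rho does not exceed the target."""
    for qp in sorted(lut):
        if lut[qp] <= target_rho:
            return qp
    return max(lut)  # no QP meets the target, so fall back to the coarsest

coeffs = [0, 1, 3, -5, 8, -13, 21]
lut = build_rho_lut(coeffs, range(1, 13))
qp = pick_qp(lut, target_rho_from_bits(bit_budget=30, bits_per_nonzero=10))
```

With these toy inputs the target ρ is 3, and the search returns the finest QP that leaves at most three non-zero coefficients.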

    Interactive speech recognition apparatus and method with conditioned voice prompts
    3.
    Granted invention patent
    Status: In force

    Publication No.: US07328159B2

    Publication date: 2008-02-05

    Application No.: US10050378

    Filing date: 2002-01-15

    IPC classification: G10L21/00 G10L15/20 G10L13/00

    CPC classification: G10L15/22 G10L25/78

    Abstract: An improved interactive voice recognition system (400) includes a voice prompt generator (401) for generating a voice prompt in a first frequency band (501). A speech detector (406) detects presence of speech energy in a second frequency band (502). The first and second frequency bands (501, 502) are essentially conjugate frequency bands. A voice data generator (412) generates voice data based on an output of the voice prompt generator (401) and audible speech of a voice response generator (402). A control signal (422) controls the voice prompt generator (401) based on whether the speech detector (406) detects presence of speech energy in the second frequency band (502). A back end (405) of the interactive voice recognition system (400) is configured to operate on an extracted front end voice feature based on whether the speech detector (406) detects presence of speech energy in the second frequency band (502).
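
The control path in this abstract can be sketched as follows, assuming that "conjugate" bands means two non-overlapping bands, that speech is detected by thresholding DFT energy, and that the function names and the threshold are hypothetical. The prompt occupies the first band, so any energy in the second band is attributed to the user.

```python
import math

def band_energy(samples, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins with f_lo <= frequency < f_hi."""
    n = len(samples)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq < f_hi:
            re = sum(s * math.cos(2 * math.pi * k * i / n)
                     for i, s in enumerate(samples))
            im = -sum(s * math.sin(2 * math.pi * k * i / n)
                      for i, s in enumerate(samples))
            energy += re * re + im * im
    return energy

def prompt_should_stop(mic_frame, sample_rate, speech_band, threshold):
    """Control signal: True when energy appears in the band the prompt avoids."""
    return band_energy(mic_frame, sample_rate, *speech_band) > threshold
```

A direct DFT is used only to keep the sketch dependency-free; a real front end would more likely use a filter bank or an FFT.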

    Intensity compensation techniques in video processing
    4.
    Granted invention patent
    Status: In force

    Publication No.: US08599920B2

    Publication date: 2013-12-03

    Application No.: US12185889

    Filing date: 2008-08-05

    IPC classification: H04N7/12

    Abstract: Techniques for intensity compensation in video processing are provided. In one configuration, a wireless communication device (e.g., a cellular phone) compliant with the VC1-SMPTE standard comprises a processor that is configured to execute instructions operative to reconstruct reference frames from a received video bitstream. A non-intensity-compensated copy of a reference frame of the bitstream is stored in a memory of the device and used for defining the displayable images and for on-the-fly generation of a stream of intensity-compensated pixels to perform motion compensation calculations for frames of the video bitstream.
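
The idea of keeping a single uncompensated reference frame and producing compensated pixels on demand can be sketched as below. The linear `scale`/`shift` map is a stand-in and not the VC-1 formula, and both function names are hypothetical.

```python
def build_ic_lut(scale, shift):
    """Hypothetical linear intensity map for 8-bit samples, clipped to 0..255."""
    return [min(255, max(0, round(scale * v + shift))) for v in range(256)]

def compensated_block(ref_frame, lut, top, left, height, width):
    """Yield intensity-compensated rows of a reference block.

    ref_frame is never modified, so the same stored copy can still be
    used for display."""
    for y in range(top, top + height):
        yield [lut[ref_frame[y][x]] for x in range(left, left + width)]

ref = [[0, 100, 200],
       [50, 150, 250]]
lut = build_ic_lut(scale=0.5, shift=10)
block = list(compensated_block(ref, lut, top=0, left=1, height=2, width=2))
```

Because the block is generated lazily, only the pixels that motion compensation actually reads are transformed, and no second compensated frame is held in memory.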

    Two pass rate control techniques for video coding using a min-max approach
    6.
    Granted invention patent
    Status: In force

    Publication No.: US08379721B2

    Publication date: 2013-02-19

    Application No.: US11303618

    Filing date: 2005-12-15

    IPC classification: H04N11/04

    Abstract: This disclosure describes rate control techniques that can improve video coding based on a “two-pass” approach. The first pass codes a video sequence using a first set of quantization parameters (QPs) for the purpose of estimating rate-distortion characteristics of the video sequence based on the statistics of the first pass. A second set of QPs can then be defined for a second coding pass. The estimated rate-distortion characteristics of the first pass are used to select QPs for the second pass in a manner that minimizes quality fluctuation between the frames of the video sequence. Furthermore, selection of the second set of QPs may also substantially maximize quality of the frames at the substantially minimized quality fluctuation in order to achieve low average frame distortion with the minimized quality fluctuation.
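
A min-max selection for the second pass can be sketched as follows. The sketch assumes the first pass has already produced, for each frame, a table of estimated (bits, distortion) pairs per QP; the table layout and the function name are hypothetical. It only minimizes the worst per-frame distortion under a bit budget and does not attempt the further step of maximizing quality at that level.

```python
def minmax_qps(rd_tables, bit_budget):
    """rd_tables: one dict per frame mapping qp -> (bits, distortion).

    Returns (qps, cap) for the smallest distortion cap that every frame can
    meet within the budget, or None if no cap is feasible."""
    caps = sorted({d for table in rd_tables for _, d in table.values()})
    for cap in caps:
        choice = []
        for table in rd_tables:
            # Cheapest option for this frame that keeps distortion under the cap.
            feasible = [(bits, qp) for qp, (bits, d) in table.items() if d <= cap]
            if not feasible:
                break
            choice.append(min(feasible)[1])
        else:
            total = sum(t[qp][0] for t, qp in zip(rd_tables, choice))
            if total <= bit_budget:
                return choice, cap
    return None

frame_a = {20: (100, 1.0), 25: (60, 2.0), 30: (30, 4.0)}
frame_b = {20: (200, 3.0), 25: (120, 5.0), 30: (70, 9.0)}
result = minmax_qps([frame_a, frame_b], bit_budget=200)
```

Scanning the caps in ascending order means the first feasible cap is also the smallest, which is what bounds the quality gap between frames.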

    Electronic video image stabilization
    7.
    Granted invention patent
    Status: In force

    Publication No.: US07840085B2

    Publication date: 2010-11-23

    Application No.: US11487078

    Filing date: 2006-07-14

    IPC classification: G06K9/40

    Abstract: This disclosure describes electronic video image stabilization techniques for imaging and video devices. The techniques involve determining motion and spatial statistics for individual macroblocks of a frame, and determining a global motion vector for the frame based on the statistics of each of the macroblocks. In one embodiment, a method of performing electronic image stabilization includes performing spatial estimation on each of a plurality of macroblocks within a frame of an image to obtain spatial statistics for each of the macroblocks, performing motion estimation on each of the plurality of macroblocks to obtain motion statistics for each of the macroblocks, integrating the spatial statistics and the motion statistics of each of the macroblocks to determine a global motion vector for the frame, and offsetting the image with respect to a reference window according to the global motion vector.
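
One way to integrate per-macroblock statistics into a global motion vector is a weighted average, sketched below. The weighting rule (spatial activity divided by one plus the matching error), the tuple layout, and the sign convention of the window offset are all assumptions, not details taken from the patent.

```python
def global_motion_vector(macroblocks):
    """macroblocks: iterable of (mv_x, mv_y, spatial_activity, sad).

    Textured blocks with a low matching error (SAD) get the most weight."""
    wx = wy = wsum = 0.0
    for mv_x, mv_y, activity, sad in macroblocks:
        weight = activity / (1.0 + sad)
        wx += weight * mv_x
        wy += weight * mv_y
        wsum += weight
    if wsum == 0:
        return (0, 0)
    return (round(wx / wsum), round(wy / wsum))

def offset_window(window_origin, gmv):
    """Shift the reference (crop) window by the global motion vector."""
    return (window_origin[0] + gmv[0], window_origin[1] + gmv[1])

blocks = [(4, -2, 10.0, 0.0), (4, -2, 10.0, 0.0), (40, 40, 0.0, 500.0)]
gmv = global_motion_vector(blocks)
origin = offset_window((16, 16), gmv)
```

In this example the flat, poorly matched third block has zero weight, so its outlier vector does not disturb the estimate.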

    Method for robust voice recognition by analyzing redundant features of source signal
    8.
    Granted invention patent
    Status: In force

    Publication No.: US06957183B2

    Publication date: 2005-10-18

    Application No.: US10104178

    Filing date: 2002-03-20

    IPC classification: G10L15/02 G10L15/20 G10L15/00

    CPC classification: G10L15/02 G10L15/20

    Abstract: A method for processing digitized speech signals by analyzing redundant features to provide more robust voice recognition. A primary transformation is applied to a source speech signal to extract primary features therefrom. Each of at least one secondary transformation is applied to the source speech signal or extracted primary features to yield at least one set of secondary features statistically dependent on the primary features. At least one predetermined function is then applied to combine the primary features with the secondary features. A recognition answer is generated by pattern matching this combination against predetermined voice recognition templates.
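
The pipeline in this abstract (primary features, dependent secondary features, a combining function, template matching) can be sketched with toy choices for each stage. Frame energy as the primary transformation, frame-to-frame deltas as the secondary one, concatenation as the combining function, and nearest-template matching are all assumptions for illustration.

```python
def primary_features(signal, frame_len):
    """Hypothetical primary transformation: mean energy per frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def secondary_features(primary):
    """Hypothetical secondary transformation applied to the primary
    features: frame-to-frame differences, which depend on them."""
    return [b - a for a, b in zip(primary, primary[1:])]

def combine(primary, secondary):
    """Hypothetical combining function: simple concatenation."""
    return primary + secondary

def recognize(signal, templates, frame_len=4):
    """Return the name of the template closest to the combined features."""
    p = primary_features(signal, frame_len)
    feats = combine(p, secondary_features(p))

    def distance(template):
        return sum((a - b) ** 2 for a, b in zip(feats, template))

    return min(templates, key=lambda name: distance(templates[name]))

templates = {"rising": [1, 4, 3], "falling": [4, 1, -3]}
answer = recognize([1, 1, 1, 1, 2, 2, 2, 2], templates)
```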

    Video coding with fine granularity scalability using cycle-aligned fragments
    9.
    Granted invention patent
    Status: In force

    Publication No.: US08233544B2

    Publication date: 2012-07-31

    Application No.: US11776679

    Filing date: 2007-07-12

    IPC classification: H04N7/12 H04N11/02

    CPC classification: H04N19/34

    Abstract: The disclosure describes fine granularity scalability (FGS) video coding techniques that use cycle-aligned fragments (CAFs). The techniques may perform cycle-based coding of FGS video data block coefficients and syntax elements, and encapsulate cycles in fragments for transmission. The fragments may be cycle-aligned such that a start of a payload of each of the fragments substantially coincides with a start of one of the cycles. In this manner, cycles can be readily accessed via individual fragments. Some cycles may be controlled with a vector mode to scan to a predefined position within a block before moving to another block. In this manner, the number of cycles can be reduced, reducing the number of fragments and associated overhead. The CAFs may be entropy coded independently of one another so that each fragment may be readily accessed and decoded without waiting for decoding of other fragments. Independent entropy coding may permit parallel decoding and simultaneous processing of fragments.
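
Cycle alignment can be illustrated with a small packing routine in which every fragment begins at the start of a cycle. The greedy grouping rule, the `max_payload` limit, and the function name are assumptions; the sketch leaves out entropy coding and the vector mode.

```python
def cycle_aligned_fragments(cycles, max_payload):
    """Group consecutive cycles (bytes objects) into fragments.

    A cycle is never split across fragments, so each fragment payload
    starts at a cycle boundary. An oversized cycle gets its own fragment."""
    fragments, current, size = [], [], 0
    for cycle in cycles:
        if current and size + len(cycle) > max_payload:
            fragments.append(current)
            current, size = [], 0
        current.append(cycle)
        size += len(cycle)
    if current:
        fragments.append(current)
    return fragments

cycles = [b"aaaa", b"bb", b"cccccc", b"d"]
fragments = cycle_aligned_fragments(cycles, max_payload=6)
```

Because no fragment begins mid-cycle, a decoder can start reading any fragment without state carried over from the previous one, which is the property that makes per-fragment access possible.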

    Video encoding
    10.
    Granted invention patent
    Status: Not in force

    Publication No.: US08208548B2

    Publication date: 2012-06-26

    Application No.: US11351911

    Filing date: 2006-02-09

    IPC classification: H04B1/66

    Abstract: An embodiment is directed to a method for selecting a predictive macroblock partition from a plurality of candidate macroblock partitions in motion estimation and compensation in a video encoder. The method includes determining a bit rate signal for each of the candidate macroblock partitions, generating a distortion signal for each of the candidate macroblock partitions, calculating a cost for each of the candidate macroblock partitions based on respective bit rate and distortion signals to produce a plurality of costs, and determining a motion vector from the costs. The motion vector designates the predictive macroblock partition.
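
The cost-based selection can be sketched with the common Lagrangian form J = D + λR. The abstract only says the cost is based on the bit rate and distortion signals, so this formula, the multiplier `lam`, and the tuple layout are assumptions.

```python
def select_partition(candidates, lam):
    """candidates: list of (partition, motion_vector, rate_bits, distortion).

    Returns (partition, motion_vector, cost) for the lowest-cost candidate."""
    def cost(candidate):
        _, _, rate_bits, distortion = candidate
        return distortion + lam * rate_bits

    best = min(candidates, key=cost)
    return best[0], best[1], cost(best)

candidates = [
    ("16x16", (1, 0), 10, 120),
    ("8x8", (2, -1), 40, 60),
    ("16x8", (1, 1), 20, 90),
]
choice = select_partition(candidates, lam=1.0)
```

Raising `lam` penalizes rate more heavily, which pushes the choice toward larger partitions that need fewer bits for motion vectors.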
