Slice-layer in video codec
    61.
    Patent application
    Slice-layer in video codec (in force)

    Publication No.: US20050053158A1

    Publication date: 2005-03-10

    Application No.: US10933960

    Filing date: 2004-09-03

    Abstract: A video encoder/decoder utilizes a bitstream syntax that provides an independently decodable, partial picture unit, which may be in the form of a unit containing one or more contiguous rows of macroblocks (called a slice). This slice layer provides a flexible combination of error-resilience and compression efficiency. The slice layer encodes an efficient addressing mechanism (e.g., a syntax element specifying a beginning macroblock row of the slice layer), as well as an efficient mechanism to optionally retransmit picture header information. The slice layer provides decoding and reconstruction independence by disabling all forms of prediction, overlap and loop-filtering across slice-boundaries. This permits a slice coded in intra-mode to be reconstructed error-free, irrespective of errors in other regions of the picture.
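
The addressing and independence ideas in this abstract can be illustrated with a minimal Python sketch. It assumes each slice is identified only by its starting macroblock row and that a neighboring row may be used for prediction only when it lies in the same slice. The function names and the list-of-start-rows representation are hypothetical; they are not the bitstream syntax defined by the patent.

```python
from bisect import bisect_right

def slice_index(slice_start_rows, mb_row):
    """Return the index of the slice containing macroblock row mb_row.

    slice_start_rows is a sorted list of starting rows; the first entry is 0.
    """
    return bisect_right(slice_start_rows, mb_row) - 1

def neighbor_available(slice_start_rows, mb_row, neighbor_row):
    """A neighbor row is usable for prediction only inside the same slice."""
    if neighbor_row < 0:
        return False  # above the top of the picture
    return (slice_index(slice_start_rows, mb_row)
            == slice_index(slice_start_rows, neighbor_row))

# Hypothetical picture with 9 macroblock rows split into three slices.
starts = [0, 3, 6]
print(slice_index(starts, 4))            # 1
print(neighbor_available(starts, 4, 3))  # True: both rows are in slice 1
print(neighbor_available(starts, 3, 2))  # False: row 2 is in the previous slice
```

Because row 3 never reads from row 2 in this model, an error confined to the first slice cannot propagate into an intra-coded second slice.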

    Sound source localization using phase spectrum
    62.
    Granted patent
    Sound source localization using phase spectrum (in force)

    Publication No.: US09435873B2

    Publication date: 2016-09-06

    Application No.: US13182449

    Filing date: 2011-07-14

    Abstract: An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
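
One common way to turn a phase measurement from a microphone pair into a candidate angle is sketched below. This is an assumption-laden illustration and not the patent's method: it uses a single DFT bin, a far-field source, a fixed speed of sound, and an angle measured from the broadside direction of the pair. Activity detection and the selection of a final angle from the candidate list are omitted. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value

def dft_bin(frame, k):
    """Single DFT coefficient of a real-valued frame at bin k."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))

def candidate_angle(frame_a, frame_b, k, sample_rate, mic_distance):
    """Angle in radians from broadside, from the phase difference at bin k.

    Assumes the phase difference does not wrap, i.e. the bin frequency is
    below SPEED_OF_SOUND / (2 * mic_distance).
    """
    freq = k * sample_rate / len(frame_a)
    cross = dft_bin(frame_a, k) * dft_bin(frame_b, k).conjugate()
    phase = cmath.phase(cross)              # phase of a minus phase of b
    delay = phase / (2.0 * math.pi * freq)  # seconds by which b lags a
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * delay / mic_distance))
    return math.asin(s)

def simulate_pair(angle, k, n, sample_rate, mic_distance):
    """Synthetic pure-tone frames for a source at the given angle (test helper)."""
    freq = k * sample_rate / n
    delay = mic_distance * math.sin(angle) / SPEED_OF_SOUND
    a = [math.cos(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]
    b = [math.cos(2.0 * math.pi * freq * (i / sample_rate - delay)) for i in range(n)]
    return a, b

a, b = simulate_pair(math.radians(30.0), k=8, n=256, sample_rate=16000, mic_distance=0.1)
estimate = candidate_angle(a, b, k=8, sample_rate=16000, mic_distance=0.1)
print(round(math.degrees(estimate), 3))  # approximately 30 degrees
```

Running the same function over every active pair in a frame would produce the list of candidate angles that the abstract describes.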

    Entropy coding efficiency enhancement utilizing energy distribution remapping
    63.
    Granted patent
    Entropy coding efficiency enhancement utilizing energy distribution remapping (in force)

    Publication No.: US09398314B2

    Publication date: 2016-07-19

    Application No.: US12026534

    Filing date: 2008-02-05

    IPC classification: H04N7/12 H04N19/85

    CPC classification: H04N19/85

    Abstract: Architecture for enhancing the compression (e.g., luma, chroma) of a video signal and improving the perceptual quality of the video compression schemes. The architecture operates to reshape the normal multimodal energy distribution of the input video signal to a new energy distribution. In the context of luma, the algorithm maps the black and white (or contrast) information of a picture to a new energy distribution. For example, the contrast can be enhanced in the middle range of the luma spectrum, thereby improving the contrast between a light foreground object and a dark background. At the same time, the algorithm reduces the bit-rate requirements at a particular quantization step size. The algorithm can also be used in post-processing to improve the quality of decoded video.
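
The remapping idea can be illustrated with a lookup table that stretches mid-range luma. The sketch below is only an illustration under stated assumptions: the blended smoothstep curve, the `strength` parameter, and the nearest-match inverse table are all hypothetical choices, not the mapping claimed in the patent, and the code does not demonstrate any bit-rate effect.

```python
def build_forward_lut(strength=0.5):
    """8-bit luma lookup table with a steeper slope in the mid range.

    Blends the identity curve with a smoothstep curve; both the curve and
    the strength parameter are assumptions made for illustration.
    """
    lut = []
    for v in range(256):
        t = v / 255.0
        s = t * t * (3.0 - 2.0 * t)  # smoothstep: steep in the middle
        y = (1.0 - strength) * t + strength * s
        lut.append(round(255.0 * y))
    return lut

def build_inverse_lut(forward):
    """Approximate inverse: for each output level, the closest input level."""
    return [min(range(256), key=lambda v: abs(forward[v] - y)) for y in range(256)]

def remap(pixels, lut):
    """Apply a lookup table to a flat list of 8-bit luma samples."""
    return [lut[p] for p in pixels]

forward = build_forward_lut()
inverse = build_inverse_lut(forward)

# Mid-range contrast is stretched: inputs 96 and 160 move further apart.
print(forward[160] - forward[96])               # 79, compared with 64 before remapping
print(remap(remap([96, 128, 160], forward), inverse))
```

The forward table would be applied before encoding and the inverse table after decoding; because the forward table is rounded to 8 bits, the round trip is only approximately lossless.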

    ESTIMATING SAMPLE-DOMAIN DISTORTION IN THE TRANSFORM DOMAIN WITH ROUNDING COMPENSATION
    64.
    Patent application
    ESTIMATING SAMPLE-DOMAIN DISTORTION IN THE TRANSFORM DOMAIN WITH ROUNDING COMPENSATION (in force)

    Publication No.: US20120020409A1

    Publication date: 2012-01-26

    Application No.: US13248784

    Filing date: 2011-09-29

    IPC classification: H04N7/26

    Abstract: Techniques and tools are described for compensating for rounding when estimating sample-domain distortion in the transform domain. For example, a video encoder estimates pixel-domain distortion in the transform domain for a block of transform coefficients after compensating for rounding in the DC coefficient of the block. In this way, the video encoder improves the accuracy of pixel-domain distortion estimation but retains the computational advantages of performing the estimation in the transform domain. Rounding compensation includes, for example, looking up an index (from a de-quantized transform coefficient) in a rounding offset table to determine a rounding offset, then adjusting the coefficient by the offset. Other techniques and tools described herein are directed to creating rounding offset tables and encoders that make encoding decisions after considering rounding effects that occur after an inverse frequency transform on de-quantized transform coefficient values.
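
The table lookup described in this abstract can be sketched with a toy model. The model assumes a flat 4x4 block whose reconstruction depends only on the DC coefficient, and an inverse transform that ends with the rounding step `(dc + 4) >> 3`. The shift, block size, table layout, and function names are hypothetical and chosen only to make the effect visible; a real codec's rounding behavior differs.

```python
SHIFT = 3     # assumed: the inverse transform ends with (dc + 4) >> 3
SAMPLES = 16  # assumed: flat 4x4 block, so every sample has the same value

# Rounding offset table indexed by the low SHIFT bits of the de-quantized DC.
# Each entry moves the DC to the value the rounded inverse transform implies.
ROUNDING_OFFSETS = [
    (((i + (1 << (SHIFT - 1))) >> SHIFT) << SHIFT) - i
    for i in range(1 << SHIFT)
]

def compensate_dc(dc):
    """Adjust a de-quantized DC coefficient by its looked-up rounding offset."""
    return dc + ROUNDING_OFFSETS[dc & ((1 << SHIFT) - 1)]

def estimated_ssd(dc_orig, dc_rec):
    """Transform-domain estimate of the sum of squared sample differences."""
    diff = (dc_orig - dc_rec) / (1 << SHIFT)
    return SAMPLES * diff * diff

def sample_domain_ssd(sample_orig, dc_rec):
    """Reference distortion computed after the rounded inverse transform."""
    recon = (dc_rec + (1 << (SHIFT - 1))) >> SHIFT
    return SAMPLES * (sample_orig - recon) ** 2

# Original samples are all 10, so the original DC is 80 in this toy scaling.
dc_rec = 75  # hypothetical de-quantized DC
print(sample_domain_ssd(10, dc_rec))             # 16
print(estimated_ssd(80, dc_rec))                 # 6.25, without compensation
print(estimated_ssd(80, compensate_dc(dc_rec)))  # 16.0, with compensation
```

In this toy model the compensated estimate matches the sample-domain distortion exactly, while the uncompensated one underestimates it, and no inverse transform is needed to get there.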

    NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE
    65.
    Patent application
    NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE (in force)

    Publication No.: US20100280827A1

    Publication date: 2010-11-04

    Application No.: US12433143

    Filing date: 2009-04-30

    IPC classification: G10L15/00

    Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

    Estimating sample-domain distortion in the transform domain with rounding compensation
    69.
    Granted patent
    Estimating sample-domain distortion in the transform domain with rounding compensation (in force)

    Publication No.: US08249145B2

    Publication date: 2012-08-21

    Application No.: US13248784

    Filing date: 2011-09-29

    IPC classification: H04N7/12

    Abstract: Techniques and tools are described for compensating for rounding when estimating sample-domain distortion in the transform domain. For example, a video encoder estimates pixel-domain distortion in the transform domain for a block of transform coefficients after compensating for rounding in the DC coefficient of the block. In this way, the video encoder improves the accuracy of pixel-domain distortion estimation but retains the computational advantages of performing the estimation in the transform domain. Rounding compensation includes, for example, looking up an index (from a de-quantized transform coefficient) in a rounding offset table to determine a rounding offset, then adjusting the coefficient by the offset. Other techniques and tools described herein are directed to creating rounding offset tables and encoders that make encoding decisions after considering rounding effects that occur after an inverse frequency transform on de-quantized transform coefficient values.

    Switching distortion metrics during motion estimation
    70.
    Granted patent
    Switching distortion metrics during motion estimation (in force)

    Publication No.: US08155195B2

    Publication date: 2012-04-10

    Application No.: US11400049

    Filing date: 2006-04-07

    CPC classification: H04N19/567

    Abstract: Techniques and tools for switching distortion metrics during motion estimation are described. For example, a video encoder determines a distortion metric selection criterion for motion estimation. The criterion can be based on initial results of the motion estimation. To evaluate the criterion, the encoder can compare the criterion to a threshold that depends on a current quantization parameter. The encoder selects between multiple available distortion metrics, which can include a sample-domain distortion metric (e.g., SAD) and a transform-domain distortion metric (e.g., SAHD). The encoder uses the selected distortion metric in the motion estimation. Selectively switching between SAD and SAHD provides rate-distortion performance superior to using only SAD or only SAHD. Moreover, due to the lower complexity of SAD, the computational complexity of motion estimation with SAD-SAHD switching is typically less than motion estimation that always uses SAHD.
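
A minimal Python sketch of the two metrics and a switching rule follows. SAD and the 4x4 Hadamard-based SAHD are standard definitions (the SAHD here is unnormalized). The selection rule is an assumption: the abstract says only that a criterion is compared with a threshold that depends on the quantization parameter, so the criterion (an initial SAD), the threshold `scale * qp`, and the direction of the comparison are all hypothetical.

```python
def sad(cur, ref):
    """Sum of absolute differences over two equally sized sample lists."""
    return sum(abs(c - r) for c, r in zip(cur, ref))

def hadamard4(v):
    """Unnormalized 4-point Hadamard transform."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]

def sahd(cur, ref):
    """Sum of absolute Hadamard-transformed differences for a 4x4 block.

    cur and ref are row-major lists of 16 samples.
    """
    diff = [c - r for c, r in zip(cur, ref)]
    rows = [hadamard4(diff[4 * i:4 * i + 4]) for i in range(4)]
    cols = [hadamard4([rows[i][j] for i in range(4)]) for j in range(4)]
    return sum(abs(x) for col in cols for x in col)

def choose_metric(initial_sad, qp, scale=4):
    """Hypothetical rule: keep the cheap SAD when the residual is small
    relative to a QP-dependent threshold, otherwise switch to SAHD."""
    return "SAD" if initial_sad < scale * qp else "SAHD"

def distortion(cur, ref, qp):
    """Return the selected metric name and its value for one 4x4 block."""
    initial = sad(cur, ref)
    if choose_metric(initial, qp) == "SAD":
        return "SAD", initial
    return "SAHD", sahd(cur, ref)

flat_cur, flat_ref = [10] * 16, [9] * 16
print(distortion(flat_cur, flat_ref, qp=8))  # ('SAD', 16)
print(distortion(flat_cur, flat_ref, qp=2))  # ('SAHD', 16)
```

The SAD result is reused as the selection criterion, so blocks that stay on SAD incur no extra cost, which fits the abstract's point that switching is cheaper than always computing SAHD.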