Abstract:
Techniques are described for encoding and decoding digital video data using macroblocks that are larger than the macroblocks prescribed by conventional video encoding and decoding standards. For example, the techniques include encoding and decoding a video stream using macroblocks larger than 16×16 pixels, such as 64×64 pixels. In one example, an apparatus includes a video encoder configured to encode a video block having a size of more than 16×16 pixels, generate block-type syntax information that indicates the size of the block, and generate a coded block pattern value for the encoded block, wherein the coded block pattern value indicates whether the encoded block includes at least one non-zero coefficient. The encoder may set the coded block pattern value to zero when the encoded block includes no non-zero coefficients, or to one when the encoded block includes at least one non-zero coefficient.
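The coded block pattern rule for a large block can be illustrated with a minimal sketch. The function names and the dictionary layout of the block-type syntax are hypothetical and chosen only for illustration; they are not taken from any standard or from the disclosure.

```python
def coded_block_pattern(coeffs):
    """Return 1 if the block holds at least one non-zero coefficient, else 0."""
    return int(any(c != 0 for row in coeffs for c in row))


def block_type_syntax(width, height):
    # Hypothetical syntax element: it only records the block size.
    return {"width": width, "height": height}


# A 64x64 block of quantized coefficients, initially all zero.
block = [[0] * 64 for _ in range(64)]
syntax = block_type_syntax(64, 64)
cbp_empty = coded_block_pattern(block)   # no non-zero coefficient

block[10][3] = -2                         # one non-zero coefficient
cbp_coded = coded_block_pattern(block)
```

With a value of zero, a decoder following this scheme could skip residual parsing for the entire 64×64 block.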
Abstract:
Quantization techniques are used in video coding to quantize residual coefficients. So-called “dead zone parameters” are selected in the quantization process of residual coefficients of residual video blocks. The dead zone is the range of coefficient magnitudes within which any coefficient is quantized to zero. A method and apparatus for quantizing coefficient values of video blocks in a video coding scheme are provided. A quantization parameter is selected for a set of video blocks. Dead zone parameters are then selected for different video blocks in the set of video blocks. Next, the quantization parameter and the dead zone parameters are applied to quantize the coefficient values of each of the video blocks.
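As a rough illustration, the sketch below quantizes two blocks with one shared step size and a different dead zone parameter per block. It assumes the common rounding-offset form of a dead zone quantizer, in which a smaller offset widens the dead zone; the disclosure may define its dead zone parameters differently, and the names `quantize`, `step`, and `offset` are hypothetical.

```python
import math


def quantize(coeffs, step, offset):
    """Dead zone quantizer: level = sign(c) * floor(|c| / step + offset).

    `step` stands in for the step size implied by the shared quantization
    parameter; `offset` (assumed to lie in [0, 0.5]) is the per-block
    dead zone parameter. Magnitudes below (1 - offset) * step map to zero.
    """
    levels = []
    for c in coeffs:
        level = math.floor(abs(c) / step + offset)
        levels.append(level if c >= 0 else -level)
    return levels


coeffs = [4, 6, -7, 12, 25]
narrow_dead_zone = quantize(coeffs, step=10, offset=0.5)  # [0, 1, -1, 1, 3]
wide_dead_zone = quantize(coeffs, step=10, offset=0.2)    # [0, 0, 0, 1, 2]
```

The second block zeroes more small coefficients at the same step size, which trades some distortion for a lower bit rate.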
Abstract:
This disclosure describes various interpolation techniques performed by an encoder and a decoder during the motion compensation process of video coding. In one example, an encoder interpolates pixel values of reference video data based on a plurality of different pre-defined interpolation filters. In this example, the decoder receives a syntax element that identifies an interpolation filter, and interpolates pixel values of reference video data based on the interpolation filter identified by the syntax element. In another example, a method of interpolating predictive video data includes generating half-pixel values based on integer pixel values, rounding the half-pixel values to generate half-pixel interpolated values, storing the half-pixel values as non-rounded versions of the half-pixel values, and generating quarter-pixel values based on the non-rounded versions of the half-pixel values and the integer pixel values.
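The second example, in which quarter-pixel values are built from non-rounded half-pixel values, can be sketched in one dimension. The 6-tap filter (1, -5, 20, 20, -5, 1)/32 is an assumption borrowed from H.264-style interpolation, and the function names are hypothetical; the disclosure may use other filters.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed filter; the taps sum to 32


def clip8(v):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, v))


def half_pel_sum(row, i):
    """Non-rounded half-pixel value between row[i] and row[i + 1], scaled by 32."""
    return sum(t * row[i - 2 + k] for k, t in enumerate(TAPS))


def half_pel(row, i):
    """Rounded half-pixel value, as it would be used for prediction."""
    return clip8((half_pel_sum(row, i) + 16) >> 5)


def quarter_pel(row, i):
    """Quarter-pixel value between row[i] and the half-pixel position.

    It averages the integer pixel (scaled by 32) with the non-rounded
    half-pixel sum, so only one rounding step is applied.
    """
    return clip8((32 * row[i] + half_pel_sum(row, i) + 32) >> 6)


row = [10, 10, 10, 20, 30, 30, 30, 30]
unrounded = half_pel_sum(row, 3)   # 840, i.e. 26.25 before rounding
rounded = half_pel(row, 3)         # 26
quarter = quarter_pel(row, 3)      # 23
```

Keeping the unrounded sum avoids stacking a second rounding error on top of the half-pixel rounding when the quarter-pixel value is formed.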
Abstract:
In general, this disclosure provides techniques for quantization of the coefficients of video blocks in a manner that can achieve a desirable balance of rate and distortion. The described techniques may analyze a plurality of quantization levels associated with each individual coefficient to select the quantization level for the individual coefficients that results in a lowest coding cost. Since CAVLC does not encode each coefficient independently, the techniques may compute the coding costs for each of the candidate quantization levels associated with the individual coefficients based on quantization levels selected for previously quantized coefficients and estimated (or predicted) quantization levels for subsequent coefficients of a coefficient vector. The quantization levels for each of the coefficients are selected based on computed coding costs to obtain a set of quantized coefficients that minimize a rate-distortion model.
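The selection loop can be sketched as follows, under loudly stated assumptions: `toy_rate` is a hypothetical stand-in for a CAVLC bit count (it is not a CAVLC model), subsequent coefficients are predicted by plain rounding, and the cost is the usual distortion plus `lam` times rate. None of these choices is confirmed by the disclosure.

```python
def toy_rate(levels):
    """Hypothetical bit estimate that depends on the whole vector, not a real CAVLC model."""
    nonzero = [abs(v) for v in levels if v != 0]
    return 1 + 2 * len(nonzero) + sum(v.bit_length() for v in nonzero)


def rd_quantize(coeffs, step, lam):
    # Estimated levels for coefficients that have not been decided yet.
    predicted = [int(round(c / step)) for c in coeffs]
    chosen = []
    for i, c in enumerate(coeffs):
        floor_mag = int(abs(c) // step)
        sign = 1 if c >= 0 else -1
        best = None
        # Candidate levels: zero, the floor level, and the ceiling level.
        for mag in sorted({0, floor_mag, floor_mag + 1}):
            cand = sign * mag
            # Cost is computed on: decided levels + candidate + predicted rest.
            vector = chosen + [cand] + predicted[i + 1:]
            distortion = (c - cand * step) ** 2
            cost = distortion + lam * toy_rate(vector)
            if best is None or cost < best[0]:
                best = (cost, cand)
        chosen.append(best[1])
    return chosen


low_lambda = rd_quantize([23, -7, 3], step=10, lam=0)      # distortion only
high_lambda = rd_quantize([23, -7, 3], step=10, lam=1000)  # rate dominates
```

With `lam=0` the loop reduces to nearest-level rounding, while a large `lam` drives every level to zero.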
Abstract:
A system and method for encoding multimedia video is described. As video is encoded a quantization parameter is selected for each macroblock. As described herein, the quantization parameter for each macroblock may be selected by limiting the universe of all possible quantization parameters to a particular range of possible quantization parameter values. This increases the speed of video encoding by reducing the number of quantization parameters that are tested for each video macroblock.
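A minimal sketch of the restricted search is shown below. It assumes the range is centered on a predicted quantization parameter and that the full range is 0 to 51, as in H.264; `cost_fn`, `predicted_qp`, and `search_radius` are hypothetical names, and the disclosure may choose the range differently.

```python
def select_qp(cost_fn, predicted_qp, search_radius=3, qp_min=0, qp_max=51):
    """Test only the QPs near `predicted_qp` and return the cheapest one."""
    lo = max(qp_min, predicted_qp - search_radius)
    hi = min(qp_max, predicted_qp + search_radius)
    return min(range(lo, hi + 1), key=cost_fn)


evaluations = []


def example_cost(qp):
    # Hypothetical per-macroblock cost with its minimum at QP 30.
    evaluations.append(qp)
    return (qp - 30) ** 2


best_qp = select_qp(example_cost, predicted_qp=28)
tested = len(evaluations)  # 7 candidates instead of all 52
```

Here the encoder evaluates 7 candidates per macroblock rather than 52, which is where the speedup comes from.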
Abstract:
The invention includes apparatus, systems and methods for processing multimedia data. A method of processing multimedia data may include encoding a frame of the multimedia data as an I frame, a channel switch frame, and a P frame and selecting the encoded I frame if a size of the encoded I frame and a size of the encoded channel switch frame and the encoded P frame meet a first condition. An apparatus for processing multimedia data may include an encoder for encoding a frame of the multimedia data as an I frame, a channel switch frame, and a P frame and selecting the encoded I frame if a size of the encoded I frame and a size of the encoded channel switch frame and the encoded P frame meet a first condition.
Abstract:
This disclosure describes frame interpolation techniques within a wavelet transform coding scheme. The frame interpolation may be used to generate one or more interpolated frames between two successive low frequency frames coded according to the wavelet transform coding scheme. Such interpolation may be useful to increase the frame rate of a multimedia sequence that is coded via wavelet transforms. Also, the techniques may be used to interpolate lost frames, e.g., which may be lost during wireless transmission.
Abstract:
The invention is directed to a method and apparatus for providing temporal scaling frames for use in digital multimedia. The method involves using a removable unidirectional predicted temporal scaling frame communication along with intra-coded frames and/or inter-coded frames. The method involves the ability to selectively remove the temporal scaling frame(s) from being transmitted or decoded in order to satisfy, for example, power limits, data rate limits, computational limits or channel conditions. Examples presented include encoders, transcoders and decoders where the decision to drop the removable temporal scaling frames could be made.
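One way the drop decision might look is sketched below, assuming a simple bit budget as the limiting condition. The frame records, the `removable` flag, and the function name are hypothetical; an actual encoder, transcoder, or decoder could use power, computation, or channel conditions instead.

```python
def frames_to_send(frames, bit_budget):
    """Drop removable temporal scaling frames until the total fits the budget.

    Intra-coded and inter-coded frames are never dropped here because
    other frames may reference them.
    """
    total = sum(f["bits"] for f in frames)
    kept = []
    for f in frames:
        if f["removable"] and total > bit_budget:
            total -= f["bits"]
            continue
        kept.append(f)
    return kept


frames = [
    {"id": 0, "type": "I", "bits": 100, "removable": False},
    {"id": 1, "type": "T", "bits": 30, "removable": True},   # temporal scaling frame
    {"id": 2, "type": "P", "bits": 50, "removable": False},
    {"id": 3, "type": "T", "bits": 30, "removable": True},
]
sent_ids = [f["id"] for f in frames_to_send(frames, bit_budget=190)]
```

Because no other frame is predicted from a removable frame in this sketch, dropping one lowers the frame rate without breaking the decoding of the remaining frames.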
Abstract:
Aspects of this disclosure relate to a method of coding video data. In an example, the method includes determining a first residual quadtree (RQT) depth at which to apply a first transform to luma information associated with a block of video data, wherein the RQT represents a manner in which transforms are applied to luma information and chroma information. The method also includes determining a second RQT depth at which to apply a second transform to the chroma information associated with the block of video data, wherein the second RQT depth is different from the first RQT depth. The method also includes coding the luma information at the first RQT depth and the chroma information at the second RQT depth.
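The effect of using different depths can be illustrated with a small sketch. It assumes a uniform quadtree split (a real RQT may split unevenly) and 4:2:0 sampling, so that a 32×32 luma block has 16×16 chroma blocks; the function name is hypothetical.

```python
def transform_blocks(size, depth):
    """Split a size x size block `depth` times and return (x, y, size) per leaf."""
    leaf = size >> depth
    return [(x, y, leaf) for y in range(0, size, leaf) for x in range(0, size, leaf)]


# Luma: 32x32 block, transforms applied at RQT depth 2 (8x8 transforms).
luma = transform_blocks(32, depth=2)

# Chroma: 16x16 block under the assumed 4:2:0 sampling, transforms applied
# at RQT depth 1, which also yields 8x8 transforms rather than 4x4.
chroma = transform_blocks(16, depth=1)
```

Applying the chroma transform at a shallower depth than luma keeps the chroma transforms from shrinking to the smallest size.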
Abstract:
In one example, an apparatus includes a processor configured to receive video data for two or more views of a scene, determine horizontal locations of camera perspectives for each of the two or more views, assign view identifiers to the two or more views such that the view identifiers correspond to the relative horizontal locations of the camera perspectives, form a representation comprising a subset of the two or more views, and, in response to a request from a client device, send information indicative of a maximum view identifier and a minimum view identifier for the representation to the client device.
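The view identifier assignment can be sketched as follows, assuming identifiers increase from the leftmost to the rightmost camera. The view names, the position values, and the function names are hypothetical.

```python
def assign_view_ids(camera_x):
    """Map each view name to an identifier that increases from left to right.

    `camera_x` maps a view name to the horizontal position of its camera.
    """
    ordered = sorted(camera_x, key=camera_x.get)
    return {name: view_id for view_id, name in enumerate(ordered)}


def view_id_range(view_ids, representation):
    """Return (minimum, maximum) view identifier for a subset of views."""
    ids = [view_ids[name] for name in representation]
    return min(ids), max(ids)


camera_x = {"a": 0.3, "b": -1.0, "c": 2.5, "d": 1.0}
view_ids = assign_view_ids(camera_x)            # b=0, a=1, d=2, c=3
signalled = view_id_range(view_ids, ["a", "c"])  # sent to the client on request
```

Because the identifiers follow the camera order, a client could infer from the minimum and maximum alone how wide a span of the scene the representation covers.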