Abstract:
In one embodiment, a method for encoding or decoding video content is provided. The method includes determining a set of interpolation filters for use in interpolating sub-pel pixel values and a mapping between interpolation filters in the set of interpolation filters and different sizes of prediction units (PUs) of video content. A PU of video content is received and a size of the received PU is determined. The method determines an interpolation filter in the set of interpolation filters based on a mapping between the interpolation filter and the size of the received PU to interpolate a sub-pel pixel value for use in a temporal prediction process for the PU.
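The size-to-filter lookup described above can be sketched as follows. This is a minimal illustration only: the filter names, the tap values, the size table, and the border clamping are all assumptions made for the example and are not taken from the abstract.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical filter set; tap values are illustrative, not from the source.
FILTERS: Dict[str, Tuple[int, ...]] = {
    "two_tap": (1, 1),
    "four_tap": (-1, 5, 5, -1),
}

# Hypothetical mapping from PU size (width, height) to a filter name.
SIZE_TO_FILTER: Dict[Tuple[int, int], str] = {
    (4, 4): "two_tap",
    (8, 8): "four_tap",
}
DEFAULT_FILTER = "four_tap"  # assumed fallback for sizes not in the table


def select_filter(pu_width: int, pu_height: int) -> Tuple[int, ...]:
    """Return the filter taps mapped to the given PU size."""
    name = SIZE_TO_FILTER.get((pu_width, pu_height), DEFAULT_FILTER)
    return FILTERS[name]


def interpolate_half_pel(row: Sequence[int], pos: int, taps: Sequence[int]) -> int:
    """Interpolate the half-pel sample between row[pos] and row[pos + 1]."""
    start = pos - (len(taps) // 2 - 1)
    acc = 0
    for k, tap in enumerate(taps):
        # Clamp indices at the row borders (an assumed padding policy).
        idx = min(max(start + k, 0), len(row) - 1)
        acc += tap * row[idx]
    norm = sum(taps)
    # Normalize by the tap sum with rounding to the nearest integer.
    return (acc + norm // 2) // norm
```

For example, `interpolate_half_pel([10, 20, 30, 40], 1, select_filter(4, 4))` interpolates between the samples 20 and 30 using the filter mapped to a 4x4 PU.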
Abstract:
In one embodiment, a method receives a unit of video content. The unit of video content is coded in a bi-prediction mode. A motion vector predictor candidate set is determined for a first motion vector for the unit. The method then determines a first motion vector predictor from the motion vector predictor candidate set for the first motion vector and calculates a second motion vector predictor for a second motion vector for the unit of video content. The second motion vector predictor is calculated based on the first motion vector or the first motion vector predictor.
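One possible way to derive the second predictor from the first motion vector is to scale it by the ratio of temporal distances. The abstract does not state the calculation, so the scaling rule, the function name `derive_second_mvp`, and the signed-distance convention below are assumptions made purely for illustration.

```python
from typing import Tuple

MotionVector = Tuple[int, int]


def derive_second_mvp(first_mv: MotionVector,
                      dist_first: int,
                      dist_second: int) -> MotionVector:
    """Scale the first motion vector to predict the second one.

    dist_first and dist_second are assumed to be signed temporal distances
    from the current picture to each reference picture (negative for a
    reference on the opposite side of the current picture).
    """
    if dist_first == 0:
        raise ValueError("dist_first must be non-zero")
    x, y = first_mv
    return (round(x * dist_second / dist_first),
            round(y * dist_second / dist_first))
```

With references at equal distance on opposite sides of the current picture, this reduces to mirroring the first motion vector.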
Abstract:
In one embodiment, a method includes receiving a prediction unit (PU) for a coding unit (CU) of video content. The PU is partitionable into a plurality of PU partition types. The method determines a PU partition type for the PU and a residual tree structure based on the PU partition type for partitioning of the CU into transform units (TUs). The residual tree includes a binary partition of a node into two. A TU partition for the PU partition type is determined based on the residual tree structure and a desired level of partitioning in the residual tree structure. The method then uses the TU partition in a transform operation.
Abstract:
A method for processing a block of transform coefficients during inter coding includes receiving, during inter coding, an N×M block of transform coefficients, wherein N is a row width of the block and M is a column height of the block. The method further includes partitioning the N×M block into a plurality of sub-blocks each comprising a plurality of the transform coefficients; and processing the plurality of sub-blocks, one at a time, in a coding order along a first diagonal scan coding pattern to generate a bit sequence corresponding to the N×M block. The processing comprises, for the sub-blocks containing at least one non-zero transform coefficient, coding at least the non-zero transform coefficients in a transform coefficient sequence along a second diagonal scan coding pattern.
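The two-level diagonal scan described above can be sketched as follows. The scan direction, the square sub-block size `sub`, and the choice to emit every coefficient of a non-zero sub-block are assumptions for this example; the abstract only requires that at least the non-zero coefficients be coded.

```python
from typing import List, Sequence, Tuple


def diagonal_scan(width: int, height: int) -> List[Tuple[int, int]]:
    """Return (x, y) positions visited anti-diagonal by anti-diagonal."""
    order = []
    for d in range(width + height - 1):
        # Walk each anti-diagonal from bottom-left to top-right.
        for y in range(min(d, height - 1), -1, -1):
            x = d - y
            if x < width:
                order.append((x, y))
    return order


def scan_block(block: Sequence[Sequence[int]], sub: int):
    """Scan sub-blocks diagonally, then coefficients diagonally inside each.

    Returns a list of ((sub_x, sub_y), coefficients) entries, one for each
    sub-block that contains at least one non-zero coefficient.
    """
    height, width = len(block), len(block[0])
    out = []
    for sx, sy in diagonal_scan(width // sub, height // sub):
        coeffs = [block[sy * sub + y][sx * sub + x]
                  for x, y in diagonal_scan(sub, sub)]
        if any(coeffs):  # all-zero sub-blocks are skipped
            out.append(((sx, sy), coeffs))
    return out
```

Both scan levels reuse the same `diagonal_scan` helper here for brevity; the first and second diagonal patterns need not be identical in practice.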
Abstract:
In one embodiment, a method receives a current picture of video content. The method then determines a set of reference pictures for the current picture and a temporal distance from the current picture for each of the set of reference pictures. A combined list of reference pictures in the set of reference pictures is determined where an order of pictures in the combined list is based on the temporal distance for each of the set of reference pictures to the current picture. The method then uses the combined list to perform temporal prediction for the current picture.
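Ordering a combined list by temporal distance can be sketched as follows. Identifying pictures by an integer display-order index and breaking distance ties in favor of the earlier picture are assumptions of this example, not details from the abstract.

```python
from typing import Iterable, List


def build_combined_list(current_index: int,
                        reference_indices: Iterable[int]) -> List[int]:
    """Order reference pictures by temporal distance to the current picture.

    Pictures are identified by a hypothetical display-order index.
    Duplicates are removed, and ties in distance are broken by the lower
    index (an assumed tie-break rule).
    """
    unique = set(reference_indices)
    return sorted(unique, key=lambda idx: (abs(idx - current_index), idx))
```

For a current picture at index 8 with references at 4, 6, 10, and 12, the nearest pictures (6 and 10) come first in the combined list.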
Abstract:
In various embodiments, a significance map of a matrix of video data coefficients is encoded or decoded using context-based adaptive binary arithmetic coding (CABAC). The significance map is scanned line-by-line along a scanning pattern. Each line may be a vertical, horizontal, or diagonal section of the scanning pattern. Context models for each element processed in a particular line are chosen based on values of neighboring elements that are not in the line. Avoiding reliance on neighbors that are in the same line facilitates parallel processing.
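To illustrate why excluding same-line neighbors helps parallelism, the sketch below picks a context index for each element of one diagonal line using only the right and lower neighbors, which both lie on the next diagonal. The particular neighbor set and the sum-based context index are assumptions for the example and are not specified in the abstract.

```python
from typing import Dict, Sequence, Tuple


def context_index(sig: Sequence[Sequence[int]], x: int, y: int) -> int:
    """Context index from the right and lower neighbors only.

    Both neighbors lie on diagonal x + y + 1, never on the element's own
    diagonal. Positions outside the map count as zero.
    """
    height, width = len(sig), len(sig[0])
    right = sig[y][x + 1] if x + 1 < width else 0
    below = sig[y + 1][x] if y + 1 < height else 0
    return right + below


def contexts_for_diagonal(sig: Sequence[Sequence[int]],
                          d: int) -> Dict[Tuple[int, int], int]:
    """Compute contexts for every element on diagonal x + y == d.

    No element depends on another element of the same diagonal, so these
    lookups could run in parallel.
    """
    height, width = len(sig), len(sig[0])
    return {(x, d - x): context_index(sig, x, d - x)
            for x in range(width) if 0 <= d - x < height}
```

This arrangement assumes the diagonals are processed from high frequency toward low frequency, so that diagonal `d + 1` is already available when diagonal `d` is handled.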
Abstract:
A coding method is provided. The coding may include preparing video compression data based on source pictures. The preparing may include partitioning the source pictures into coding units and/or generating a transform unit having a transform array. The preparing may also include processing the generated transform unit. The processing may include generating a significance map having a significance map array whose y-x locations correspond to the y-x locations of the transform array. The processing may also include determining, utilizing a scanning pattern, a context model for coding a significance map element based on a value associated with at least one coded neighbor significance map element in the significance map array. A decoding method is also provided, which includes processing the video compression data generated in the coding.
Abstract:
Embodiments of the invention generally provide a method and apparatus for complexity-scalable video coding. One embodiment of a method for video coding includes receiving a sequence of one or more video frames, obtaining a budget for the one or more video frames, the budget specifying a maximum number of computations that may be used in performing motion estimation for the one or more video frames, allocating the maximum number of computations among individual ones of the one or more video frames, performing motion estimation in accordance with the allocating, and outputting a motion estimate for the sequence.
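The allocation step can be sketched as a proportional split of the computation budget across frames. The per-frame weights (for instance, an estimate of motion complexity) and the rule for distributing the rounding remainder are assumptions of this example; the abstract does not say how the budget is divided.

```python
from typing import List, Sequence


def allocate_budget(total_budget: int, weights: Sequence[int]) -> List[int]:
    """Split a computation budget across frames in proportion to weights.

    weights holds one hypothetical positive integer per frame. The shares
    always sum to exactly total_budget, so the maximum is never exceeded.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    shares = [total_budget * w // total_weight for w in weights]
    # Hand out the rounding remainder one unit at a time, earliest frames first.
    leftover = total_budget - sum(shares)
    for i in range(leftover):
        shares[i % len(shares)] += 1
    return shares
```

Motion estimation for each frame would then stop once its share of computations is spent.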
Abstract:
A complete automatic sprite generation system uses first-order prediction for an initial estimation, delayed elimination for outlier rejection, and field-based sprite generation for an interlaced source. Optionally, higher-order prediction for the initial estimation may be used to handle more complicated motion. The invention is useful for generating sprites, e.g., for 3D sequences, stock tickers, interactive advertising and other uses. The invention addresses outlier and fast motion problems that are not handled by the existing MPEG-4 scheme. Automatic sprite generation is provided by performing shot detection (e.g., panning or zooming) on the input images to provide a group of successive images that share a common scene for use in forming a sprite. The initial estimation of motion parameter data for forming the sprite is improved by using the motion parameter data of at least two previous input images. Delayed outlier rejection is performed in two steps by eliminating pixels whose error increases in successive sprite iterations. For interlaced input images, a sprite and set of motion parameters are encoded and transmitted for each field separately, then decoded and combined at a presentation engine at a decoder.
Abstract:
A method and system for encoding and decoding digital video content are provided. The digital video content comprises a stream of pictures, each of which can be an intra, predicted, or bi-predicted picture. Each of the pictures comprises macroblocks that can be further divided into smaller blocks. The method entails encoding and decoding each picture in the stream of pictures in either frame mode or field mode.