Abstract:
Systems and methods for determining a quantization parameter (QP) for video coding. Embodiments may be particularly advantageous for strongly temporally correlated frames, such as those in video conferencing applications. An initial QP for a frame of a video sequence may be modified based on a spatial complexity or a temporal complexity associated with the video frame, and/or based on an inter-predicted frame bitrate target cycle, as a function of whether the frame is intra- or inter-predicted. The inter-predicted frame bitrate target cycle includes a sequence of two or more inter-predicted frame bitrate targets that are assigned to frames according to the cycle. A reference frame for an inter-predicted frame may be selected based on the bitrate targets associated with candidate reference frames. The initial QP of an inter-predicted frame with a scene change may be modified in a manner independent of the inter-predicted frame bitrate target cycle.
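The bitrate target cycle can be illustrated with a minimal sketch. It assumes frames are labeled "I" or "P", that targets are handed out to inter-predicted frames in round-robin order, and that a reference is chosen as the candidate with the largest target. The function names, the frame labels, and the "largest target wins" rule are assumptions made for illustration and are not taken from the abstract.

```python
from itertools import cycle


def assign_bitrate_targets(frame_types, target_cycle):
    """Give each inter-predicted ("P") frame the next target in the cycle.

    Intra ("I") frames receive None because the cycle only covers
    inter-predicted frames in this sketch.
    """
    targets = cycle(target_cycle)
    return [next(targets) if t == "P" else None for t in frame_types]


def select_reference(candidates):
    """Pick a reference frame from (frame_index, bitrate_target) pairs.

    Assumption: the candidate with the largest bitrate target is preferred.
    """
    return max(candidates, key=lambda c: c[1])[0]


if __name__ == "__main__":
    print(assign_bitrate_targets(["I", "P", "P", "P", "P"], [1000, 500]))
    print(select_reference([(1, 1000), (2, 500)]))
```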
Abstract:
A method and system for processing transform blocks according to quantization matrices in a video coding system are disclosed. Embodiments of the present invention derive one or more derived quantization matrices from one or more initial quantization matrices or from one previously derived quantization matrix. In one embodiment, the initial quantization matrices include 4×4 and 8×8 quantization matrices, which can be either default or user-defined. All quantization matrices larger than 8×8 can be derived from the 4×4 and 8×8 initial quantization matrices. Non-square quantization matrices can be derived from at least one initial square quantization matrix or at least one derived square quantization matrix. Individual initial quantization matrices may be used to derive respective larger quantization matrices. Furthermore, the individual initial quantization matrices may be derived from larger quantization matrices designed for corresponding transform sizes. A syntax design to enable the quantization matrix representation is also disclosed.
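One way such a derivation could work is sketched below. The abstract does not state the derivation rule, so entry replication (for larger matrices) and uniform subsampling (for non-square matrices) are assumptions here, and both function names are hypothetical.

```python
def upsample_matrix(base, factor):
    """Derive a larger matrix by repeating every entry factor x factor times."""
    rows, cols = len(base), len(base[0])
    return [
        [base[r // factor][c // factor] for c in range(cols * factor)]
        for r in range(rows * factor)
    ]


def derive_non_square(square, rows, cols):
    """Derive a rows x cols matrix by uniformly subsampling a square matrix.

    Assumes rows and cols both divide the side length of the square matrix.
    """
    n = len(square)
    row_step, col_step = n // rows, n // cols
    return [
        [square[r * row_step][c * col_step] for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    base = [[1, 2], [3, 4]]          # stand-in for an initial matrix
    larger = upsample_matrix(base, 2)
    print(larger)
    print(derive_non_square(larger, 2, 4))
```

The same two helpers would apply to an 8×8 initial matrix, for example to obtain a 16×16 matrix with a factor of 2 and then a 4×16 matrix by subsampling rows.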
Abstract:
A method and apparatus for Intra prediction of a block based on neighboring pixels around the block are disclosed. Embodiments according to the present invention use square blocks as well as non-square blocks for Intra prediction. For a 2N×2N Luma CU (coding unit), the CU can be partitioned into 2N×N, N×2N, 2N×2N or N×N PUs. The 2N×N and N×2N PUs can be further processed by either square transforms only or both non-square and square transforms. In one embodiment, the 2N×N PU or the N×2N PU is processed as two N×N TUs (transform units), and each of the N×N TUs is further split into smaller N×N TUs based on quad-tree split. In another embodiment, the 2N×N and N×2N PUs are processed as two 2N×0.5N and 0.5N×2N TUs, respectively.
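The two TU arrangements for non-square PUs can be written out as a small sketch. It assumes sizes are given as (width, height) pairs and covers only the first split level, not the further quad-tree split. The function name and the `non_square` flag are hypothetical.

```python
def split_pu_into_tus(width, height, non_square=False):
    """Return the first-level TU sizes for a PU of the given size.

    Square PUs are returned unchanged. A 2NxN or Nx2N PU becomes either
    two NxN TUs (square transforms only) or two 2Nx0.5N / 0.5Nx2N TUs
    (non-square transforms).
    """
    if width == height:
        return [(width, height)]
    if non_square:
        if width > height:                 # 2NxN -> two 2Nx0.5N
            return [(width, height // 2)] * 2
        return [(width // 2, height)] * 2  # Nx2N -> two 0.5Nx2N
    side = min(width, height)              # two NxN TUs
    return [(side, side)] * 2


if __name__ == "__main__":
    print(split_pu_into_tus(16, 8))                   # 2NxN with N = 8
    print(split_pu_into_tus(16, 8, non_square=True))
    print(split_pu_into_tus(8, 16, non_square=True))
```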
Abstract:
A method and apparatus for adaptive inter prediction mode coding are disclosed. In the current HEVC, a fixed set of variable length codes is used for the underlying video data, which may not optimally match the statistics of that data. Consequently, the compression efficiency associated with the fixed set of variable length codes will be compromised. Accordingly, an adaptive coding scheme for inter prediction modes is disclosed. The variable length codes used for each inter prediction mode in each coding unit depth are adaptively determined by their respective statistics. The statistics can be measured as the frequency of occurrence of each mode. In one embodiment according to the present invention, counters are used to collect the statistics. According to one embodiment of the present invention, the statistics of inter prediction modes are collected from the previous slice and the set of variable length codes is determined for the subsequent slice (immediately following the previous slice) accordingly. According to another embodiment of the present invention, the statistics of inter prediction modes are updated for each coding unit and the variable length code for each mode is adjusted according to the change in statistics during the coding process. According to another embodiment of the present invention, the variable length code for each mode is reset at the beginning of each slice. The reset code word table is either a predefined code word table for the whole sequence or a code word table determined by the previous slice.
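The counter-based adaptation can be sketched as follows, under stated assumptions: mode occurrences from a previous slice are counted, modes are ranked by frequency, and shorter truncated-unary code words go to more frequent modes. The mode names and the choice of truncated-unary codes are illustrative assumptions; the abstract does not specify the code word set.

```python
from collections import Counter

# Hypothetical set of inter prediction modes for one coding unit depth.
MODES = ["SKIP", "MERGE", "INTER_2Nx2N", "INTER_2NxN", "INTER_Nx2N"]


def build_code_table(counts, modes=MODES):
    """Map each mode to a prefix-free code word, shortest for the most frequent.

    Ties keep the original order of `modes` because sorted() is stable.
    """
    ranked = sorted(modes, key=lambda m: -counts[m])
    last = len(ranked) - 1
    table = {}
    for rank, mode in enumerate(ranked):
        # Truncated unary: "1", "01", "001", ..., and all zeros for the last.
        table[mode] = "0" * rank + ("1" if rank < last else "")
    return table


if __name__ == "__main__":
    previous_slice = ["MERGE", "MERGE", "SKIP", "MERGE", "INTER_2Nx2N", "SKIP"]
    counts = Counter(previous_slice)   # statistics collected with counters
    print(build_code_table(counts))    # table to use for the next slice
```

For per-coding-unit adaptation, the same table could be rebuilt after each counter update; resetting at a slice boundary would amount to replacing `counts` with a predefined Counter.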
Abstract:
A system and method for effectively performing an adaptive encoding procedure includes a texture analyzer that initially determines texture characteristics for blocks of input image data. An image transformer converts the blocks of image data into sets of coefficients that represent the various blocks. A block categorizer utilizes the texture characteristics to associate texture categories with the sets of coefficients from the various blocks. Deadzone tables are provided for storing deadzone values that define deadzone regions for performing appropriate quantization procedures. A quantizer may then access the deadzone values from the deadzone tables to adaptively convert the coefficients into quantized coefficients according to their corresponding texture characteristics.
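A minimal sketch of deadzone quantization driven by a per-category table follows. The category names, the deadzone values, and the exact quantizer formula are assumptions for illustration and are not taken from the abstract.

```python
# Hypothetical deadzone table: texture category -> deadzone threshold.
DEADZONE_TABLE = {"smooth": 4, "textured": 8}


def quantize(coeff, step, deadzone):
    """Quantize one coefficient; magnitudes inside the deadzone become 0."""
    magnitude = abs(coeff)
    if magnitude < deadzone:
        return 0
    level = (magnitude - deadzone) // step + 1
    return level if coeff >= 0 else -level


def quantize_block(coeffs, category, step):
    """Quantize a block's coefficients with the deadzone of its category."""
    deadzone = DEADZONE_TABLE[category]
    return [quantize(c, step, deadzone) for c in coeffs]


if __name__ == "__main__":
    coeffs = [3, 5, -9, 20]
    print(quantize_block(coeffs, "smooth", 10))
    print(quantize_block(coeffs, "textured", 10))
```

With the wider deadzone of the "textured" category, more small coefficients are zeroed, which reflects the idea of adapting quantization to texture characteristics.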
Abstract:
Content adaptive detection of images having stand-out objects involves block variance-based detection and determining whether an image includes a stand-out object. Images with a stand-out object are further processed to isolate an object of interest. Images without a detected stand-out object are further processed with a transition map-based detection method, which includes generating a transition map. If an object portrait is determined from the transition map, then the image is further processed to isolate the object of interest.
Abstract:
A method and apparatus for forming a demosaiced image from a color-filter-array (“CFA”) image is provided. The CFA image comprises a first set of pixels colored according to a first (e.g., a green) color channel, a second set of pixels colored according to a second (e.g., a red) color channel and a third set of pixels colored according to a third (e.g., blue) color channel. The method may include obtaining an orientation map, which includes, for each pixel of the color-filter-array image, an indicator of orientation of an edge bounding such pixel. The method may further include interpolating the first color channel at the second and third sets of pixels as a function of the orientation map so as to form a fourth set of pixels. The method may also include interpolating the second color channel at the first and third sets of pixels as a function of the orientation map and the fourth set of pixels; and interpolating the third color channel at the first and second sets of pixels as a function of the orientation map and the fourth set of pixels.
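The first interpolation step (filling the green channel at non-green pixels, guided by the orientation map) might look like the sketch below. It assumes the orientation map stores "H" or "V" per pixel and that interpolation is a simple two-neighbor average along the edge direction. Border pixels are left untouched for brevity. All names and the averaging rule are assumptions.

```python
def interpolate_green(cfa, is_green, orientation):
    """Estimate green at non-green interior pixels along the edge direction.

    cfa         -- 2D list of raw sensor samples
    is_green    -- 2D list of booleans marking green sample positions
    orientation -- 2D list with "H" (horizontal edge) or "V" (vertical edge)
    Border pixels keep their raw sample values in this sketch.
    """
    height, width = len(cfa), len(cfa[0])
    green = [row[:] for row in cfa]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            if is_green[r][c]:
                continue
            if orientation[r][c] == "H":
                green[r][c] = (cfa[r][c - 1] + cfa[r][c + 1]) / 2
            else:
                green[r][c] = (cfa[r - 1][c] + cfa[r + 1][c]) / 2
    return green


if __name__ == "__main__":
    cfa = [[0, 10, 0], [20, 99, 40], [0, 30, 0]]
    is_green = [[(r + c) % 2 == 1 for c in range(3)] for r in range(3)]
    horizontal = [["H"] * 3 for _ in range(3)]
    print(interpolate_green(cfa, is_green, horizontal)[1][1])
```

The resulting green plane plays the role of the "fourth set of pixels" that guides the later red and blue interpolation steps.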
Abstract:
A video system includes: analyzing video data having a block; performing a transition change detection for determining a spatial intensity transition within the block; performing a block-wise similarity measurement on the block in the video data for identifying a blocking artifact; and filtering every pixel in the block with a two-dimensional cross filter for removing the blocking artifact.
Abstract:
A low-complexity edge detection and DCT type selection method to improve the visual quality of an H.264/AVC encoded video sequence is described. Encoding-generated information is reused to detect an edge macroblock. The variance and Mean Absolute Difference (MAD) of a macroblock show a certain relationship that can be used to differentiate edge macroblocks from non-edge macroblocks. Also, the variance difference between neighboring macroblocks provides a hint of edge existence. A block-based edge detection method then uses this information. To determine the DCT type for each block, the detected edges are classified as visually obvious edges, texture-like edges, soft edges, and strong edges. The 8×8 DCT is used for texture-like edges and the 4×4 DCT is used for all other edges. The result is an efficient and accurate edge detection and transform selection method.
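The two statistics and the final transform choice can be sketched as follows. The abstract does not define MAD precisely, so it is assumed here to be the mean absolute deviation from the block mean. The edge classification itself is not sketched because the thresholds are not given, and the edge-type labels are hypothetical strings.

```python
def block_stats(pixels):
    """Return (variance, MAD) for a flat list of pixel values.

    Assumption: MAD is the mean absolute deviation from the block mean.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    mad = sum(abs(p - mean) for p in pixels) / n
    return variance, mad


def select_dct_size(edge_type):
    """Use the 8x8 DCT for texture-like edges and the 4x4 DCT otherwise."""
    return 8 if edge_type == "texture-like" else 4


if __name__ == "__main__":
    print(block_stats([0, 0, 10, 10]))
    print(select_dct_size("texture-like"), select_dct_size("strong"))
```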
Abstract:
In one embodiment, a coding mode selection method is provided to improve the visual quality of an encoded video sequence. The coding mode is selected based on a human visual tolerance level. Picture data may be received for a video coding process. The picture data is then analyzed to determine human visual tolerance adjustment information. For example, parameters of a cost equation may be adjusted based on the human visual tolerance level, which may be a tolerance that is based on a distortion bound that the human visual system can tolerate. The picture data may be analyzed in places that are considered visually sensitive areas, such as trailing suspicious areas, stripping suspicious areas, picture boundary areas, and/or blocking suspicious areas. Depending on what kind of visually sensitive area is found in the picture data, a parameter in a cost equation may be adjusted based on different visual tolerance thresholds. The coding mode is then determined based on the cost.
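As a rough sketch of adjusting a cost-equation parameter by visual sensitivity, the example below assumes the common rate-distortion form J = D + λ·R and scales λ by a per-area factor before choosing the cheapest mode. The cost form, the area labels, and the scale values are all assumptions; the abstract does not specify them.

```python
# Hypothetical scale factors for lambda per kind of visually sensitive area.
# A smaller factor makes distortion weigh more heavily in the cost.
SENSITIVITY_SCALE = {None: 1.0, "trailing": 0.5, "blocking": 0.7}


def select_mode(candidates, base_lambda, area_kind=None):
    """Return the name of the candidate mode with the lowest cost.

    Each candidate is a dict with "name", "distortion" and "rate" keys.
    Cost is distortion + lambda * rate (an assumed cost equation).
    """
    lam = base_lambda * SENSITIVITY_SCALE[area_kind]
    best = min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])
    return best["name"]


if __name__ == "__main__":
    modes = [
        {"name": "intra", "distortion": 100, "rate": 50},
        {"name": "skip", "distortion": 140, "rate": 5},
    ]
    print(select_mode(modes, 1.0))
    print(select_mode(modes, 1.0, "trailing"))
```

In this toy setup the lower-distortion mode wins once the block is flagged as a sensitive area, which is the qualitative behavior the abstract describes.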