Abstract:
An apparatus for coding video data using a single-loop decoding approach may include a memory unit and a processor in communication with the memory unit. In an embodiment, the memory unit stores the video data, the video data including a base layer and an enhancement layer. The base layer includes a base layer block, a non-constrained INTRA mode block, and an INTER mode block. The base layer block includes a sub-block located at least partially within one of the non-constrained INTRA mode block or the INTER mode block. The enhancement layer includes an enhancement layer block located at a position in the enhancement layer corresponding to a position of the base layer block in the base layer. The processor approximates pixel values of the sub-block and determines, based at least in part on the approximated pixel values, pixel values of the enhancement layer block.
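As an illustrative sketch only (the abstract does not specify how the approximation is performed), the following Python example approximates unavailable sub-block pixels by padding from neighboring available base-layer pixels and then uses the approximated block for inter-layer prediction of the enhancement layer block; the function names, the padding heuristic, and the omission of any upsampling step are assumptions.

```python
import numpy as np

def approximate_sub_block(base_block, unavailable_mask):
    """Approximate pixels flagged as unavailable (e.g., pixels of a sub-block that
    overlaps a non-constrained INTRA mode block or an INTER mode block, which is not
    fully reconstructed under single-loop decoding) by copying the nearest available
    pixel in the same row. A simple padding heuristic, used here only for illustration."""
    approx = base_block.astype(np.float32)
    height, width = base_block.shape
    for y in range(height):
        available = np.where(~unavailable_mask[y])[0]
        for x in range(width):
            if unavailable_mask[y, x]:
                if available.size:
                    nearest = available[np.argmin(np.abs(available - x))]
                    approx[y, x] = base_block[y, nearest]
                else:
                    approx[y, x] = 128.0  # fall back to mid-gray if the whole row is unavailable
    return approx

def predict_enhancement_block(base_block, unavailable_mask, enhancement_residual):
    """Inter-layer prediction sketch: enhancement layer block pixel values are determined
    from the approximated base layer block plus a coded residual (upsampling omitted)."""
    return approximate_sub_block(base_block, unavailable_mask) + enhancement_residual
```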
Abstract:
This disclosure proposes techniques for encoding and decoding transform coefficients in a video coding process. In particular, this disclosure proposes techniques for determining whether or not to apply a sign data hiding process for a group of transform coefficients, and techniques for applying the sign data hiding process. In one example, this disclosure describes a method for decoding video data comprising determining a block of transform coefficients, determining whether to perform a sign data hiding process for at least one transform coefficient in the block of transform coefficients based on a single variable compared to a threshold, and decoding sign information for the block based on the determination of whether to perform the sign data hiding process.
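For illustration, the sketch below follows an HEVC-style sign data hiding rule in Python: the single decision variable is the scan distance between the first and last nonzero coefficients of the group, compared to a threshold, and a hidden sign is inferred from the parity of the sum of absolute levels. The exact variable, threshold, and group size used by the disclosure may differ.

```python
def decode_signs_with_sdh(levels, sign_bits, sdh_threshold=4):
    """Decode signed coefficient levels for one coefficient group.
    levels: absolute levels in scan order; sign_bits: explicitly coded sign flags
    (1 = negative) for the coefficients whose signs are not hidden."""
    nonzero_positions = [i for i, v in enumerate(levels) if v != 0]
    if not nonzero_positions:
        return []
    # Single variable compared to a threshold: distance between first and last nonzero.
    use_sdh = (nonzero_positions[-1] - nonzero_positions[0]) >= sdh_threshold

    sign_iter = iter(sign_bits)
    signed_levels = []
    for idx, pos in enumerate(nonzero_positions):
        if use_sdh and idx == 0:
            # Hidden sign: inferred from the parity of the sum of absolute levels
            # (even sum -> positive, odd sum -> negative); no sign bit is read.
            negative = (sum(levels[p] for p in nonzero_positions) % 2) == 1
        else:
            negative = next(sign_iter) == 1
        signed_levels.append(-levels[pos] if negative else levels[pos])
    return signed_levels

# Example: the spread between the first and last nonzero positions triggers SDH,
# so only two sign bits are read for three nonzero coefficients.
print(decode_signs_with_sdh([3, 0, 1, 0, 0, 2], sign_bits=[0, 1]))  # -> [3, 1, -2]
```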
Abstract:
A video decoder determines, based at least in part on a size of a prediction unit (PU), whether to round either or both of a horizontal component and a vertical component of a motion vector of the PU from sub-pixel accuracy to integer-pixel accuracy. The video decoder generates, based at least in part on the motion vector, a predictive sample block for the PU and generates, based at least in part on the predictive sample block for the PU, a reconstructed sample block.
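A minimal Python sketch of the size-dependent rounding decision follows; the quarter-pel motion vector representation, the threshold value, and the rule that each dimension independently triggers rounding of the corresponding component are assumptions, since the abstract only states that the decision depends on the PU size.

```python
def conditionally_round_mv(mv_x, mv_y, pu_width, pu_height, size_threshold=8):
    """Round the horizontal and/or vertical motion vector component from sub-pixel
    (quarter-pel) accuracy to integer-pel accuracy when the corresponding PU
    dimension is small, so that small PUs avoid sub-pixel interpolation."""
    def to_integer_pel(component):
        # Round a quarter-pel value to the nearest integer-pel value.
        magnitude = ((abs(component) + 2) >> 2) << 2
        return magnitude if component >= 0 else -magnitude

    if pu_width <= size_threshold:
        mv_x = to_integer_pel(mv_x)
    if pu_height <= size_threshold:
        mv_y = to_integer_pel(mv_y)
    return mv_x, mv_y

# Example: a 4x16 PU with MV (5, 6) in quarter-pel units rounds only the
# horizontal component (width at or below the threshold), giving (4, 6).
print(conditionally_round_mv(5, 6, pu_width=4, pu_height=16))  # -> (4, 6)
```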
Abstract:
Aspects of this disclosure relate to, in an example, a method that includes identifying a first block of video data in a first temporal location from a first view, wherein the first block is associated with a first disparity motion vector. The method also includes determining a motion vector predictor for a second motion vector associated with a second block of video data, wherein the motion vector predictor is based on the first disparity motion vector. When the second motion vector comprises a disparity motion vector, determining the motion vector predictor comprises scaling the first disparity motion vector to generate a scaled motion vector predictor, wherein scaling the first disparity motion vector comprises applying, to the first disparity motion vector, a scaling factor comprising a view distance of the second disparity motion vector divided by a view distance of the first disparity motion vector.
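The view-distance-based scaling described above can be sketched in Python as follows; fixed-point precision, clipping, and how view distances are derived from view order indices are omitted, and the function and parameter names are assumptions.

```python
def scale_disparity_motion_vector(first_dmv, first_view_distance, second_view_distance):
    """Scale a first disparity motion vector to serve as the predictor for a second
    disparity motion vector, using the ratio of view distances described in the
    abstract: scale = distance(second) / distance(first)."""
    if first_view_distance == 0:
        return first_dmv  # avoid division by zero; a real codec handles this case explicitly
    scale = second_view_distance / first_view_distance
    return tuple(int(round(component * scale)) for component in first_dmv)

# Example: a disparity vector spanning one inter-view step predicts a vector
# spanning two inter-view steps, so it is scaled by a factor of 2.
print(scale_disparity_motion_vector((-24, 0), first_view_distance=1, second_view_distance=2))
# -> (-48, 0)
```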
Abstract:
Systems and techniques are described for processing video data. For instance, a process can include processing a frame of video data using a first layer of a neural network-based video encoder, the neural network-based video encoder performing at least one quantization step. The process can further include applying an exponential-family prior to an output of the first layer of the neural network-based video encoder to generate a first layer output evaluation, generating a total loss value for the neural network-based video encoder based on a sum of a loss value for the neural network-based video encoder and the first layer output evaluation, and training the neural network-based video encoder based on the total loss value.
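The following PyTorch sketch shows one way the described training step could look, with a unit Gaussian used as a concrete member of the exponential family and a toy convolutional encoder standing in for the neural network-based video encoder; the architecture, the straight-through quantization, and the mean-squared-error base loss are all assumptions.

```python
import torch
import torch.nn as nn

class ToyNeuralVideoEncoder(nn.Module):
    """Toy stand-in for a neural network-based video encoder with an identifiable
    'first layer' and a rounding-based quantization step."""
    def __init__(self):
        super().__init__()
        self.first_layer = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.rest = nn.Conv2d(8, 3, kernel_size=3, padding=1)

    def forward(self, frame):
        first_layer_output = self.first_layer(frame)
        # Straight-through quantization: round in the forward pass, identity gradient.
        quantized = first_layer_output + (torch.round(first_layer_output) - first_layer_output).detach()
        return first_layer_output, self.rest(quantized)

def exponential_family_prior_loss(first_layer_output):
    """Evaluate the first-layer output under an exponential-family prior; a unit
    Gaussian negative log-likelihood (up to a constant) is used here as one example."""
    return 0.5 * (first_layer_output ** 2).mean()

# One illustrative training step on random data.
encoder = ToyNeuralVideoEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
frame = torch.rand(1, 3, 64, 64)

optimizer.zero_grad()
first_layer_output, reconstruction = encoder(frame)
encoder_loss = torch.mean((reconstruction - frame) ** 2)              # loss value for the encoder
prior_evaluation = exponential_family_prior_loss(first_layer_output)  # first layer output evaluation
total_loss = encoder_loss + prior_evaluation                          # total loss as a sum of the two
total_loss.backward()
optimizer.step()
```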
Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for truncation error signaling and adaptive dither for lossy bandwidth compression. A processor may perform a truncation process for data, where the data is associated with display processing, image processing, or data processing, and where the truncation process for the data results in truncated data. The processor may compute a set of truncation error values associated with the truncation process for the truncated data. The processor may generate a set of residual samples for the truncated data. The processor may generate a bitstream based on the set of residual samples for the truncated data and the set of truncation error values associated with the truncation process.
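A simplified Python/NumPy sketch of this pipeline is given below; the bit-shift truncation, the left-neighbor residual predictor, and the way the signaled truncation error drives the dither at reconstruction are illustrative assumptions rather than the disclosure's actual design.

```python
import numpy as np

def truncate_and_build_bitstream(samples, drop_bits=2):
    """Perform a truncation process on the data, compute the truncation error values,
    generate residual samples for the truncated data, and package both into a
    dictionary used here as a stand-in for a bitstream."""
    samples = np.asarray(samples, dtype=np.int32)
    truncated = samples >> drop_bits                          # truncation process
    truncation_error = samples - (truncated << drop_bits)     # truncation error values
    prediction = np.concatenate(([0], truncated[:-1]))        # simple left-neighbor predictor
    residuals = truncated - prediction                        # residual samples for the truncated data
    return {"residuals": residuals, "truncation_error": truncation_error, "drop_bits": drop_bits}

def reconstruct_with_adaptive_dither(bitstream, seed=0):
    """Decoder-side sketch: rebuild the truncated samples from the residuals, then use
    the signaled truncation error level to drive a dither that fills the dropped bits."""
    rng = np.random.default_rng(seed)
    residuals = bitstream["residuals"]
    error = bitstream["truncation_error"]
    drop_bits = bitstream["drop_bits"]
    truncated = np.cumsum(residuals)                          # invert the left-neighbor predictor
    dither = rng.integers(0, int(error.mean()) + 1, size=truncated.shape)
    return (truncated << drop_bits) + dither
```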
Abstract:
Techniques are described for improving intra-subpartitioning (ISP) mode for splitting coding blocks into sub-blocks. In some cases, whether ISP mode is enabled for a coding block is based on size constraints pertaining to data units (e.g., VPDUs, transform blocks, among others). For instance, based on a size constraint related to a VPDU, the ISP mode can be disabled for coding blocks crossing VPDU boundaries. In some cases, whether to enable ISP mode may be based on a comparison of the width and/or height of the coding block to size thresholds corresponding to one or more maximum transform block sizes. In some cases, when the ISP mode is enabled for a coding block, a value of a flag used for defining a type of split (horizontal or vertical) for the coding block can be inferred based on the width and/or height of the coding block relative to one or more thresholds.
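A compact Python sketch of these decision rules follows; the 64-sample VPDU size, the transform size thresholds, and the approximation of "crossing a VPDU boundary" by a simple size check are assumptions, since the abstract leaves the exact constraints open.

```python
def isp_enabled(block_width, block_height, vpdu_size=64, max_coding_size=64):
    """Decide whether ISP mode is enabled for a coding block: disable it when the block
    would cross a VPDU boundary (approximated here as exceeding the VPDU size in either
    dimension) or when it exceeds a maximum size threshold in either dimension."""
    crosses_vpdu = block_width > vpdu_size or block_height > vpdu_size
    exceeds_max_size = block_width > max_coding_size or block_height > max_coding_size
    return not (crosses_vpdu or exceeds_max_size)

def infer_isp_split_type(block_width, block_height, max_tb_size=32):
    """When ISP is enabled but only one split direction keeps each sub-partition within
    the maximum transform block size (assumed smaller than the block, e.g. 32), the
    horizontal/vertical split flag can be inferred rather than signaled; otherwise it
    must be signaled (None is returned here)."""
    if block_width > max_tb_size:
        return "vertical"    # only a vertical split reduces the sub-partition width
    if block_height > max_tb_size:
        return "horizontal"  # only a horizontal split reduces the sub-partition height
    return None
```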
Abstract:
Systems and techniques are described herein for processing video data. In some aspects, a method can include obtaining, by a machine learning system, input video data. The input video data includes one or more luminance components for a current frame. The method can include determining, by the machine learning system, motion information for the luminance component(s) of the current frame and motion information for one or more chrominance components of the current frame using the luminance component(s) for the current frame. In some cases, the method can include determining the motion information for the luminance component(s) based on the luminance component(s) of the current frame and at least one reconstructed luminance component of a previous frame. In some cases, the method can further include determining the motion information for the chrominance component(s) of the current frame using the motion information determined for the luminance component(s) of the current frame.
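As a rough PyTorch sketch (the network architecture, the 4:2:0 chroma format, and the way luma motion is rescaled for chroma are all assumptions), the machine learning system below estimates a motion field from the current luma plane and a reconstructed previous luma plane and then derives chroma motion from it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LumaDrivenMotionEstimator(nn.Module):
    """Estimate motion for the luminance component from the current luma plane and a
    reconstructed previous luma plane, then reuse that motion (downsampled and halved
    in magnitude) for the chrominance components of a 4:2:0 frame."""
    def __init__(self):
        super().__init__()
        self.flow_net = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, kernel_size=3, padding=1),  # 2-channel motion field (dx, dy)
        )

    def forward(self, luma_current, luma_previous_reconstructed):
        luma_motion = self.flow_net(torch.cat([luma_current, luma_previous_reconstructed], dim=1))
        # Chroma motion derived from luma motion: half resolution and half magnitude for 4:2:0.
        chroma_motion = 0.5 * F.avg_pool2d(luma_motion, kernel_size=2)
        return luma_motion, chroma_motion

# Example with random planes, just to show the tensor shapes involved.
estimator = LumaDrivenMotionEstimator()
luma_motion, chroma_motion = estimator(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
print(luma_motion.shape, chroma_motion.shape)  # torch.Size([1, 2, 64, 64]) torch.Size([1, 2, 32, 32])
```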
Abstract:
Techniques are described herein for processing video data using a history-based Rice parameter derivation. For instance, a process can include obtaining a transform block including a plurality of samples. One or more parameters (e.g., Rice parameters) can be determined for the plurality of samples by analyzing a local neighborhood of a current sample of the plurality of samples and determining that a number of neighboring transform coefficients of the current sample is less than a threshold amount. A historic parameter value (e.g., a historic Rice parameter value) determined from one or more previously decoded transform blocks can be obtained and, based at least in part on the historic parameter value, a parameter (e.g., a Rice parameter) can be determined for the current sample. The current sample can be decoded based on the determined parameter for the current sample.
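The fallback logic can be sketched in Python as below; the template size, the threshold of five neighbors, the magnitude-to-parameter mapping, and the history update rule are placeholders for whatever the disclosure actually specifies.

```python
def derive_rice_parameter(neighbor_coefficients, historic_rice_parameter, min_neighbors=5):
    """Derive a Rice parameter for the current sample. When enough neighboring transform
    coefficients are available in the local template, derive the parameter from their
    magnitudes; otherwise fall back to the historic value accumulated from previously
    decoded transform blocks."""
    if len(neighbor_coefficients) >= min_neighbors:
        local_sum = sum(abs(c) for c in neighbor_coefficients)
        return min(local_sum.bit_length() // 2, 7) if local_sum else 0
    # Fewer neighbors than the threshold amount: use the history-based value instead.
    return historic_rice_parameter

def update_rice_history(historic_rice_parameter, decoded_block):
    """Update the historic Rice parameter from a decoded transform block, e.g. from the
    magnitude of its first nonzero coefficient (a simple stand-in update rule)."""
    first_nonzero = next((abs(c) for c in decoded_block if c != 0), 0)
    return max(historic_rice_parameter, first_nonzero.bit_length() // 2)
```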
Abstract:
Systems and techniques are described herein for processing video data. In some examples, a process is described that can include obtaining at least one block of video data and predicting one or more video samples for the at least one block. The process can include obtaining a dynamic range adjustment (DRA) syntax element from the video data. In some cases, the DRA syntax element includes an indication associated with a plurality of DRA modes for the video data. The process can include processing the one or more video samples for the at least one block using a DRA mode based on the indication of the DRA syntax element.
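The sketch below illustrates one way a decoded DRA syntax element could select among multiple DRA modes and be applied to predicted samples; the specific modes, scale values, and ranges are invented for illustration, since the abstract only states that the syntax element indicates one of a plurality of DRA modes.

```python
import numpy as np

# Hypothetical mapping from the value carried by a DRA syntax element to a DRA mode.
DRA_MODES = {0: "disabled", 1: "global_scale", 2: "per_range_scale"}

def apply_dra(predicted_samples, dra_syntax_value, global_scale=1.5,
              range_scales=((0, 128, 1.2), (128, 256, 0.9))):
    """Process predicted video samples for a block using the DRA mode indicated by a
    decoded syntax-element value (an 8-bit sample range is assumed)."""
    samples = np.asarray(predicted_samples, dtype=np.float32)
    mode = DRA_MODES.get(dra_syntax_value, "disabled")
    if mode == "global_scale":
        adjusted = samples * global_scale
    elif mode == "per_range_scale":
        adjusted = samples.copy()
        for low, high, scale in range_scales:
            in_range = (samples >= low) & (samples < high)
            adjusted[in_range] = low + (samples[in_range] - low) * scale
    else:
        adjusted = samples
    return np.clip(adjusted, 0, 255)

# Example: syntax value 2 selects the per-range mode for a few sample values.
print(apply_dra([10, 100, 200], dra_syntax_value=2))  # -> approximately [12., 120., 192.8]
```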