Abstract:
Techniques are presented for modifying images of an object in video, for example to correct for lens distortion, or to beautify a face. These techniques include extracting and validating features of an object from a source video frame, tracking those features over time, estimating a pose of the object, modifying a 3D model of the object based on the features, and rendering a modified video frame based on the modified 3D model and modified intrinsic and extrinsic matrices. These techniques may be applied in real-time to an object in a sequence of video frames.
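As a rough illustration of one step in this pipeline, the sketch below (Python with NumPy) re-projects a few 3D model points using a source intrinsic matrix and a modified one. The camera parameters, 3D points, and the particular modification are assumptions for illustration only; feature extraction, tracking, and model fitting are not shown.

```python
import numpy as np

# Illustrative sketch: re-projecting 3D model points with modified intrinsic/extrinsic
# matrices. All numeric values are assumed, not taken from the disclosure.

def project(points_3d, K, R, t):
    """Project Nx3 object points into pixel coordinates with intrinsics K and pose (R, t)."""
    cam = points_3d @ R.T + t          # object -> camera coordinates
    uv = cam @ K.T                     # camera -> image plane
    return uv[:, :2] / uv[:, 2:3]      # perspective divide

# A few 3D feature points of the fitted object model, in object coordinates (assumed).
model_pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.02], [0.0, 0.1, 0.02]])

# Estimated pose: identity rotation, object 0.5 m in front of the camera (assumed).
R, t = np.eye(3), np.array([0.0, 0.0, 0.5])

K_src = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_mod = K_src.copy()
K_mod[0, 0] = K_mod[1, 1] = 700.0      # modified focal length, e.g. to reduce apparent distortion

print("source projection:\n", project(model_pts, K_src, R, t))
print("modified projection:\n", project(model_pts, K_mod, R, t))
```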
Abstract:
Techniques are disclosed for coding video data predictively based on predictions made from spherical-domain projections of input pictures to be coded and reference pictures that are prediction candidates. Spherical projections of an input picture and the candidate reference pictures may be generated. Thereafter, a search may be conducted for a match between the spherical-domain representation of a pixel block to be coded and a spherical-domain representation of the reference picture. On a match, an offset may be determined between the spherical-domain representation of the pixel block and a matching portion of the reference picture in the spherical-domain representation. The spherical-domain offset may be transformed to a motion vector in a source-domain representation of the input picture, and the pixel block may be coded predictively with reference to a source-domain representation of the matching portion of the reference picture.
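A minimal sketch of the coordinate bookkeeping implied here, assuming an equirectangular source-domain layout; the frame size, block position, and spherical-domain offset are illustrative values, and the block-matching search itself is not shown.

```python
import numpy as np

# Mapping a pixel-block position between an equirectangular ("source-domain")
# picture and its spherical-domain representation, then converting a
# spherical-domain offset back into a source-domain motion vector.

W, H = 3840, 1920  # equirectangular frame size (assumed)

def to_sphere(x, y):
    """Equirectangular pixel -> (longitude, latitude) in radians."""
    lon = (x / W) * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (y / H) * np.pi
    return lon, lat

def to_pixel(lon, lat):
    """(longitude, latitude) -> equirectangular pixel coordinates."""
    x = (lon + np.pi) / (2.0 * np.pi) * W
    y = (np.pi / 2.0 - lat) / np.pi * H
    return x, y

# Spherical-domain offset found by the matching search (assumed values).
d_lon, d_lat = np.deg2rad(1.5), np.deg2rad(-0.75)

# Pixel block to be coded, identified by its center in the input picture.
bx, by = 1200.0, 500.0
lon, lat = to_sphere(bx, by)

# Location of the matching reference-picture content, mapped back to the source domain.
ref_x, ref_y = to_pixel(lon + d_lon, lat + d_lat)

# Source-domain motion vector used for predictive coding of the block.
mv = (ref_x - bx, ref_y - by)
print("motion vector (dx, dy):", mv)
```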
Abstract:
Image processing techniques may accelerate coding of viewport data contained within multi-view image data. According to such techniques, an encoder may shift content of multi-directional image data according to viewport location data provided by a decoder. The encoder may code the shifted multi-directional image data by predictive coding and transmit, to the decoder, the coded multi-directional image data and data identifying an amount of the shift. Doing so may move the viewport location to positions in the image data that are coded earlier than the positions that the viewport location naturally occupies and, thereby, may accelerate coding. On decode, a decoder may compare its present viewport location with viewport location data provided by the encoder with the coded video data. The decoder may decode the coded video data and extract a portion of the decoded video data corresponding to a present viewport location for display.
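The sketch below illustrates the shift-and-signal idea, assuming an equirectangular layout in which a horizontal shift wraps around the 360° image; the frame contents, viewport position, and use of NumPy's roll are illustrative, and the predictive coding and signalling syntax are not modelled.

```python
import numpy as np

W, H = 256, 128
frame = np.random.randint(0, 256, (H, W), dtype=np.uint8)  # stand-in multi-directional image

viewport_x, viewport_w = 180, 64   # viewport location reported by the decoder (assumed)

# Encoder: shift content so the viewport starts at column 0, i.e. in the region
# that is coded earliest, and remember the shift amount for signalling.
shift = viewport_x
shifted = np.roll(frame, -shift, axis=1)

# Decoder: after decoding, undo the signalled shift to recover the original layout,
# then extract the pixels covering its present viewport location for display.
restored = np.roll(shifted, shift, axis=1)
viewport = restored[:, viewport_x:viewport_x + viewport_w]

assert np.array_equal(viewport, frame[:, viewport_x:viewport_x + viewport_w])
```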
Abstract:
Techniques are disclosed for estimating quality of images in an automated fashion. According to these techniques, a source image may be downsampled to generate at least two downsampled images at different levels of downsampling. Blurriness of the images may be estimated starting with a most-heavily downsampled image. Blocks of a given image may be evaluated for blurriness and, when a block of a given image is estimated to be blurry, the block of the image and co-located blocks of higher resolution image(s) may be designated as blurry. Thereafter, a blurriness score may be calculated for the source image from the number of blocks of the source image designated as blurry.
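A minimal sketch of this coarse-to-fine blur scan, assuming a single-channel image, 2x box downsampling, and a simple gradient-energy measure as the per-block blurriness proxy; the block size and threshold are illustrative, not values from the disclosure.

```python
import numpy as np

BLOCK, THRESH = 16, 25.0   # assumed block size and blurriness threshold

def downsample(img):
    """2x box downsampling."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def block_is_blurry(block):
    # Low horizontal/vertical gradient energy suggests a blurry block.
    gx = np.diff(block, axis=1)
    gy = np.diff(block, axis=0)
    return (gx ** 2).mean() + (gy ** 2).mean() < THRESH

def blur_score(src):
    levels = [src.astype(np.float64)]
    for _ in range(2):                              # at least two downsampled images
        levels.append(downsample(levels[-1]))

    # Blurry-block map at source resolution, filled in coarsest-first.
    bh, bw = src.shape[0] // BLOCK, src.shape[1] // BLOCK
    blurry = np.zeros((bh, bw), dtype=bool)

    for lvl in reversed(range(len(levels))):        # most-heavily downsampled image first
        scale = 2 ** lvl                            # factor back to source resolution
        img = levels[lvl]
        for by in range(img.shape[0] // BLOCK):
            for bx in range(img.shape[1] // BLOCK):
                blk = img[by * BLOCK:(by + 1) * BLOCK, bx * BLOCK:(bx + 1) * BLOCK]
                if block_is_blurry(blk):
                    # Designate the co-located source-resolution blocks as blurry.
                    blurry[by * scale:(by + 1) * scale, bx * scale:(bx + 1) * scale] = True

    return blurry.mean()        # fraction of source blocks designated blurry

print("blurriness score:", blur_score(np.random.rand(256, 256) * 255))
```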
Abstract:
A system for processing media on a resource-restricted device, the system including a memory to store data representing media assets and associated descriptors, and program instructions representing an application and a media processing system, and a processor to execute the program instructions, wherein the program instructions cause the media processing system, in response to a call from an application defining a plurality of services to be performed on an asset, to determine a tiered schedule of processing operations to be performed upon the asset based on a processing budget associated therewith, and to iteratively execute the processing operations on a tier-by-tier basis, unless interrupted.
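The sketch below illustrates tier-by-tier execution; how tiers are formed from the processing budget, the per-operation cost estimates, and the interruption mechanism are assumptions made for illustration.

```python
import threading

def build_schedule(services, budget):
    """Group requested operations into tiers that fit the processing budget (assumed policy)."""
    tiers, current, cost = [], [], 0
    for op, op_cost in services:          # services: [(callable, estimated cost), ...]
        if current and cost + op_cost > budget:
            tiers.append(current)
            current, cost = [], 0
        current.append(op)
        cost += op_cost
    if current:
        tiers.append(current)
    return tiers

def run_tiered(asset, services, budget, interrupted: threading.Event):
    for tier in build_schedule(services, budget):
        if interrupted.is_set():          # stop between tiers if interrupted
            break
        for op in tier:
            asset = op(asset)             # each operation refines the asset
    return asset

# Usage with toy operations on a string "asset".
stop = threading.Event()
result = run_tiered("asset", [(str.upper, 2), (str.strip, 1), (lambda a: a + "!", 3)],
                    budget=3, interrupted=stop)
print(result)
```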
Abstract:
Judder artifacts are remedied in a video coding system by employing frame rate conversion at an encoder. According to the disclosure, a source video sequence may be coded as base layer coded video at a first frame rate. An encoder may identify a portion of the coded video sequence that likely will exhibit judder effects when decoded. For those portions that likely will exhibit judder effects, video data representing the portion of the source video may be coded as enhancement layer data at a higher frame rate than the frame rate of the coded base layer data. Moreover, an encoder may generate metadata representing “FRC hints,” which are techniques that a decoder should employ when performing decoder-side frame rate conversion. An encoding terminal may transmit the base layer coded video and either the enhancement layer coded video or the FRC hints to a decoder. Thus, encoder infrastructure may mitigate judder artifacts that may arise during decoding.
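A minimal sketch of the encoder-side decision, approximating judder likelihood by frame-to-frame motion energy; the threshold, the toy frames, and the form of the "hint" payload are illustrative assumptions.

```python
import numpy as np

JUDDER_THRESH = 12.0   # assumed motion-energy threshold

def plan_segments(frames):
    """For each base-layer frame pair, decide whether to send higher-frame-rate
    enhancement-layer data (or FRC hints) in addition to the base layer."""
    plan = []
    for prev, cur in zip(frames, frames[1:]):
        motion = np.abs(cur.astype(np.int16) - prev.astype(np.int16)).mean()
        if motion > JUDDER_THRESH:
            # Portion likely to exhibit judder: code intermediate frames in an
            # enhancement layer, or signal hints for decoder-side interpolation.
            plan.append({"motion": float(motion), "send": "enhancement_or_frc_hints"})
        else:
            plan.append({"motion": float(motion), "send": "base_layer_only"})
    return plan

frames = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(4)]
for entry in plan_segments(frames):
    print(entry)
```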
Abstract:
A system may include a receiver, a decoder, a post-processor, and a controller. The receiver may receive encoded video data. The decoder may decode the encoded video data. The post-processor may perform post-processing on frames of a decoded video sequence from the decoder. The controller may adjust post-processing of a current frame based upon at least one condition parameter detected at the system.
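A small structural sketch of these components; the condition parameter (here, a CPU-load reading) and the specific post-processing adjustment are illustrative assumptions, and decoding is only stubbed out.

```python
class Controller:
    def post_processing_level(self, cpu_load):
        # Reduce post-processing effort when the detected condition indicates load.
        return "light" if cpu_load > 0.8 else "full"

class PostProcessor:
    def process(self, frame, level):
        return f"{frame}+deblock" if level == "full" else frame

def receive_decode_present(encoded_frames, cpu_load):
    controller, post = Controller(), PostProcessor()
    for coded in encoded_frames:                    # receiver hands coded data to the decoder
        decoded = coded.replace("coded_", "")       # stand-in for actual decoding
        level = controller.post_processing_level(cpu_load)
        yield post.process(decoded, level)

print(list(receive_decode_present(["coded_f0", "coded_f1"], cpu_load=0.9)))
```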
Abstract:
A scalable coding system codes video as a base layer representation and an enhancement layer representation. A base layer coder may code an LDR representation of a source video. A predictor may predict an HDR representation of the source video from the coded base layer data. A comparator may generate prediction residuals which represent a difference between an HDR representation of the source video and the predicted HDR representation of the source video. A quantizer may quantize the residuals down to an LDR representation. An enhancement layer coder may code the LDR residuals. In other embodiments, the enhancement layer coder may code LDR-converted HDR video directly.
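A minimal numeric sketch of the residual path, assuming 8-bit LDR, 10-bit HDR, a simple gain-based inverse tone map as the predictor, and uniform quantization of residuals into an 8-bit range; the entropy and transform coding stages are omitted, and all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
hdr = rng.integers(0, 1024, (64, 64)).astype(np.float64)   # 10-bit HDR source (toy data)

# Base layer: an LDR representation of the source (simple tone map by scaling).
ldr_base = np.clip(hdr / 4.0, 0, 255).round()

# Predictor: reconstruct an HDR estimate from the coded base layer.
hdr_pred = ldr_base * 4.0

# Comparator: prediction residuals between the HDR source and the prediction.
residual = hdr - hdr_pred

# Quantizer: map residuals into an LDR-range signal for the enhancement layer coder.
q_step = max((residual.max() - residual.min()) / 255.0, 1e-12)
residual_ldr = np.round((residual - residual.min()) / q_step)

# Decoder side: dequantize and add back to the prediction to recover HDR.
hdr_rec = hdr_pred + residual_ldr * q_step + residual.min()
print("max reconstruction error:", np.abs(hdr_rec - hdr).max())
```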
Abstract:
Systems and methods are provided for processing high quality video data, such as data having a higher than standard bit depth, a high dynamic range, or a wide or custom color gamut, to be compatible with conventional encoders and decoders without significant loss of quality. High quality data is encoded into a plurality of layers, with a base layer carrying the standard quality data and one or more higher quality layers carrying the rest. Decoding systems and methods may map the base layer to the dynamic range or color gamut of the enhancement layer, combine the layers, and map the combined layers to a dynamic range or color gamut appropriate for the target display. Each of the standard quality and the high quality data may be encoded as a plurality of tiers of increasing quality, and may reference lower-level tiers as sources of prediction during predictive coding.
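A minimal numeric sketch of the decode-side combination, assuming an 8-bit base layer, an additive enhancement residual in a 10-bit range, and a simple linear mapping to a 12-bit target display; the values and mappings are illustrative assumptions, not the disclosed transforms.

```python
import numpy as np

rng = np.random.default_rng(1)
base_8bit = rng.integers(0, 256, (32, 32)).astype(np.float64)      # decoded base layer (toy data)
enh_residual = rng.integers(-8, 9, (32, 32)).astype(np.float64)    # decoded enhancement layer (toy data)

# 1) Map the base layer up to the enhancement layer's dynamic range (8 -> 10 bit).
base_10bit = base_8bit * (1023.0 / 255.0)

# 2) Combine the layers to recover the high-quality picture.
combined = np.clip(base_10bit + enh_residual, 0, 1023)

# 3) Map the combined picture to a range appropriate for the target display,
#    e.g. a display that accepts 12-bit code values.
display = combined * (4095.0 / 1023.0)
print(display.min(), display.max())
```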