Abstract:
Techniques are described for implementing format configurations for multi-directional video and for switching between them. Source images may be assigned to formats that may change during a coding session. When a change occurs between formats, video coders and decoders may transform decoded reference frames from the first format to the second format. Thereafter, new frames in the second format may be coded or decoded predictively using the transformed reference frame(s) as source(s) of prediction. In this manner, video coders and decoders may avoid intra-coding techniques and achieve high efficiency in coding.
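A minimal sketch of the reference-frame transform described above, assuming purely for illustration that the two formats are equirectangular configurations differing only by a yaw rotation (which reduces to a horizontal circular shift); the function names and the residual-based "coding" step are hypothetical, not taken from the disclosure.

```python
import numpy as np

def yaw_rotate_equirect(ref_frame: np.ndarray, yaw_degrees: float) -> np.ndarray:
    """Transform an equirectangular reference frame to a configuration that differs
    only by a yaw rotation (a horizontal circular shift of the panorama)."""
    h, w = ref_frame.shape[:2]
    shift = int(round(w * yaw_degrees / 360.0))
    return np.roll(ref_frame, shift, axis=1)

def predict_and_code(new_frame: np.ndarray, transformed_ref: np.ndarray) -> np.ndarray:
    """Toy predictive step: code the new frame as a residual against the transformed reference."""
    return new_frame.astype(np.int16) - transformed_ref.astype(np.int16)

if __name__ == "__main__":
    ref = np.random.randint(0, 256, (90, 180), dtype=np.uint8)  # decoded reference, format A
    new = yaw_rotate_equirect(ref, 20.0)                        # new source frame arrives in format B
    transformed_ref = yaw_rotate_equirect(ref, 20.0)            # reference remapped A -> B
    residual = predict_and_code(new, transformed_ref)
    print("max |residual| after transform:", int(np.abs(residual).max()))  # 0 in this toy case
```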
Abstract:
Embodiments of the present disclosure provide systems and methods for perspective shifting in a video conferencing session. In one exemplary method, a video stream may be generated. A foreground element may be identified in a frame of the video stream and distinguished from a background element of the frame. Data may be received representing a viewing condition at a terminal that will display the generated video stream. The frame of the video stream may be modified based on the received data to shift the foreground element relative to the background element. The modified video stream may be displayed at the displaying terminal.
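One way the described modification could be sketched, assuming the foreground has already been segmented into a mask and that the "viewing condition" arrives as a pixel offset derived from the viewer's position at the displaying terminal; the function name and the roll-based shift are illustrative only.

```python
import numpy as np

def apply_parallax(frame, fg_mask, background, viewer_offset_px):
    """Shift the foreground element relative to the background by viewer_offset_px
    (dy, dx) pixels and composite the result. fg_mask is a boolean foreground mask."""
    dy, dx = viewer_offset_px
    shifted_fg = np.roll(frame, (dy, dx), axis=(0, 1))
    shifted_mask = np.roll(fg_mask, (dy, dx), axis=(0, 1))
    out = background.copy()
    out[shifted_mask] = shifted_fg[shifted_mask]
    return out

if __name__ == "__main__":
    h, w = 120, 160
    frame = np.full((h, w, 3), 40, dtype=np.uint8)
    fg_mask = np.zeros((h, w), dtype=bool)
    fg_mask[40:80, 60:100] = True                     # pretend segmentation output
    frame[fg_mask] = 200
    background = np.full((h, w, 3), 40, dtype=np.uint8)  # background plate with foreground filled
    # "viewing condition" received from the displaying terminal, e.g. viewer moved to the right
    modified = apply_parallax(frame, fg_mask, background, viewer_offset_px=(0, 8))
```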
Abstract:
Multi-directional image data often contains distortions of image content that cause problems when processed by video coders that are designed to process traditional, “flat” image content. Embodiments of the present disclosure provide techniques for coding multi-directional image data using such coders. For each pixel block in a frame to be coded, an encoder may transform reference picture data within a search window about a location of the input pixel block based on displacement respectively between the location of the input pixel block and portions of the reference picture within the search window. The encoder may perform a prediction search among the transformed reference picture data to identify a match between the input pixel block and a portion of the transformed reference picture and, when a match is identified, the encoder may code the input pixel block differentially with respect to the matching portion of the transformed reference picture. The transform may counteract distortions imposed on image content of the reference picture data by the multi-directional format, thereby aligning that content with the image content of the input picture. The techniques apply to both intra-coding and inter-coding.
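A rough sketch of the per-block reference warp and prediction search, under the assumption that the multi-directional format is equirectangular and that its dominant distortion is a latitude-dependent horizontal stretch; the warp model, window parameters, and SAD search below are illustrative simplifications, not the disclosed algorithm.

```python
import numpy as np

def warp_search_window(ref, block_row, win_top, win_h, win_left, win_w, frame_h):
    """Resample each reference row in the search window so its horizontal scale
    matches the scale at the input block's latitude (rows nearer the poles are
    compressed relative to the block's row)."""
    out = np.empty((win_h, win_w), dtype=ref.dtype)
    lat_block = (block_row / frame_h - 0.5) * np.pi
    xs = np.arange(win_w)
    center = win_left + win_w / 2.0
    for r in range(win_h):
        lat_row = ((win_top + r) / frame_h - 0.5) * np.pi
        scale = np.cos(lat_block) / max(np.cos(lat_row), 1e-6)
        src_x = np.clip(center + (xs - win_w / 2.0) * scale, 0, ref.shape[1] - 1)
        row = ref[win_top + r].astype(float)
        out[r] = np.interp(src_x, np.arange(ref.shape[1]), row).astype(ref.dtype)
    return out

def best_match(block, warped_window):
    """Exhaustive SAD search of the warped window for the best predictor."""
    bh, bw = block.shape
    best_pos, best_sad = None, np.inf
    for y in range(warped_window.shape[0] - bh + 1):
        for x in range(warped_window.shape[1] - bw + 1):
            cand = warped_window[y:y + bh, x:x + bw]
            sad = np.abs(cand.astype(int) - block.astype(int)).sum()
            if sad < best_sad:
                best_pos, best_sad = (y, x), sad
    return best_pos, best_sad

if __name__ == "__main__":
    ref = np.random.randint(0, 256, (64, 128)).astype(np.uint8)
    block = ref[10:18, 40:48]                       # 8x8 input pixel block located at row 10
    win = warp_search_window(ref, block_row=10, win_top=2, win_h=24,
                             win_left=32, win_w=24, frame_h=64)
    (dy, dx), sad = best_match(block, win)          # differential coding would use this predictor
```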
Abstract:
A scalable coding system codes video as a base layer representation and an enhancement layer representation. A base layer coder may code an LDR representation of a source video. A predictor may predict an HDR representation of the source video from the coded base layer data. A comparator may generate prediction residuals which represent a difference between an HDR representation of the source video and the predicted HDR representation of the source video. A quantizer may quantize the residuals down to an LDR representation. An enhancement layer coder may code the LDR residuals. In other scalable coding systems, the enhancement layer coder may code LDR-converted HDR video directly.
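A toy sketch of the base-layer/enhancement-layer split, assuming a simple Reinhard-style tone map as the LDR conversion and its inverse as the inter-layer predictor; the tone map, predictor, and quantization step size are placeholders, not the disclosed design.

```python
import numpy as np

def tone_map(hdr):
    """Hypothetical base-layer conversion: Reinhard-style curve down to 8-bit LDR."""
    return np.clip(255.0 * hdr / (1.0 + hdr), 0, 255).astype(np.uint8)

def predict_hdr_from_ldr(ldr):
    """Hypothetical inter-layer predictor: invert the tone map on the coded base layer."""
    y = ldr.astype(np.float64) / 255.0
    return y / np.maximum(1.0 - y, 1e-3)

def code_enhancement_layer(hdr, predicted_hdr, step=0.05):
    """Residuals between the HDR source and its prediction, quantized to an 8-bit (LDR-range) signal."""
    residual = hdr - predicted_hdr
    return np.clip(np.round(residual / step) + 128, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    hdr = np.random.rand(4, 4) * 10.0        # linear-light HDR source
    base = tone_map(hdr)                     # coded by the base-layer (LDR) coder
    pred = predict_hdr_from_ldr(base)        # predictor works from coded base-layer data
    enh = code_enhancement_layer(hdr, pred)  # quantized residuals fed to the enhancement-layer coder
```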
Abstract:
Embodiments of the present disclosure provide systems and methods for background concealment in a video conferencing session. In one exemplary method, a video stream may be captured and provided to a first terminal participating in a video conferencing session. A background element and a foreground element may be determined in the video stream. A border region may additionally be determined in the video stream. The border region may define a boundary between the foreground element and the background element. The background element may be modified based, at least in part, on video content of the border region. The modified video stream may be transmitted to a second terminal participating in the video conferencing session.
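A minimal sketch of modifying the background from the border region's content, assuming the foreground mask is already available and treating the border region as a thin dilation band around it; the flat-fill concealment shown is only one illustrative choice.

```python
import numpy as np

def conceal_background(frame, fg_mask, border_width=4):
    """Replace the background element with a fill color derived from the border
    region (a thin band of background pixels surrounding the foreground mask)."""
    bg_mask = ~fg_mask
    # Dilate the foreground mask a few pixels to locate the border band.
    dilated = fg_mask.copy()
    for _ in range(border_width):
        d = dilated.copy()
        d[1:, :] |= dilated[:-1, :]
        d[:-1, :] |= dilated[1:, :]
        d[:, 1:] |= dilated[:, :-1]
        d[:, :-1] |= dilated[:, 1:]
        dilated = d
    border = dilated & bg_mask
    fill = frame[border].mean(axis=0) if border.any() else frame[bg_mask].mean(axis=0)
    out = frame.copy()
    out[bg_mask] = fill.astype(frame.dtype)
    return out

if __name__ == "__main__":
    frame = np.random.randint(0, 256, (72, 96, 3), dtype=np.uint8)
    fg_mask = np.zeros((72, 96), dtype=bool)
    fg_mask[20:52, 30:66] = True            # pretend segmentation output
    concealed = conceal_background(frame, fg_mask)
```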
Abstract:
A system and method for using camera capture settings and related metadata to estimate the parameters for encoding a frame of the captured video data and to modify reference frames to accommodate detected camera setting changes. Global brightness and color changes in video capture may be modeled by performing a sequence of transform operations on the reference frames to further improve the coding efficiency of a video coding system.
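A small sketch of the reference-frame adaptation, assuming the capture metadata carries exposure time, ISO gain, and per-channel white-balance gains; the multiplicative gain model and the metadata field names are assumptions made for illustration.

```python
import numpy as np

def transform_reference(ref, prev_meta, cur_meta):
    """Model a global brightness/color change between frames from capture metadata
    and apply it to the reference frame so it better predicts the new frame."""
    gain = (cur_meta["exposure_s"] * cur_meta["iso"]) / (prev_meta["exposure_s"] * prev_meta["iso"])
    wb = np.array(cur_meta["wb_gains"]) / np.array(prev_meta["wb_gains"])
    out = ref.astype(np.float64) * gain * wb          # global gain plus per-channel scale
    return np.clip(out, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    ref = np.random.randint(0, 200, (2, 2, 3), dtype=np.uint8)
    prev_meta = {"exposure_s": 1 / 60, "iso": 100, "wb_gains": (1.0, 1.0, 1.0)}
    cur_meta = {"exposure_s": 1 / 30, "iso": 100, "wb_gains": (1.1, 1.0, 0.9)}
    adapted_ref = transform_reference(ref, prev_meta, cur_meta)  # candidate predictor for the new frame
```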
Abstract:
An encoder may include a luma transform, a transformer, and a chroma transform. The luma transform may determine a linear luminance value based upon a plurality of primary color values of a pixel. The transformer may generate a transformed luminance value based upon the linear luminance value, and a plurality of transformed color values based upon corresponding ones of the primary color values of the pixel. The chroma transform may determine a plurality of chroma values based upon the corresponding transformed color values and the transformed luminance value of the pixel.
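A compact sketch of the three stages, assuming Rec. 709 luminance weights, a simple power-law transfer function, and unnormalized chroma differences; none of these specific constants come from the disclosure.

```python
import numpy as np

def transfer(x, gamma=1.0 / 2.4):
    """Hypothetical transfer function applied by the 'transformer' stage."""
    return np.power(np.clip(x, 0.0, 1.0), gamma)

def encode_pixel(r, g, b):
    """r, g, b are linear-light primary color values in [0, 1]."""
    # Luma transform: linear luminance from the primaries (Rec. 709 weights assumed here).
    y_linear = 0.2126 * r + 0.7152 * g + 0.0722 * b
    # Transformer: transformed luminance plus transformed color values.
    y_t = transfer(y_linear)
    b_t = transfer(b)
    r_t = transfer(r)
    # Chroma transform: chroma values from the transformed color values and the
    # transformed luminance (left unnormalized here; a real system would scale
    # each difference into its allowed range).
    cb = b_t - y_t
    cr = r_t - y_t
    return y_t, cb, cr

if __name__ == "__main__":
    print(encode_pixel(0.25, 0.50, 0.75))
```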
Abstract:
A method for processing media assets includes deriving characteristics from a first media asset and searching for other media assets having characteristics that correlate to those of the first media asset. When a match is found, content corrections are derived for the first media asset or the matching media asset from the other of the two assets, and content of the first media asset or the matching media asset is corrected based on the content corrections.
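A minimal sketch of the matching-and-correction flow, assuming per-frame mean luma as the derived characteristic, correlation as the matching criterion, and missing frames (represented as None) as the defect to correct; all three are stand-ins for whatever characteristics and corrections an implementation actually uses.

```python
import numpy as np

def characteristics(asset_frames):
    """Derive a simple characteristic vector (per-frame mean luma) for matching."""
    return np.array([f.mean() for f in asset_frames])

def find_match(target_sig, library):
    """Return the library asset whose characteristics correlate best with the target's."""
    best_name, best_score = None, -1.0
    for name, sig in library.items():
        n = min(len(sig), len(target_sig))
        score = np.corrcoef(target_sig[:n], sig[:n])[0, 1]
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

def derive_corrections(damaged, reference):
    """Derive corrections for the damaged asset from the matching asset, modelled
    here as supplying replacements for frames flagged as corrupt (None)."""
    return {i: reference[i] for i, f in enumerate(damaged) if f is None}

if __name__ == "__main__":
    lib_asset = [np.full((8, 8), v, dtype=np.uint8) for v in (10, 50, 90, 130)]
    damaged = [lib_asset[0], None, lib_asset[2], lib_asset[3]]      # frame 1 is corrupt
    library = {"lib_asset": characteristics(lib_asset)}
    sig = np.array([f.mean() if f is not None else 0.0 for f in damaged])
    name, score = find_match(sig, library)
    fixes = derive_corrections(damaged, lib_asset)                  # {1: replacement frame}
```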
Abstract:
Coding techniques for image data may cause a still image to be converted to a “phantom” video sequence, which is coded by motion compensated prediction techniques. Thus, coded video data obtained from the coding operation may include temporal prediction references between frames of the video sequence. Metadata may be generated that identifies allocations of content from the still image to the frames of the video sequence. The coded data and the metadata may be transmitted to another device, whereupon it may be decoded by motion compensated prediction techniques and converted back to still image data. Other techniques may involve coding an image in both a base layer representation and at least one enhancement layer representation. The enhancement layer representation may be coded predictively with reference to the base layer representation. The coded base layer representation may be partitioned into a plurality of individually-transmittable segments and stored. Prediction references of elements of the enhancement layer representation may be confined to segments of the base layer representation that correspond to a location of those elements. That is, when a pixel block of an enhancement layer maps to a given segment of the base layer representation, prediction references are confined to that segment and do not reference portions of the base layer representation that may be found in other segment(s).
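A toy sketch of the still-to-phantom-sequence allocation and its inverse, assuming the still is simply tiled into equal-sized frames and that the metadata records each tile's origin; the actual allocation strategy and the motion compensated coding/decoding steps are elided.

```python
import numpy as np

def still_to_phantom_sequence(image, tile_h, tile_w):
    """Allocate content of a still image to frames of a 'phantom' video sequence,
    returning the frames plus metadata recording where each frame's content came from.
    Assumes the image dimensions are multiples of the tile sizes."""
    frames, metadata = [], []
    h, w = image.shape[:2]
    for top in range(0, h, tile_h):
        for left in range(0, w, tile_w):
            frames.append(image[top:top + tile_h, left:left + tile_w])
            metadata.append({"frame": len(frames) - 1, "top": top, "left": left})
    return frames, metadata

def phantom_sequence_to_still(frames, metadata, h, w):
    """Invert the allocation after decoding: reassemble the still from frames + metadata."""
    out = np.zeros((h, w) + frames[0].shape[2:], dtype=frames[0].dtype)
    for m in metadata:
        f = frames[m["frame"]]
        out[m["top"]:m["top"] + f.shape[0], m["left"]:m["left"] + f.shape[1]] = f
    return out

if __name__ == "__main__":
    still = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    frames, metadata = still_to_phantom_sequence(still, 16, 16)    # 16 "phantom" frames
    # ... frames would be coded/decoded with motion compensated prediction here ...
    restored = phantom_sequence_to_still(frames, metadata, 64, 64)
    assert np.array_equal(restored, still)
```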
Abstract:
An adaptive scaler switching system may implement multiple scalers, including both a software scaler and a hardware scaler, and a controller that may manage switching between scalers by considering the real-time constraints of the system and the available system resources. Information about the availability of system resources may be received in real time; for example, the controller may receive information about the system's thermal status, the timing requirements for processing the video data, the quality of the scaled data, and any other relevant system statistics that may affect the scaler switch decision. According to an embodiment, the system may maintain these statistics in a table and update the table information as necessary.
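A small sketch of the controller and its statistics table, assuming the scalers are plain callables and that the switch decision depends only on thermal status and the hardware scaler's measured latency against a frame deadline; the decision policy and table fields are illustrative.

```python
import time

class ScalerController:
    """Pick between a hardware and a software scaler from real-time system status,
    keeping per-scaler statistics in a table."""

    def __init__(self, hw_scaler, sw_scaler):
        self.scalers = {"hw": hw_scaler, "sw": sw_scaler}
        self.stats = {name: {"frames": 0, "total_ms": 0.0} for name in self.scalers}

    def choose(self, thermal_ok, deadline_ms):
        # Prefer the hardware scaler unless the device is thermally constrained or
        # its measured average latency no longer meets the frame deadline.
        hw = self.stats["hw"]
        hw_avg = hw["total_ms"] / hw["frames"] if hw["frames"] else 0.0
        return "hw" if thermal_ok and hw_avg <= deadline_ms else "sw"

    def scale(self, frame, out_size, thermal_ok=True, deadline_ms=16.0):
        name = self.choose(thermal_ok, deadline_ms)
        start = time.perf_counter()
        result = self.scalers[name](frame, out_size)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.stats[name]["frames"] += 1          # update the statistics table
        self.stats[name]["total_ms"] += elapsed_ms
        return result

if __name__ == "__main__":
    hw = lambda frame, size: frame               # stand-ins for real scaler back ends
    sw = lambda frame, size: frame
    ctrl = ScalerController(hw, sw)
    out = ctrl.scale([[0] * 4] * 4, (2, 2), thermal_ok=True, deadline_ms=16.0)
```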