Abstract:
Embodiments of the present disclosure provide systems and methods for background concealment in a video conferencing session. In one exemplary method, a video stream may be captured and provided to a first terminal participating in a video conferencing session. A background element and a foreground element may be determined in the video stream. A border region may additionally be determined in the video stream. The border region may define a boundary between the foreground element and the background element. The background element may be modified based, at least in part, on video content of the border region. The modified video stream may be transmitted to a second terminal participating in the video conferencing session.
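The concealment step described above can be illustrated with a minimal sketch. It assumes a grayscale frame stored as nested lists and a precomputed foreground mask. It also assumes that the border region is the set of background pixels touching the foreground, and that the modification is a flat fill with the border's mean value. The abstract specifies none of these choices, and the function names `find_border` and `conceal_background` are hypothetical.

```python
def find_border(mask):
    """Return the (row, col) background pixels that touch a foreground pixel.

    Assumption: the border region is defined by 4-neighbour adjacency.
    """
    rows, cols = len(mask), len(mask[0])
    border = set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:  # foreground pixel, not part of this border
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                    border.add((r, c))
                    break
    return border


def conceal_background(frame, mask):
    """Replace every background pixel with the mean value of the border region."""
    border = find_border(mask)
    if not border:
        # No foreground/background boundary found: return an unmodified copy.
        return [row[:] for row in frame]
    fill = sum(frame[r][c] for r, c in border) / len(border)
    return [
        [frame[r][c] if mask[r][c] else fill for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


frame = [[10, 20, 30], [40, 200, 60], [70, 80, 90]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
concealed = conceal_background(frame, mask)
```

In this toy frame the single foreground pixel keeps its value, while every background pixel takes the mean of the four border pixels.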
Abstract:
Techniques are disclosed for selecting deblocking filter parameters in a video decoding system. According to these techniques, a boundary strength parameter may be determined based, at least in part, on a bit depth of decoded video data. Activity of a pair of decoded pixel blocks may be classified based, at least in part, on the determined boundary strength parameter. When a level of activity indicates that deblocking filtering is to be applied to the pair of pixel blocks, pixel block content at a boundary between the pair of pixel blocks may be filtered using filtering parameters derived based, at least in part, on the bit depth of the decoded video data. The filtering parameters may decrease in strength with increasing bit depth of the decoded video data, which improves quality of the decoded video data.
Abstract:
In video conferencing over a radio network, the radio equipment is a major power consumer, especially in cellular networks such as LTE. To reduce radio power consumption in video conferencing, it is important to introduce sufficient radio inactive time. Several types of data buffering and bundling can be employed within a reasonable range of latency that does not significantly disrupt the real-time nature of video conferencing. In addition, data transmission can be synchronized to data reception in a controlled manner, which can result in an even longer radio inactive time and thus take advantage of radio power saving modes such as LTE C-DRX.
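One way to picture the bundling idea is the sketch below, which groups packets into bursts so that no packet waits longer than a latency budget. The greedy policy, the function name `bundle_packets`, and the millisecond timestamps are illustrative assumptions, not details taken from the abstract.

```python
def bundle_packets(arrival_ms, max_delay_ms):
    """Group packet arrival times into bursts.

    Each burst is sent when its oldest packet has waited max_delay_ms.
    Returns a list of (send_time_ms, [arrival times in the burst]).
    """
    bursts = []
    current = []
    deadline = None
    for t in sorted(arrival_ms):
        if current and t > deadline:
            # The open burst was already sent at its deadline; close it.
            bursts.append((deadline, current))
            current = []
        if not current:
            # First packet of a new burst sets the send deadline.
            deadline = t + max_delay_ms
        current.append(t)
    if current:
        bursts.append((deadline, current))
    return bursts


bursts = bundle_packets([0, 10, 20, 50, 55, 120], max_delay_ms=30)
```

With these inputs the radio would wake three times instead of six, leaving longer idle gaps between bursts.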
Abstract:
Video coders may perform perspective transformation of reference frames during coding in a manner that conserves processing resources. When a new input frame is available for coding, a camera position for the input frame may be estimated. A video coder may search for reference pictures having camera positions similar to the position of the input frame and, among the reference pictures identified, the video coder may perform a prediction search to identify the reference picture that is the best prediction match for the input frame. Once the video coder identifies a reference picture to serve as a prediction source for the input frame, the video coder may derive a transform to match the reference picture data to the input frame data and may transform the reference picture accordingly. The video coder may code the input frame using the transformed reference picture as a prediction reference and may transmit coded frame data and the camera position of the input frame to a decoder. Thus, the video coder may perform derivation and execution of transforms on a limited basis, which conserves system resources.
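The two-stage search described above might be sketched as follows. The sketch assumes camera positions are 3-D coordinates compared by Euclidean distance and that prediction quality is measured by a sum of absolute differences (SAD) over flattened pixel lists. The dictionary layout and the name `select_reference` are hypothetical, and the transform derivation itself is omitted.

```python
import math


def select_reference(input_pos, input_pixels, references, max_distance):
    """Pick a prediction reference for an input frame.

    Stage 1: keep reference pictures whose camera position lies within
             max_distance of the input frame's camera position.
    Stage 2: among those, return the one with the lowest SAD.
    Returns None when no reference picture is close enough.
    """
    nearby = [
        ref for ref in references
        if math.dist(input_pos, ref["pos"]) <= max_distance
    ]
    if not nearby:
        return None

    def sad(ref):
        return sum(abs(a - b) for a, b in zip(input_pixels, ref["pixels"]))

    return min(nearby, key=sad)


references = [
    {"id": "A", "pos": (0.0, 0.0, 0.0), "pixels": [10, 10, 10]},
    {"id": "B", "pos": (0.1, 0.0, 0.0), "pixels": [11, 11, 10]},
    {"id": "C", "pos": (5.0, 5.0, 5.0), "pixels": [11, 11, 11]},
]
best = select_reference((0.0, 0.0, 0.0), [11, 11, 11], references, max_distance=1.0)
```

Picture C matches the input pixels exactly but is excluded by the position filter, so a transform would only ever be derived for the single picture that survives both stages.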
Abstract:
A system and method are provided for using camera capture settings and related metadata to estimate the parameters for encoding a frame of the captured video data and to modify reference frames to accommodate detected camera setting changes. Global brightness and color changes in video capture may be modeled by performing a sequence of transform operations on the reference frames to further improve the coding efficiency of a video coding system.
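A global brightness change can be modeled, in the simplest case, as a gain and offset applied to each reference pixel. The sketch below assumes linear-light 8-bit samples and a gain derived from the ratio of exposure times. Both are simplifying assumptions, and the function names are hypothetical rather than taken from the abstract.

```python
def gain_from_exposure(ref_exposure_ms, cur_exposure_ms):
    """Estimate a brightness gain from a change in exposure time.

    Assumption: pixel values scale linearly with exposure time.
    """
    return cur_exposure_ms / ref_exposure_ms


def adjust_reference(ref_pixels, gain, offset, max_value=255):
    """Apply a gain/offset transform to reference pixels, clipping to the valid range."""
    return [min(max_value, max(0, round(gain * p + offset))) for p in ref_pixels]


gain = gain_from_exposure(ref_exposure_ms=10, cur_exposure_ms=20)
adjusted = adjust_reference([10, 100, 200], gain, offset=0)
```

After the adjustment, the reference frame is closer in brightness to the newly captured frame, so the prediction residual that must be coded is smaller.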
Abstract:
An encoder may include a luma transform, a transformer, and a chroma transform. The luma transform may determine a linear luminance value based upon a plurality of primary color values of a pixel. The transformer may generate a transformed luminance value based upon the linear luminance value and a plurality of transformed color values based upon corresponding ones of the primary color values of the pixel. The chroma transform may determine a plurality of chroma values based upon the corresponding plurality of transformed color values and the transformed luminance value of the pixel.
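The three stages can be sketched for a single pixel as follows. The luminance weights are the BT.709 coefficients. The simple power-law transfer function, the choice of blue and red as the transformed primaries, and the unscaled chroma differences are assumptions made for illustration, and `encode_pixel` is a hypothetical name.

```python
# BT.709 luminance weights for linear R, G, B (they sum to 1).
W_R, W_G, W_B = 0.2126, 0.7152, 0.0722


def transfer(value, gamma=1 / 2.2):
    """Assumed power-law transfer function for values in [0, 1]."""
    return value ** gamma


def encode_pixel(r, g, b):
    """Return (transformed luminance, Cb, Cr) for linear RGB values in [0, 1]."""
    # Luma transform: linear luminance from the primary color values.
    y_linear = W_R * r + W_G * g + W_B * b
    # Transformer: apply the transfer function to luminance and two primaries.
    y_t = transfer(y_linear)
    b_t = transfer(b)
    r_t = transfer(r)
    # Chroma transform: differences against the transformed luminance.
    cb = b_t - y_t
    cr = r_t - y_t
    return y_t, cb, cr


grey = encode_pixel(0.5, 0.5, 0.5)
blue = encode_pixel(0.0, 0.0, 1.0)
```

Because luminance is computed before the transfer function is applied, a neutral grey pixel yields chroma values of approximately zero, while a pure blue pixel yields a positive Cb and a negative Cr.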
Abstract:
A method for processing media assets includes, given a first media asset, deriving characteristics from the first media asset, searching for other media assets having characteristics that correlate to the characteristics of the first media asset, when a match is found, deriving content corrections for the first media asset or a matching media asset from the other of the first media asset or the matching media asset, and correcting content of the first media asset or the matching media asset based on the content corrections.
Abstract:
Coding techniques for image data may cause a still image to be converted to a “phantom” video sequence, which is coded by motion compensated prediction techniques. Thus, coded video data obtained from the coding operation may include temporal prediction references between frames of the video sequence. Metadata may be generated that identifies allocations of content from the still image to the frames of the video sequence. The coded data and the metadata may be transmitted to another device, whereupon the coded data may be decoded by motion compensated prediction techniques and converted back to still image data. Other techniques may involve coding an image in both a base layer representation and at least one coded enhancement layer representation. The enhancement layer representation may be coded predictively with reference to the base layer representation. The coded base layer representation may be partitioned into a plurality of individually-transmittable segments and stored. Prediction references of elements of the enhancement layer representation may be confined to segments of the base layer representation that correspond to a location of those elements. That is, when a pixel block of an enhancement layer maps to a given segment of the base layer representation, prediction references are confined to that segment and do not reference portions of the base layer representation that may be found in other segment(s).
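The conversion between a still image and a phantom sequence, together with its allocation metadata, might look like the sketch below. It assumes the image is split into horizontal strips of equal height and leaves out the motion compensated coding step entirely. The function names and the metadata fields `frame` and `top` are hypothetical.

```python
def image_to_frames(image, frame_height):
    """Split an image (a list of pixel rows) into strips treated as video frames.

    Returns (frames, metadata); each metadata entry records which frame
    holds the strip and the image row at which that strip starts.
    """
    frames, metadata = [], []
    for top in range(0, len(image), frame_height):
        frames.append(image[top:top + frame_height])
        metadata.append({"frame": len(frames) - 1, "top": top})
    return frames, metadata


def frames_to_image(frames, metadata):
    """Rebuild the still image from the frames using the allocation metadata."""
    image = []
    for entry in sorted(metadata, key=lambda m: m["top"]):
        image.extend(frames[entry["frame"]])
    return image


image = [[r * 10 + c for c in range(4)] for r in range(6)]
frames, metadata = image_to_frames(image, frame_height=2)
restored = frames_to_image(frames, metadata)
```

The metadata is what allows the receiving device to place each decoded frame back at its original location in the still image.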
Abstract:
An adaptive scaler switching system may implement multiple scalers, including both a software scaler and a hardware scaler, and a controller that may manage the switch between scalers by considering the real-time constraints of the system and the available system resources. Information about the availability of system resources may be received in real time. For example, the controller may receive information about the system thermal status, the timing requirements for processing the video data, the quality of the scaled data, and any other relevant system statistics that may affect the scaler switch decision. According to an embodiment, the system may maintain statistics in a table and update the table information as necessary.
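A controller decision of this kind could be sketched as below. The statistics table layout, the `heavy_cpu` flag, and the policy itself (drop scalers that miss the frame deadline or aggravate thermal pressure, then prefer quality) are all illustrative assumptions, since the abstract does not specify a policy.

```python
def choose_scaler(stats, frame_deadline_ms, thermal_pressure):
    """Pick a scaler name from a statistics table.

    stats maps scaler name -> {"ms_per_frame", "quality", "heavy_cpu"}.
    """
    candidates = [
        name for name, s in stats.items()
        if s["ms_per_frame"] <= frame_deadline_ms
        and not (thermal_pressure and s["heavy_cpu"])
    ]
    if not candidates:
        # Nothing satisfies the constraints: fall back to the fastest scaler.
        return min(stats, key=lambda name: stats[name]["ms_per_frame"])
    return max(candidates, key=lambda name: stats[name]["quality"])


# Hypothetical table values; a real system would update these at run time.
stats = {
    "software": {"ms_per_frame": 12, "quality": 0.9, "heavy_cpu": True},
    "hardware": {"ms_per_frame": 4, "quality": 0.8, "heavy_cpu": False},
}
relaxed = choose_scaler(stats, frame_deadline_ms=16, thermal_pressure=False)
tight = choose_scaler(stats, frame_deadline_ms=8, thermal_pressure=False)
hot = choose_scaler(stats, frame_deadline_ms=16, thermal_pressure=True)
```

Keeping the measurements in a table separates the switching policy from how the statistics are gathered, so the table can be refreshed without changing the decision logic.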
Abstract:
Coding techniques for input video may include assigning picture identifiers to input frames in either long-form or short-form formats. If a network error has occurred that results in loss of previously-coded video data, a new input frame may be assigned a picture identifier that is coded in a long-form coding format. If no network error has occurred, the input frame may be assigned a picture identifier that is coded in a short-form coding format. Long-form coding may mitigate loss of picture identifier synchronization between an encoder and a decoder.
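The selection rule can be sketched in a few lines. The sketch assumes the short form carries only the low-order bits of a running picture number and the long form carries a wider field. The bit widths and the name `code_picture_id` are hypothetical.

```python
def code_picture_id(picture_number, network_error, short_bits=4, long_bits=16):
    """Return (format, coded identifier) for an input frame.

    Assumption: each form transmits the picture number modulo 2**bits.
    """
    if network_error:
        # Long form: enough bits for the decoder to resynchronize after loss.
        return ("long", picture_number % (1 << long_bits))
    # Short form: fewer bits, sufficient while no frames have been lost.
    return ("short", picture_number % (1 << short_bits))


normal = code_picture_id(37, network_error=False)
after_loss = code_picture_id(37, network_error=True)
```

Under these assumptions, the short form for picture 37 wraps to 5, which a decoder that has lost several frames could misattribute to an earlier picture, whereas the long form still identifies the picture unambiguously.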