Abstract:
Some embodiments provide a method for conducting a video conference between a first mobile device and a second device. The first mobile device includes first and second cameras. The method selects the first camera for capturing images. The method transmits images captured by the first camera to the second device. The method receives selections of the second camera for capturing images during the video conference. The method terminates the transmission of images captured by the first camera and transmits images captured by the second camera of the first mobile device to the second device during the video conference.
Abstract:
Video coders and decoders perform transform coding and decoding on blocks of video content according to an adaptively selected transform type. The transform types are organized into a hierarchy of transform sets where each transform set includes a respective number of transforms and each higher-level transform set includes the transforms of each lower-level transform set within the hierarchy. The video coders and video decoders may exchange signaling that establishes a transform set context from which a transform set that was selected for coding given block(s) may be identified. The video coders and video decoders may exchange signaling that establishes a transform decoding context from which a transform that was selected from the identified transform set to be used for decoding the transform unit. The block(s) may be coded and decoded by the selected transform.
Abstract:
Systems and methods disclosed for video compression, utilizing neural networks for predictive video coding. Processes employed combine multiple banks of neural networks with codec system components to carry out the coding and decoding of video data.
Abstract:
Techniques are disclosed by which a coding parameter is determined to encode video data resulting in encoded video data possessing a highest possible video quality. Features may be extracted from an input video sequence. The extracted features may be compared to features described in a model of coding parameters generated by a machine learning algorithm from reviews of previously-coded videos, extracted features of the previously-coded videos, and coding parameters of the previously-coded videos. When a match is detected between the extracted features of the input video sequence and extracted features represented in the model, a determination may be made as to whether coding parameters that correspond to the matching extracted feature correspond to a tier of service to which the input video sequence is to be coded. When the coding parameters that correspond to the matching extracted feature correspond to the tier of service to which the input video sequence is to be coded, the input video sequence may be coded according to the coding parameters.
Abstract:
The present disclosure describes techniques for efficient coding of motion vectors developed for multi-hypothesis coding applications. According to these techniques, when coding hypotheses are developed, each having a motion vector identifying a source of prediction for a current pixel block, a motion vector for a first one of the coding hypotheses may be predicted from the motion vector of a second coding hypothesis. The first motion vector may be represented by coding a motion vector residual, which represents a difference between the developed motion vector for the first coding hypothesis and the predicted motion vector for the first coding hypothesis, and outputting the coded residual to a channel. In another embodiment, a motion vector residual may be generated for a motion vector of a first coding hypothesis, and the first motion vector and the motion vector residual may be used to predict a second motion vector and a predicted motion vector residual. The second hypothesis's motion vector may be coded as a difference between the motion vector, the predicted second motion vector, and the predicted motion vector residual. In a further embodiment, a single motion vector residual may be output for the motion vectors of two coding hypotheses representing a difference between the motion vector of one of the hypotheses and a predicted motion vector for that hypothesis.
Abstract:
The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to plurality of hypotheses. Data of the coded pixel block may be transmitted to a decoder along with data identifying a number of the hypotheses used during the coding to a channel. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypothesis identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.
Abstract:
The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to plurality of hypotheses. Data of the coded pixel block may be transmitted to a decoder along with data identifying a number of the hypotheses used during the coding to a channel. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypothesis identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.
Abstract:
A system comprises an encoder configured to compress attribute and/or spatial information for a point cloud and/or a decoder configured to decompress compressed attribute and/or spatial information for the point cloud. To compress the attribute and/or spatial information, the encoder is configured to convert a point cloud into an image based representation. Also, the decoder is configured to generate a decompressed point cloud based on an image based representation of a point cloud. In some embodiments, an encoder performs downscaling of an image frame prior to video encoding and a decoder performs upscaling of an image frame subsequent to video decoding.
Abstract:
A system obtains a data set representing immersive video content for display at a display time, including first data representing the content according to a first level of detail, and second data representing the content according to a second higher level of detail. During one or more first times prior to the display time, the system causes at least a portion of the first data to be stored in a buffer. During one or more second times prior to the display time, the system generates a prediction of a viewport for displaying the content to a user at the display time, identifies a portion of the second data corresponding to the prediction of the viewport, and causes the identified portion of the second data to be stored in the video buffer. At the display time, the system causes the content to be displayed to the user using the video buffer.
Abstract:
The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to plurality of hypotheses. Data of the coded pixel block may be transmitted to a decoder along with data identifying a number of the hypotheses used during the coding to a channel. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypothesis identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.