Abstract:
Different candidate image data feature types are evaluated to identify one or more specific image data feature types to be used in training a prediction model for optimizing one or more image metadata parameters. A plurality of image data features of the one or more selected image data feature types is extracted from one or more images. The plurality of image data features of the one or more selected image data feature types is reduced into a plurality of significant image data features. A total number of image data features in the plurality of significant image data features is no larger than a total number of image data features in the plurality of image data features of the one or more selected image data feature types. The plurality of significant image data features is applied to training the prediction model for optimizing one or more image metadata parameters.
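The reduction step can be illustrated with a minimal sketch. The abstract does not say how significant features are chosen, so this example assumes a simple variance ranking as a stand-in; the function name `select_significant` and the `keep` parameter are hypothetical. It only demonstrates the stated property that the reduced set is no larger than the extracted set.

```python
from statistics import pvariance


def select_significant(features, keep):
    """Keep the `keep` highest-variance feature columns.

    features: list of per-image feature vectors of equal length.
    Returns (kept_indices, reduced_vectors).
    Variance ranking is an assumed criterion, not the one in the abstract.
    """
    columns = list(zip(*features))  # one tuple per feature across all images
    ranked = sorted(range(len(columns)),
                    key=lambda j: pvariance(columns[j]),
                    reverse=True)
    kept = sorted(ranked[:keep])    # restore original feature order
    reduced = [[row[j] for j in kept] for row in features]
    return kept, reduced


features = [[1.0, 5.0, 0.0],
            [1.0, 7.0, 10.0],
            [1.0, 9.0, 20.0]]
kept, reduced = select_significant(features, keep=2)
```

Here the first column is constant across images, so it carries no information and is dropped; the reduced vectors would then be passed on to model training.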
Abstract:
In a method to reconstruct a high dynamic range video signal, a decoder receives parameters in the input bitstream to generate a prediction function. Using the prediction function, it generates a first set of nodes for a first prediction lookup table, wherein each node is characterized by an input node value and an output node value. Then, it modifies the output node values of one or more of the first set of nodes to generate a second set of nodes for a second prediction lookup table, and generates output prediction values using the second lookup table. Low-complexity methods to modify the output node value of a current node in the first set of nodes based on computing modified slopes between the current node and the nodes surrounding the current node are presented.
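The two-table flow can be sketched as follows. The abstract does not give the slope-modification rule, so this example assumes one possibility: the modified slope is the average of the slopes to the left and right neighbors, and the new output value is re-derived from the left neighbor using that slope. The function names, the uniform node spacing, and the linear interpolation in `lookup` are all assumptions made for illustration.

```python
from bisect import bisect_right


def build_nodes(predict, x_min, x_max, count):
    """First lookup table: sample the prediction function at uniform inputs."""
    step = (x_max - x_min) / (count - 1)
    return [(x_min + i * step, predict(x_min + i * step)) for i in range(count)]


def modify_nodes(nodes):
    """Second lookup table: adjust interior output values.

    Assumed rule: modified slope = mean of left and right slopes,
    computed from the original (unmodified) neighbors.
    """
    modified = list(nodes)
    for i in range(1, len(nodes) - 1):
        (x0, y0), (x1, y1), (x2, y2) = nodes[i - 1], nodes[i], nodes[i + 1]
        left = (y1 - y0) / (x1 - x0)
        right = (y2 - y1) / (x2 - x1)
        slope = 0.5 * (left + right)
        modified[i] = (x1, y0 + slope * (x1 - x0))
    return modified


def lookup(nodes, x):
    """Linearly interpolate between nodes; inputs are clamped to the table range."""
    xs = [n[0] for n in nodes]
    x = min(max(x, xs[0]), xs[-1])
    i = min(max(bisect_right(xs, x) - 1, 0), len(nodes) - 2)
    (x0, y0), (x1, y1) = nodes[i], nodes[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


first = build_nodes(lambda x: x * x, 0.0, 4.0, 5)
second = modify_nodes(first)
value = lookup(second, 1.5)
```

The end nodes are left unchanged because they have only one neighbor. Each interior update needs two slope computations and one multiply-add, which is in keeping with the low-complexity goal stated above.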
Abstract:
Novel methods and systems for encoding standard dynamic range video to improve the final quality after converting standard dynamic range video into enhanced dynamic range video are disclosed. A dual layer codec structure that amplifies certain codeword ranges can be used to send enhanced information to the decoder in order to achieve an enhanced (higher bit depth) image signal. The enhanced standard dynamic range signal can then be up-converted to enhanced dynamic range video without banding artifacts in the areas corresponding to those certain codeword ranges.
Abstract:
Pixel data of a video sequence with enhanced dynamic range (EDR) are predicted based on pixel data of a corresponding video sequence with standard dynamic range (SDR) and an inter-layer predictor. Under a highlights clipping constraint, conventional SDR to EDR prediction is adjusted as follows: a) given a highlights threshold, the SDR to EDR predictor is adjusted to output a fixed output value for all input SDR pixel values larger than the highlights threshold, and b) given a dark-regions threshold, the residual values between the input EDR signal and its predicted value are set to zero for all input SDR pixel values lower than the dark-regions threshold. Example processes to determine the highlights and dark-regions thresholds and whether highlights clipping is occurring are provided.
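Adjustments (a) and (b) can be sketched per pixel as follows. The abstract does not say which fixed value the predictor outputs above the highlights threshold, so this example assumes it is the predictor's own output at the threshold; the function name `adjust_prediction` and the toy linear predictor are hypothetical.

```python
def adjust_prediction(sdr, edr, predictor, highlights_threshold, dark_threshold):
    """Return (predicted, residual) under a highlights-clipping constraint.

    sdr, edr: sequences of co-located pixel values.
    predictor: callable mapping one SDR value to a predicted EDR value.
    """
    # Assumption: the fixed output is the prediction at the threshold itself.
    fixed_output = predictor(highlights_threshold)
    predicted, residual = [], []
    for s, e in zip(sdr, edr):
        # (a) constant predictor output above the highlights threshold
        p = fixed_output if s > highlights_threshold else predictor(s)
        # (b) residual forced to zero below the dark-regions threshold
        r = 0 if s < dark_threshold else e - p
        predicted.append(p)
        residual.append(r)
    return predicted, residual


predicted, residual = adjust_prediction(
    sdr=[10, 100, 250],
    edr=[50, 420, 1100],
    predictor=lambda s: 4 * s,
    highlights_threshold=200,
    dark_threshold=20,
)
```

In this toy run the brightest pixel is predicted at the fixed value, so its full difference from the EDR input moves into the residual, while the darkest pixel contributes no residual at all.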
Abstract:
Coding syntaxes in compliance with the same or different VDR specifications may be signaled by upstream coding devices such as VDR encoders to downstream coding devices such as VDR decoders in a common vehicle in the form of RPU data units. VDR coding operations and operational parameters may be specified as sequence level, frame level, or partition level syntax elements in a coding syntax. Syntax elements in a coding syntax may be coded directly in one or more current RPU data units under a current RPU ID, predicted from other partitions/segments/ranges previously sent with the same current RPU ID, or predicted from other frame level or sequence level syntax elements previously sent with a previous RPU ID. A downstream device may perform decoding operations on multi-layered input image data based on received coding syntaxes to construct VDR images.
Abstract:
An image processing device receives one or more forward reshaped images that are generated by an image forward reshaping device from one or more wide dynamic range images based on a forward reshaping function. The forward reshaping function relates to a backward reshaping function. The image processing device performs one or more image transform operations on the one or more forward reshaped images to generate one or more processed forward reshaped images without performing backward reshaping operations on the one or more reshaped images or the one or more processed forward reshaped images based on the backward reshaping function. The one or more processed forward reshaped images are sent to a second image processing device.
Abstract:
A sequence of visual dynamic range (VDR) images is encoded using a standard dynamic range (SDR) base layer and one or more enhancement layers. A predicted VDR image is generated from an SDR input by using a weighted, multi-band, cross-color channel prediction model. Exponential weights with an adaptable decay parameter for each band are also presented.
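One way to picture a weighted, multi-band, cross-color channel predictor is sketched below. The abstract does not define the bands, the weight formula, or the per-band model, so everything here is assumed for illustration: bands are identified by a center value on the mean of the SDR channels, each band's weight decays exponentially with distance from that center at a per-band rate, and each band holds a linear cross-channel model for one output channel.

```python
import math


def band_weights(selector, centers, decays):
    """Normalized exponential weights, one per band (assumed form)."""
    raw = [math.exp(-d * abs(selector - c)) for c, d in zip(centers, decays)]
    total = sum(raw)
    return [w / total for w in raw]


def predict_pixel(sdr_rgb, centers, decays, band_coeffs):
    """Predict one VDR channel value from an SDR (r, g, b) triple.

    band_coeffs[k] = (bias, k_r, k_g, k_b): a hypothetical linear
    cross-color model for band k.
    """
    r, g, b = sdr_rgb
    selector = (r + g + b) / 3.0  # assumed band selector: channel mean
    weights = band_weights(selector, centers, decays)
    per_band = [c0 + cr * r + cg * g + cb * b
                for (c0, cr, cg, cb) in band_coeffs]
    return sum(w * p for w, p in zip(weights, per_band))


coeffs = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0)]
flat = predict_pixel((0.2, 0.4, 0.6), [0.25, 0.75], [0.0, 0.0], coeffs)
sharp = predict_pixel((0.2, 0.4, 0.6), [0.25, 0.75], [50.0, 50.0], coeffs)
```

With a decay of zero every band contributes equally, while a large decay lets the nearest band dominate; adapting the decay per band trades smooth blending against band-local accuracy.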
Abstract:
An encoder receives an input enhanced dynamic range (EDR) image to be stored or transmitted using multiple coding formats in a layered representation. A layer decomposer generates a lower dynamic range (LDR) image from the EDR image. One or more base layer (BL) encoders encode the LDR image to generate a main coded BL stream and one or more secondary coded BL streams, where each secondary BL stream is coded in a different coding format than the main coded BL stream. A single enhancement layer (EL) coded stream and related metadata are generated using the main coded BL stream, the LDR image, and the input EDR image. An output coded stream includes the coded EL stream, the metadata, and either the main coded BL stream or one of the secondary coded BL streams. Computation-scalable decoding and display management processes for EDR images are also described.
Abstract:
Sample data and metadata related to spatial regions in images may be received from a coded video signal. It is determined whether specific spatial regions in the images correspond to a specific region of luminance levels. In response to determining the specific spatial regions correspond to the specific region of luminance levels, signal processing and video compression operations are performed on sets of samples in the specific spatial regions. The signal processing and video compression operations are at least partially dependent on the specific region of luminance levels.