Abstract:
A device, an encoding method, and a decoding method enable separate marking of base representations and enhanced representations of key access units, either to save memory or to allow better optimization of scalable video coding. The method for encoding a sequence of original pictures into a sequence of access units includes, after encoding one of the access units, storing a first decoded picture of the first encoded picture and a second decoded picture of the second encoded picture for inter prediction when encoding others of the access units; and identifying the first decoded picture and the second decoded picture as no longer used for inter prediction. The decoding method includes decoding the first access unit, where a first decoded picture is decoded from the first picture and a second decoded picture is decoded from the second picture; marking the first and second decoded pictures as used for inter prediction; decoding the second access unit; and marking one of the first and second decoded pictures as no longer used for inter prediction.
Abstract:
A video coding and decoding method in which a picture is first divided into sub-pictures corresponding to one or more subjectively important picture regions and into a background region sub-picture, which remains after the other sub-pictures are removed from the picture. The sub-pictures are formed to conform to predetermined allowable groups of video coding macroblocks (MBs). The allowable groups of MBs can be, for example, of rectangular shape. The picture is then divided into slices so that each sub-picture is encoded independently of the other sub-pictures, except for the background region sub-picture, which may be coded using other sub-pictures. The slices of the background sub-picture are formed in scan order, skipping over MBs that belong to another sub-picture. The background sub-picture is decoded only if the positions and sizes of all other sub-pictures can be reconstructed when decoding the picture.
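The scan-order rule for the background sub-picture can be illustrated with a small sketch. It assumes rectangular foreground sub-pictures given as hypothetical `(x, y, width, height)` tuples in macroblock units and raster-scan MB addressing; neither detail is specified in the abstract.

```python
def background_scan(width_mbs, height_mbs, regions):
    """List background MB addresses in raster-scan order.

    regions: rectangles (x, y, w, h) in MB units occupied by other
    sub-pictures (assumed layout, for illustration only).
    """
    def covered(mx, my):
        # True if the MB at column mx, row my lies inside any foreground rectangle.
        return any(x <= mx < x + w and y <= my < y + h for x, y, w, h in regions)

    return [
        my * width_mbs + mx
        for my in range(height_mbs)
        for mx in range(width_mbs)
        if not covered(mx, my)
    ]

# A 4x3-MB picture with one 2x1 foreground region starting at column 1, row 1.
background = background_scan(4, 3, [(1, 1, 2, 1)])
```

The resulting address list is what a background slice would walk through; MBs 5 and 6 are skipped because they belong to the foreground rectangle.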
Abstract:
An encoder for use in scalable video coding has a mechanism to perform macroblock mode selection for the enhancement layer pictures. The mechanism includes a distortion estimator for each macroblock that reacts to channel errors such as packet losses or errors in video segments affected by error propagation; a Lagrange multiplier selector for selecting a weighting factor according to an estimated or signaled channel error rate; and a mode decision module or algorithm to choose the optimal mode based on encoding parameters. The mode decision module is configured to select the coding mode based on a sum of the estimated coding distortion and the estimated coding rate multiplied by the weighting factor.
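The mode decision described in this abstract amounts to picking the mode with the smallest cost D + w * R, where D is the estimated distortion, R the estimated rate, and w the weighting factor. A minimal sketch follows; the mode names, the distortion and rate figures, and the rule that maps the channel error rate to w are all illustrative assumptions, not values from the abstract.

```python
from dataclasses import dataclass

@dataclass
class ModeCandidate:
    name: str          # hypothetical mode label
    distortion: float  # estimated distortion, including channel-error effects
    rate: float        # estimated coding rate in bits

def select_weight(base_weight, loss_rate):
    """Assumed mapping: shrink the weighting factor as the loss rate grows."""
    return base_weight * (1.0 - loss_rate)

def choose_mode(candidates, weight):
    """Return the candidate with the smallest cost D + weight * R."""
    return min(candidates, key=lambda c: c.distortion + weight * c.rate)

candidates = [
    ModeCandidate("intra", distortion=40.0, rate=120.0),
    ModeCandidate("inter", distortion=90.0, rate=30.0),
    ModeCandidate("inter_layer", distortion=60.0, rate=60.0),
]
weight = select_weight(base_weight=1.0, loss_rate=0.5)
best = choose_mode(candidates, weight)
```

A small weight favors low-distortion modes, while a large weight favors cheap modes, which is why the choice of weighting factor matters as much as the distortion estimate.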
Abstract:
A method of encoding scalable video data having multiple layers, where each layer is associated with at least one other layer, includes identifying one or more layers using a first identifier, where the first identifier indicates decoding dependency, and identifying reference pictures within the identified one or more layers using a second identifier. The coding of the second identifier for pictures in a first layer is independent of pictures in a second enhancement layer. As such, for all pictures with a certain value of DependencyID, the syntax element frame_num is coded independently of other pictures with different values of DependencyID. Within all pictures with a pre-determined value of DependencyID, a default frame_num coding method is used.
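One way to picture the independent coding of frame_num is a separate counter per DependencyID value. The sketch below is a simplification: the class name, the wrap-around value, and the rule of incrementing only after reference pictures are assumptions made for illustration.

```python
from collections import defaultdict

MAX_FRAME_NUM = 16  # assumed wrap-around value for this sketch

class FrameNumCoder:
    """Hypothetical helper that keeps one frame_num counter per DependencyID."""

    def __init__(self, max_frame_num=MAX_FRAME_NUM):
        self.max_frame_num = max_frame_num
        self._next = defaultdict(int)  # DependencyID -> next frame_num

    def assign(self, dependency_id, is_reference):
        frame_num = self._next[dependency_id]
        if is_reference:
            # Only this layer's counter advances; other layers are untouched.
            self._next[dependency_id] = (frame_num + 1) % self.max_frame_num
        return frame_num

coder = FrameNumCoder()
# (DependencyID, is_reference) pairs in coding order.
pictures = [(0, True), (1, True), (0, True), (0, False), (1, True)]
assigned = [coder.assign(dep, ref) for dep, ref in pictures]
```

Because each layer has its own counter, dropping every picture with DependencyID 1 leaves the frame_num sequence of layer 0 without gaps.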
Abstract:
A system and method for providing improved FGS identification in scalable video coding. According to the present invention, each FGS enhancement layer is assigned a unique dependency identifier and contains only FGS enhancement information. For subsequent enhancement layers, the base dependency identifier for the subsequent enhancement layers will point to either a base-quality layer or an FGS enhancement layer.
Abstract:
Quality feedback in a streaming service, wherein at least one media stream (101) is streamed to a client (601) and a quality feedback value is determined (304) according to at least one quality metric, is improved by determining (305) a timestamp relating to said quality feedback value according to at least one timestamp metric and reporting (306) said quality feedback value and said related timestamp to a server (600). For each of said at least one quality metrics, a corresponding timestamp metric is defined, and each of said at least one timestamp metrics is based on a relative media playback time of said at least one media stream (101). Said relative media playback time is preferably derived from RTP (102) timestamps, from an NPT provided by RTSP (109), from RTCP timestamps, or from SIP timestamps.
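As an illustration of deriving a relative media playback time from RTP timestamps, the sketch below converts the tick difference from an initial timestamp into seconds and attaches it to a feedback value. The function names, the report fields, and the 90 kHz clock rate are assumptions; the arithmetic handles at most one wrap of the 32-bit RTP timestamp.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit values that wrap around

def relative_playback_time(rtp_ts, base_rtp_ts, clock_rate):
    """Seconds of media time elapsed since the stream's first RTP timestamp."""
    ticks = (rtp_ts - base_rtp_ts) % RTP_TS_MODULUS
    return ticks / clock_rate

def make_report(metric_name, value, rtp_ts, base_rtp_ts, clock_rate=90000):
    """Pair a quality feedback value with its timestamp (hypothetical format)."""
    return {
        "metric": metric_name,
        "value": value,
        "timestamp": relative_playback_time(rtp_ts, base_rtp_ts, clock_rate),
    }

# Three lost packets observed two seconds into the stream.
report = make_report("lost_packets", 3, rtp_ts=180000, base_rtp_ts=0)
```

Reporting media time rather than wall-clock time lets the server relate the feedback value to a position in the content regardless of network delay or client clock offset.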
Abstract:
A signaling method and device for use in stream switching in which GDR random access points are used. To indicate the GDR switching points in the bitstreams, a Sync Sample Information Box, which is contained in a Sync Sample Box, is used to provide information about such GDR switching points. If slice groups are applied in encoding, the information also identifies which slice group is the isolated region and which slice group is the leftover region. The signaling method can be used in video data transmission using the Real-time Transport Protocol (RTP), and the Session Description Protocol (SDP) can be used to convey information indicative of the characteristics of the bitstreams.
Abstract:
In one example, a device for coding video data includes a video coder configured to code, for a bitstream, information representative of which of a plurality of video coding dimensions are enabled for the bitstream, and code values for each of the enabled video coding dimensions, without coding values for the video coding dimensions that are not enabled, in a network abstraction layer (NAL) unit header of a NAL unit comprising video data coded according to the values for each of the enabled video coding dimensions. In this manner, NAL unit headers may have variable lengths, while still providing information for scalable dimensions to which the NAL units correspond.
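The variable-length header idea can be sketched as packing values only for the dimensions that the bitstream has declared enabled. The dimension names and bit widths below are hypothetical and do not reflect any standardized syntax.

```python
# Hypothetical dimensions and bit widths, in a fixed signaling order.
DIMENSIONS = [("temporal_id", 3), ("quality_id", 4), ("dependency_id", 3), ("view_id", 10)]

def pack_header(enabled, values):
    """Pack values of enabled dimensions only; return (bits, bit_length)."""
    bits, length = 0, 0
    for name, width in DIMENSIONS:
        if name not in enabled:
            continue  # disabled dimensions occupy no header bits
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        bits = (bits << width) | value
        length += width
    return bits, length

def unpack_header(enabled, bits, length):
    """Recover the per-dimension values from a packed header."""
    values, remaining = {}, length
    for name, width in DIMENSIONS:
        if name not in enabled:
            continue
        remaining -= width
        values[name] = (bits >> remaining) & ((1 << width) - 1)
    return values

enabled = {"temporal_id", "view_id"}  # signaled once for the bitstream
bits, length = pack_header(enabled, {"temporal_id": 5, "view_id": 513})
decoded = unpack_header(enabled, bits, length)
```

With two of the four dimensions enabled the header payload is 13 bits instead of 20, which is the saving the abstract attributes to variable-length NAL unit headers.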
Abstract:
An improved system and method for implementing efficient decoding of scalable video bitstreams is provided. A virtual decoded picture buffer is provided for each lower layer of the scalable video bitstream. The virtual decoded picture buffer stores decoded lower layer pictures for reference. The decoded lower layer pictures used for reference are compiled to create a reference picture list for each layer. The reference picture list generated by the virtual decoded picture buffer is used during a direct prediction process instead of a target reference list to correctly decode a current macroblock.
Abstract:
A system and method for signaling low-to-high layer switching points at the file format level to enable efficient scalable stream switching in streaming servers and local file playback. The present invention also provides a system and method for signaling low-to-high layer switching points in the video bitstream, e.g., to enable intelligent forwarding of scalability layers in media-aware network elements or computationally scalable decoding in stream recipients.