Abstract:
This invention discloses a deep reinforcement learning based adaptive bitrate (ABR) selection method and system for real-time streaming, in which deep reinforcement learning neural networks receive state observations and make bitrate decisions. A simulation environment is constructed to provide network states, including network QoS and playback status, to the agents and to compute accumulated rewards according to the bitrate actions the agents take. ARS balances a variety of QoE goals to determine the accumulated rewards. ARS also allows multiple agents to be trained concurrently and conducts the training process in the simulation environment to accelerate training. In addition, ARS supports training the ABR algorithm both online and offline.
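As an illustration of how an accumulated reward might balance several QoE goals, the following minimal sketch combines three commonly used terms: delivered bitrate, rebuffering time, and bitrate smoothness. The abstract does not specify the reward terms, so the terms, the weights (`w_quality`, `w_rebuffer`, `w_smooth`), the discount factor `gamma`, and the function names are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple


def step_reward(bitrate: float, prev_bitrate: float, rebuffer_s: float,
                w_quality: float = 1.0, w_rebuffer: float = 4.0,
                w_smooth: float = 1.0) -> float:
    """Per-step QoE reward (hypothetical form and weights, not from the source).

    Rewards higher bitrate, penalizes stall time and abrupt bitrate changes.
    """
    return (w_quality * bitrate
            - w_rebuffer * rebuffer_s
            - w_smooth * abs(bitrate - prev_bitrate))


def accumulated_reward(steps: Iterable[Tuple[float, float]],
                       gamma: float = 0.99) -> float:
    """Discounted sum of step rewards over (bitrate_mbps, rebuffer_s) pairs."""
    total = 0.0
    discount = 1.0
    prev: Optional[float] = None
    for bitrate, rebuffer_s in steps:
        if prev is None:
            prev = bitrate  # no smoothness penalty on the first decision
        total += discount * step_reward(bitrate, prev, rebuffer_s)
        discount *= gamma
        prev = bitrate
    return total
```

In a simulated training loop, each agent's bitrate action would produce one `(bitrate, rebuffer)` pair per step, and the accumulated value would serve as the training signal.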
Abstract:
Interactive-streaming-based applications, such as cloud gaming, giga-pixel streaming, and virtual reality, have rigorous requirements on network latency, which can be satisfied by routing users' requests over an overlay network. Existing overlay routing strategies suffer from high deployment and maintenance costs. An optimized cloud overlay routing system and method is discussed herein, which maximizes the number of user requests served for the interactive-streaming-based applications, lowers the deployment and maintenance costs of the overlay services, reduces the overall network delay, and balances the network loads by bypassing busy underlay links.
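One simple way to picture routing that reduces delay while bypassing busy links is a lowest-delay path search that ignores links above a utilization threshold. The sketch below uses Dijkstra's algorithm for this; it is a stand-in for illustration only, not the optimization described in the abstract. The graph layout, the `max_utilization` threshold, and the function name are assumptions.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical overlay graph: node -> list of (neighbor, delay_ms, utilization in [0, 1]).
Graph = Dict[str, List[Tuple[str, float, float]]]


def lowest_delay_path(graph: Graph, src: str, dst: str,
                      max_utilization: float = 0.8
                      ) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra over the overlay, skipping links busier than max_utilization."""
    best: Dict[str, float] = {src: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, src)]
    while heap:
        delay, node = heapq.heappop(heap)
        if node == dst:
            break  # the first time dst is popped, its delay is final
        if delay > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, link_delay, utilization in graph.get(node, []):
            if utilization > max_utilization:
                continue  # bypass busy links
            candidate = delay + link_delay
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if dst not in best:
        return None  # no path that avoids the busy links
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return best[dst], path
```

With a direct but heavily loaded link from `A` to `B`, the search returns a slightly longer detour through a lightly loaded relay, which is the load-balancing effect the abstract refers to.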
Abstract:
In a collaborative video processing method and system, a high resolution video input is optionally downscaled to a low resolution video using a down-sampling filter, and the low resolution video is then encoded by an end-to-end video coding system for streaming over the Internet. The original high resolution video is recovered at the client end by upscaling the low resolution video using a deep learning based high resolution scaling model, which can be trained in a pre-defined progressive order with low resolution videos having different compression parameters and downscaling factors.
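The downscale-then-upscale round trip can be sketched on a single grayscale frame. The abstract does not name the down-sampling filter, so a box (averaging) filter is assumed here. Nearest-neighbor upscaling is used only as a placeholder for the learned scaling model, which is not reproduced. Both function names are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame as rows of pixel values


def downscale_box(frame: Frame, factor: int) -> List[List[float]]:
    """Average each factor x factor tile (assumed filter, not from the source)."""
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be multiples of the factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            total = sum(frame[r + dr][c + dc]
                        for dr in range(factor) for dc in range(factor))
            row.append(total / (factor * factor))
        out.append(row)
    return out


def upscale_nearest(frame: Frame, factor: int) -> List[List[float]]:
    """Placeholder for the learned model: repeat each pixel factor x factor times."""
    out = []
    for row in frame:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out
```

In the described system the encoder and decoder sit between these two steps, and the placeholder upscaler would be replaced by the trained model.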
Abstract:
A method and apparatus for single-input multiple-output based media adaptation is disclosed. In one embodiment, such adaptation is performed in two steps. In step 1, content correlation between different compression schemes is used to perform inter-format adaptation of a stream in one compression format to an intermediate output stream in another compression format at the same quality level. In step 2, content correlation between different quality levels is used to perform intra-format adaptation of the intermediate output stream to multiple output streams at different quality levels in the same compression format. In one embodiment, content correlation is used to limit the search for mode candidates when performing both steps.
Abstract:
Methods and apparatus are provided for motion compensation with a smooth reference frame in bit depth scalability. An apparatus includes an encoder for encoding picture data for at least a portion of a picture by generating an inter-layer residue prediction for the portion using an inverse tone mapping operation performed in the pixel domain for bit depth scalability. The inverse tone mapping operation is shifted from a residue domain to the pixel domain.
Abstract:
An apparatus and method for video fingerprinting are provided. The method includes, for each frame of a video sequence including a plurality of frames, removing a portion of the frame, dividing a remaining portion of the frame into blocks, dividing each block into sub-blocks, computing a block level feature as a mean of pixels in each sub-block within the block, concatenating all block level features in the frame, and concatenating features of all frames in the video sequence.
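The fingerprinting steps above can be sketched as follows. The abstract does not say which portion of the frame is removed or how many blocks and sub-blocks are used, so a uniform border crop (`margin`) and square grids (`blocks`, `subs`) are assumed, and all function names are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame as rows of pixel values


def crop_border(frame: Frame, margin: int) -> List[List[float]]:
    """Remove a border of `margin` pixels (one assumed way to remove a portion)."""
    height, width = len(frame), len(frame[0])
    return [list(row[margin:width - margin])
            for row in frame[margin:height - margin]]


def region_mean(frame: Frame, top: int, left: int, h: int, w: int) -> float:
    """Mean pixel value of the h x w region starting at (top, left)."""
    total = sum(sum(frame[r][left:left + w]) for r in range(top, top + h))
    return total / (h * w)


def frame_fingerprint(frame: Frame, margin: int = 0,
                      blocks: int = 2, subs: int = 2) -> List[float]:
    """Concatenate sub-block means, block by block, for one frame."""
    kept = crop_border(frame, margin)
    bh, bw = len(kept) // blocks, len(kept[0]) // blocks  # block size
    sh, sw = bh // subs, bw // subs                       # sub-block size
    if sh == 0 or sw == 0:
        raise ValueError("frame too small for the requested grid")
    features: List[float] = []
    for by in range(blocks):
        for bx in range(blocks):
            # Block-level feature: the means of the sub-blocks in this block.
            for sy in range(subs):
                for sx in range(subs):
                    features.append(region_mean(
                        kept, by * bh + sy * sh, bx * bw + sx * sw, sh, sw))
    return features


def video_fingerprint(frames: Sequence[Frame], margin: int = 0,
                      blocks: int = 2, subs: int = 2) -> List[float]:
    """Concatenate the per-frame features of every frame in the sequence."""
    fingerprint: List[float] = []
    for frame in frames:
        fingerprint.extend(frame_fingerprint(frame, margin, blocks, subs))
    return fingerprint
```

With these defaults each frame contributes `blocks * blocks * subs * subs` values, so the fingerprint length grows linearly with the number of frames.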
Abstract:
In a video encoder, pixel values of a macro-block are processed to determine an activity measure indicative of the type of content in the macro-block. Several techniques are employed for determining the activity measure of a macro-block. In an embodiment, a default quantization scale for quantizing a macro-block is modified based on the activity measure of the macro-block. In another embodiment, the macro-block is classified into one of multiple classes based on its activity measure. The default quantization scale for quantizing the macro-block is modified based on the classification of the macro-block. In yet another embodiment, an encoding mode to be used for encoding a macro-block is also determined on the basis of the class of the macro-block. Several of the techniques exploit the fact that the human visual system (HVS) has different sensitivities in perceiving a (rendered) macro-block or video frame, depending on the type of macro-block content.
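A minimal sketch of activity-based quantization follows. The abstract mentions several techniques for the activity measure without naming them, so pixel variance is assumed here. The class thresholds, the per-class offsets, and the 1 to 31 scale range are also hypothetical values chosen for illustration.

```python
from typing import Sequence, Tuple


def activity(block: Sequence[Sequence[float]]) -> float:
    """Activity measure of a macro-block: pixel variance (assumed measure)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def classify(act: float, thresholds: Tuple[float, ...] = (25.0, 400.0)) -> int:
    """Map activity to a class index: 0 = flat, 1 = moderate, 2 = busy."""
    for index, threshold in enumerate(thresholds):
        if act < threshold:
            return index
    return len(thresholds)


def adjusted_qscale(default_q: int, cls: int,
                    offsets: Tuple[int, ...] = (-2, 0, 2),
                    qmin: int = 1, qmax: int = 31) -> int:
    """Modify the default quantization scale by class, clamped to [qmin, qmax].

    Flat blocks get finer quantization because artifacts are more visible
    there; busy blocks tolerate coarser quantization.
    """
    return max(qmin, min(qmax, default_q + offsets[cls]))
```

For example, a uniform 16x16 block has zero variance and falls into class 0, so a default scale of 10 becomes 8, while a high-contrast checkerboard falls into class 2 and is quantized at 12.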