Methods and systems for an efficient inter-prediction structure and signaling for low-delay video streaming

    Publication No.: US12113962B1

    Publication date: 2024-10-08

    Application No.: US17886111

    Filing date: 2022-08-11

    CPC classification number: H04N19/105 H04N19/172 H04N19/577 H04N19/159

    Abstract: Techniques for an efficient inter-prediction structure and signaling for low-delay streaming of live video are described. According to some examples, a computer-implemented method includes receiving a live video at a content delivery service, determining a subset of candidate reference frames from a plurality of frames received of the live video, generating an identification code, for the subset of candidate reference frames, having a multiple-bit format that includes a first bit value to indicate a corresponding candidate reference frame is a reference frame for an input frame from the live video and a second bit value to indicate the corresponding candidate reference frame is not the reference frame for the input frame from the live video, and, when a bit of the identification code for a first candidate reference frame is set to the first bit value to indicate the first candidate reference frame is one of a forward reference frame and a backward reference frame for the input frame from the live video, an immediately following bit of the identification code being set to the first bit value indicates the first candidate reference frame is also another of the one of the forward reference frame and the backward reference frame for the input frame from the live video, performing a real time encode of the input frame of the live video based at least in part on the identification code to generate an encoded frame by the content delivery service, and transmitting the encoded frame from the content delivery service to a viewer device.
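The multiple-bit identification code described in the abstract can be illustrated with a minimal sketch. Several details below are assumptions, not taken from the patent: the first bit value is taken to be 1 and the second to be 0, a candidate used in only one direction is assumed to be followed by an explicit 0 bit, and the function and constant names are hypothetical. The sketch does not say which direction (forward or backward) the first set bit refers to, because the abstract leaves that open.

```python
from typing import List

# Hypothetical usage labels for each candidate reference frame.
NOT_USED = 0         # candidate is not a reference for the input frame
ONE_DIRECTION = 1    # candidate is a forward OR a backward reference
BOTH_DIRECTIONS = 2  # candidate is both a forward and a backward reference


def encode_identification_code(usage: List[int]) -> str:
    """Build a bit string with one entry per candidate reference frame.

    Assumed layout: "0" = not used, "10" = used in one direction,
    "11" = used in both directions.
    """
    table = {NOT_USED: "0", ONE_DIRECTION: "10", BOTH_DIRECTIONS: "11"}
    try:
        return "".join(table[u] for u in usage)
    except KeyError as exc:
        raise ValueError(f"unknown usage label: {exc.args[0]}") from None


def decode_identification_code(code: str, num_candidates: int) -> List[int]:
    """Recover the per-candidate usage labels from the bit string."""
    if any(ch not in "01" for ch in code):
        raise ValueError("code must contain only '0' and '1'")
    usage: List[int] = []
    pos = 0
    for _ in range(num_candidates):
        if pos >= len(code):
            raise ValueError("code is too short for the candidate count")
        if code[pos] == "0":
            usage.append(NOT_USED)
            pos += 1
            continue
        # The bit is 1: the immediately following bit says whether the
        # candidate is also used in the other direction.
        if pos + 1 >= len(code):
            raise ValueError("missing bit after a set reference bit")
        usage.append(BOTH_DIRECTIONS if code[pos + 1] == "1" else ONE_DIRECTION)
        pos += 2
    if pos != len(code):
        raise ValueError("code has trailing bits")
    return usage


if __name__ == "__main__":
    code = encode_identification_code(
        [NOT_USED, ONE_DIRECTION, BOTH_DIRECTIONS, NOT_USED]
    )
    print(code)
    print(decode_identification_code(code, 4))
```

Under these assumptions, an unused candidate costs one bit and a used candidate costs two, so the code stays short when most candidates in the subset are not referenced.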

    COMPUTER-IMPLEMENTED MULTI-SCALE MACHINE LEARNING MODEL FOR THE ENHANCEMENT OF COMPRESSED VIDEO

    Publication No.: US20240236345A1

    Publication date: 2024-07-11

    Application No.: US18186084

    Filing date: 2023-03-17

    Abstract: The present disclosure relates to methods, apparatus, systems, and non-transitory computer-readable storage media for training and using a multi-scale machine learning model for the enhancement of compressed video. According to some examples, a computer-implemented method includes receiving a video at a content delivery service; performing an encode on a frame of the video by the content delivery service that converts the frame from a pixel domain to a transform domain and back to the pixel domain to generate first pixel values and a first residual for a block of the frame at a first resolution; generating a first set of features, by a machine learning model of the content delivery service, for an input, at a first resolution, of the first pixel values and the first residual of the block; generating a second set of features, by the machine learning model of the content delivery service, for an input, at a second lower resolution, of second pixel values and a second residual of the block; upsampling the second set of features to the first resolution to generate an upsampled second set of features; generating a modified version of the frame based on the first set of features and the upsampled second set of features; and transmitting the modified version of the frame to a frame buffer or from the content delivery service to a viewer device.
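The multi-scale data flow in the abstract (features at full resolution, features at a lower resolution, upsampling, then fusion) can be sketched in a few lines. This is only an illustration of the data flow under stated assumptions. The "model" here is a trivial placeholder that adds pixel and residual values, not a trained network. The lower-resolution inputs are assumed to come from 2x2 averaging, the upsampling is assumed to be nearest-neighbour, and the fusion is assumed to be a plain average. None of these choices, nor the function names, come from the patent.

```python
from typing import List

Grid = List[List[float]]


def downsample_2x(block: Grid) -> Grid:
    """Average each 2x2 neighbourhood (assumes even height and width)."""
    return [
        [
            (block[y][x] + block[y][x + 1] + block[y + 1][x] + block[y + 1][x + 1]) / 4.0
            for x in range(0, len(block[0]), 2)
        ]
        for y in range(0, len(block), 2)
    ]


def upsample_2x(features: Grid) -> Grid:
    """Nearest-neighbour upsampling back to the higher resolution."""
    out: Grid = []
    for row in features:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def toy_features(pixels: Grid, residual: Grid) -> Grid:
    """Placeholder for the learned feature extractor (pixel + residual)."""
    return [
        [p + r for p, r in zip(pixel_row, residual_row)]
        for pixel_row, residual_row in zip(pixels, residual)
    ]


def enhance_block(pixels: Grid, residual: Grid) -> Grid:
    """Fuse full-resolution features with upsampled low-resolution features."""
    first = toy_features(pixels, residual)
    second = toy_features(downsample_2x(pixels), downsample_2x(residual))
    second_up = upsample_2x(second)
    # Assumed fusion rule: element-wise average of the two feature sets.
    return [
        [(a + b) / 2.0 for a, b in zip(first_row, second_row)]
        for first_row, second_row in zip(first, second_up)
    ]


if __name__ == "__main__":
    reconstructed = [[1.0, 3.0], [5.0, 7.0]]
    residual = [[0.0, 0.0], [0.0, 0.0]]
    print(enhance_block(reconstructed, residual))
```

In a real system, the placeholder would be replaced by learned layers, and the fused features would feed further layers that produce the modified frame.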
