-
Publication No.: US20210241426A1
Publication date: 2021-08-05
Application No.: US16613932
Filing date: 2017-12-04
Applicant: Google LLC
Inventor: Damien Kelly, Neil Birkbeck, Balineedu Adsumilli, Mohammad Izadi
IPC: G06T5/00
Abstract: A method for denoising video content includes identifying a first frame block associated with a first frame of the video content. The method also includes estimating a first noise model that represents characteristics of the first frame block. The method also includes identifying at least one frame block adjacent to the first frame block. The method also includes generating a second noise model that represents characteristics of the at least one frame block adjacent to the first frame block by adjusting the first noise model based on at least one characteristic of the at least one frame block adjacent to the first frame block. The method also includes denoising the at least one frame block adjacent to the first frame block using the second noise model.
-
Publication No.: US12206914B2
Publication date: 2025-01-21
Application No.: US18021636
Filing date: 2022-06-08
Applicant: Google LLC
Inventor: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang
IPC: H04N21/266, G06N3/045, H04N17/02, H04N19/154, H04N21/234, H04N21/434, H04N21/44, H04N21/466
Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided. In some embodiments, the method comprises: receiving a video content item; extracting a plurality of frames from the video content item; determining, using a first subnetwork of a deep neural network, a content quality indicator for each frame of the plurality of frames of the video content item; determining, using a second subnetwork of the deep neural network, a video distortion indicator for each frame of the plurality of frames of the video content item; determining, using a third subnetwork of the deep neural network, a compression sensitivity indicator for each frame of the plurality of frames of the video content item; generating a quality level for each frame of the plurality of frames of the video content item that concatenates the content quality indicator, the video distortion indicator, and the compression sensitivity indicator for that frame of the video content item; generating an overall quality level for the video content item by aggregating the quality level of each frame of the plurality of frames; and causing a video recommendation to be presented based on the overall quality level of the video content item.
-
Publication No.: US20240187618A1
Publication date: 2024-06-06
Application No.: US18440013
Filing date: 2024-02-13
Applicant: GOOGLE LLC
Inventor: Sam John, Balineedu Adsumilli, Akshay Gadde
IPC: H04N19/40, H04N19/119, H04N19/147, H04N19/184, H04N19/192
CPC classification number: H04N19/40, H04N19/119, H04N19/147, H04N19/184, H04N19/192
Abstract: A learning model is trained for rate-distortion behavior prediction against a corpus of a video hosting platform and used to determine optimal bitrate allocations for video data given video content complexity across the corpus of the video hosting platform. Complexity features of the video data are processed using the learning model to determine a rate-distortion cluster prediction for the video data, and transcoding parameters for transcoding the video data are selected based on that prediction. The rate-distortion clusters are modeled during the training of the learning model, such as based on rate-distortion curves of video data of the corpus of the video hosting platform and based on classifications of such video data. This approach minimizes total corpus egress and/or storage while further maintaining uniformity in the delivered quality of videos by the video hosting platform.
-
Publication No.: US11924449B2
Publication date: 2024-03-05
Application No.: US17908352
Filing date: 2020-05-19
Applicant: Google LLC
Inventor: Sam John, Balineedu Adsumilli, Akshay Gadde
IPC: H04N19/40, H04N19/119, H04N19/147, H04N19/184, H04N19/192
CPC classification number: H04N19/40, H04N19/119, H04N19/147, H04N19/184, H04N19/192
Abstract: A learning model is trained for rate-distortion behavior prediction against a corpus of a video hosting platform and used to determine optimal bitrate allocations for video data given video content complexity across the corpus of the video hosting platform. Complexity features of the video data are processed using the learning model to determine a rate-distortion cluster prediction for the video data, and transcoding parameters for transcoding the video data are selected based on that prediction. The rate-distortion clusters are modeled during the training of the learning model, such as based on rate-distortion curves of video data of the corpus of the video hosting platform and based on classifications of such video data. This approach minimizes total corpus egress and/or storage while further maintaining uniformity in the delivered quality of videos by the video hosting platform.
-
Publication No.: US20240022726A1
Publication date: 2024-01-18
Application No.: US17862571
Filing date: 2022-07-12
Applicant: GOOGLE LLC
Inventor: Yilin Wang, Balineedu Adsumilli
CPC classification number: H04N19/13, G06N20/00, G06K9/6256
Abstract: A training dataset that includes a first dataset and a second dataset is received. The first dataset includes a first subset of first videos corresponding to a first context and respective first ground truth quality scores of the first videos, and the second dataset includes a second subset of second videos corresponding to a second context and respective second ground truth quality scores of the second videos. A machine learning model is trained to predict the respective first ground truth quality scores and the respective second ground truth quality scores. Training the model includes training it to obtain a global quality score for one of the videos; and training it to map the global quality score to context-dependent predicted quality scores. The context-dependent predicted quality scores include a first context-dependent predicted quality score corresponding to the first context and a second context-dependent predicted quality score corresponding to the second context.
-
Publication No.: US11854164B2
Publication date: 2023-12-26
Application No.: US17708983
Filing date: 2022-03-30
Applicant: Google LLC
Inventor: Damien Kelly, Neil Birkbeck, Balineedu Adsumilli, Mohammad Izadi
IPC: G06T5/00
CPC classification number: G06T5/002, G06T2207/10016, G06T2207/20021
Abstract: Processing a spherical video using denoising is described. Video content comprising the spherical video is received. Whether a camera geometry or a map projection, or both, used to generate the spherical video is available is then determined. The spherical video is denoised using a first technique responsive to a determination that the camera geometry, the map projection, or both is available. Otherwise, the spherical video is denoised using a second technique. At least some steps of the second technique can be different from steps of the first technique. The denoised spherical video can be encoded for transmission or storage using less data than encoding the spherical video without denoising.
-
Publication No.: US11843814B2
Publication date: 2023-12-12
Application No.: US17462286
Filing date: 2021-08-31
Applicant: Google LLC
Inventor: Neil Birkbeck, Balineedu Adsumilli, Damien Kelly
IPC: H04N21/2662, H04N21/233, H04N21/234, H04N21/4728, H04N21/81
CPC classification number: H04N21/2662, H04N21/233, H04N21/23418, H04N21/4728, H04N21/816
Abstract: Signals of an immersive multimedia item are jointly considered for optimizing the quality of experience for the immersive multimedia item. During encoding, portions of available bitrate are allocated to the signals (e.g., a video signal and an audio signal) according to the overall contribution of those signals to the immersive experience for the immersive multimedia item. For example, in the spatial dimension, multimedia signals are processed to determine spatial regions of the immersive multimedia item to render using greater bitrate allocations, such as based on locations of audio content of interest, video content of interest, or both. In another example, in the temporal dimension, multimedia signals are processed in time intervals to adjust allocations of bitrate between the signals based on the relative importance of such signals during those time intervals. Other techniques for bitrate optimizations for immersive multimedia streaming are also described herein.
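The allocation idea in this abstract can be illustrated with a minimal Python sketch. It assumes that each signal's contribution is already available as a numeric importance weight and that bitrate is split in simple proportion to those weights; the abstract does not state the actual allocation rule, and the function name `allocate_bitrate` and the weights are hypothetical.

```python
from typing import Dict


def allocate_bitrate(total_kbps: float, importance: Dict[str, float]) -> Dict[str, float]:
    """Split total_kbps across signals in proportion to their importance weights.

    Proportional splitting is an assumption made for illustration only.
    """
    total_weight = sum(importance.values())
    if total_weight <= 0:
        raise ValueError("at least one signal needs a positive importance weight")
    return {name: total_kbps * w / total_weight for name, w in importance.items()}


# Temporal dimension: repeat the split per time interval as importance shifts.
intervals = [
    {"video": 3.0, "audio": 1.0},  # hypothetical visually busy interval
    {"video": 1.0, "audio": 1.0},  # hypothetical dialogue-heavy interval
]
plan = [allocate_bitrate(1000.0, weights) for weights in intervals]
```

The same function could be applied to spatial regions instead of signals by keying the weights on region identifiers.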
-
Publication No.: US20230305800A1
Publication date: 2023-09-28
Application No.: US18327134
Filing date: 2023-06-01
Applicant: GOOGLE LLC
Inventor: Marcin Gorzel, Balineedu Adsumilli
Abstract: First video frames that include a visual object and a non-spatialized first audio segment that includes an auditory event are received. If second video frames do not include the visual object and a first time difference between the first video frames and the second video frames does not exceed a certain time, a motion vector of the visual object is used to assign a spatial location to the auditory event in at least one of the second video frames. A second audio segment that includes the auditory event and third video frames are received. If the third video frames do not include the visual object and a second time difference between the first video frames and the third video frames exceeds the certain time, the auditory event is assigned to a diffuse sound field. An audio output that conveys spatial locations of the visual object is output.
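The decision described in this abstract, for frames in which the visual object is no longer visible, can be sketched in Python as follows. The sketch assumes 2D positions and linear extrapolation along the motion vector; the abstract only says the motion vector "is used", so the extrapolation rule, the function name `locate_auditory_event`, and the use of `None` to represent the diffuse sound field are all hypothetical.

```python
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def locate_auditory_event(
    last_seen_position: Vec2,
    motion_per_second: Vec2,
    seconds_since_seen: float,
    max_gap_seconds: float,
) -> Optional[Vec2]:
    """Return a spatial location for the auditory event, or None for diffuse.

    Applies to frames where the visual object is absent. Within the allowed
    time gap, the last known position is extrapolated along the motion vector
    (an assumed rule); past the gap, the event goes to the diffuse sound field.
    """
    if seconds_since_seen <= max_gap_seconds:
        x, y = last_seen_position
        dx, dy = motion_per_second
        return (x + dx * seconds_since_seen, y + dy * seconds_since_seen)
    return None  # diffuse sound field: no single spatial location
```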
-
Publication No.: US11748854B2
Publication date: 2023-09-05
Application No.: US17722720
Filing date: 2022-04-18
Applicant: Google LLC
Inventor: Neil Birkbeck, Balineedu Adsumilli, Mohammad Izadi
IPC: G06T5/00
CPC classification number: G06T5/002, G06T2207/10016, G06T2207/10024, G06T2207/20081, G06T2207/20084
Abstract: Denoising video content includes identifying a three-dimensional flat frame block of multiple frames of the video content, wherein the three-dimensional flat frame block comprises flat frame blocks, each flat frame block is located within a respective frame of the multiple frames, and the flat frame blocks have a spatial and temporal intensity variance that is less than a threshold. Denoising video content also includes determining an average intensity value of the three-dimensional flat frame block, determining a noise model that represents noise characteristics of the three-dimensional flat frame block, generating a denoising function using the average intensity value and the noise model, and denoising the multiple frames using the denoising function.
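The first two steps in this abstract (finding three-dimensional flat frame blocks and computing their average intensity) can be sketched in Python. As a simplification, the sketch pools every sample of a co-located block across all frames and compares one joint variance against the threshold; the abstract does not say how spatial and temporal variance are combined. The names `block_values` and `find_flat_blocks` are hypothetical, and the noise model and denoising function are not covered.

```python
from statistics import fmean, pvariance
from typing import List, Tuple

Frame = List[List[float]]  # one grayscale frame as rows of intensities


def block_values(frames: List[Frame], top: int, left: int, size: int) -> List[float]:
    """Collect the samples of one size x size block across every frame."""
    return [
        frames[t][y][x]
        for t in range(len(frames))
        for y in range(top, top + size)
        for x in range(left, left + size)
    ]


def find_flat_blocks(
    frames: List[Frame], size: int, threshold: float
) -> List[Tuple[Tuple[int, int], float]]:
    """Return ((top, left), average intensity) for each flat 3D block.

    A block counts as flat when the variance of its pooled spatial and
    temporal samples is below the threshold (an assumed simplification).
    """
    height, width = len(frames[0]), len(frames[0][0])
    flat = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            values = block_values(frames, top, left, size)
            if pvariance(values) < threshold:
                flat.append(((top, left), fmean(values)))
    return flat
```

The average intensity returned for each flat block is the quantity the abstract pairs with the noise model when generating the denoising function.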
-
Publication No.: US20220078446A1
Publication date: 2022-03-10
Application No.: US17416235
Filing date: 2019-04-25
Applicant: Google LLC
Inventor: Mohammad Izadi, Balineedu Adsumilli
IPC: H04N19/147, H04N19/117, H04N19/82, H04N19/105, H04N19/176
Abstract: Adaptive filtering is applied to a video stream for bitrate reduction. A first copy of the input video stream is encoded to a reference bitstream. Each of a number of candidate filters is applied to each frame of a second copy of the input video stream to produce a filtered second copy of the input video stream. The filtered second copy is encoded to a candidate bitstream. A cost value for the candidate filter is determined based on distortion value and bitrate differences between the candidate bitstream and the reference bitstream. The candidate bitstream corresponding to the candidate filter with a lowest one of the cost values is selected as the output bitstream, which is then output or stored. Processing the input video stream using the adaptive filter and before the encoding may result in bitrate reduction, thereby improving compression, decompression, and other performance.
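The selection loop in this abstract can be sketched in Python under stated assumptions. The cost is taken to be the distortion difference plus a weight `lam` times the bitrate difference, which is one plausible reading of "based on distortion value and bitrate differences" and not the patent's formula. `toy_encode` is a hypothetical stand-in for a real encoder: it quantizes samples, counts value changes as "bits", and reports mean squared error against the unfiltered input.

```python
from typing import Callable, Dict, List, Tuple

Frame = List[int]
Video = List[Frame]


def toy_encode(frames: Video, original: Video) -> Tuple[int, float]:
    """Hypothetical encoder: returns (bits, distortion vs. the original)."""
    recon = [[(p // 8) * 8 for p in frame] for frame in frames]
    bits = sum(1 for frame in recon for a, b in zip(frame, frame[1:]) if a != b)
    errors = [(r - o) ** 2 for rf, of in zip(recon, original) for r, o in zip(rf, of)]
    return bits, sum(errors) / len(errors)


def box3(frame: Frame) -> Frame:
    """Three-tap box filter with edge replication, one example candidate."""
    padded = [frame[0]] + list(frame) + [frame[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) // 3 for i in range(len(frame))]


def select_filter(
    video: Video,
    filters: Dict[str, Callable[[Frame], Frame]],
    encode: Callable[[Video, Video], Tuple[int, float]],
    lam: float = 1.0,
) -> Tuple[str, float]:
    """Return (name, cost) of the candidate filter with the lowest cost."""
    ref_bits, ref_dist = encode(video, video)  # reference bitstream
    best_name, best_cost = "", float("inf")
    for name, apply_filter in filters.items():
        filtered = [apply_filter(frame) for frame in video]
        bits, dist = encode(filtered, video)  # candidate bitstream
        # Assumed cost: distortion increase plus weighted bitrate increase.
        cost = (dist - ref_dist) + lam * (bits - ref_bits)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost
```

A call such as `select_filter(video, {"identity": list, "box3": box3}, toy_encode)` returns the winning filter's name and cost; a larger `lam` favors bitrate savings over fidelity.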
-