Abstract:
Described herein are techniques related to noise reduction for image sequences or videos. This Abstract is submitted with the understanding that it will not be used to interpret or limit the scope and meaning of the claims. A noise reduction tool includes a motion estimator configured to estimate motion in the video, a noise spectrum estimator configured to estimate noise in the video, a shot detector configured to trigger the noise estimation process, a noise spectrum validator configured to validate the estimated noise spectrum, and a noise reducer configured to reduce noise in the video using the estimated noise spectrum.
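As a concrete illustration, the sketch below wires these components into a simple per-frame loop. Every class and method name is hypothetical, the motion estimator is omitted for brevity, and the shot detector, spectrum estimator, validator, and Wiener-style reducer are stand-in heuristics rather than the patented techniques.

```python
# Minimal sketch of the described pipeline (all heuristics are assumptions).
import numpy as np
from scipy.ndimage import uniform_filter

class NoiseReducer:
    def __init__(self, shot_threshold=30.0, energy_floor=1e-3):
        self.shot_threshold = shot_threshold  # hypothetical trigger level
        self.energy_floor = energy_floor      # hypothetical validation bound
        self.noise_spectrum = None

    def shot_change(self, prev, cur):
        # Shot detector: a large mean absolute difference implies a new shot.
        return prev is None or np.mean(np.abs(cur - prev)) > self.shot_threshold

    def estimate_spectrum(self, frame):
        # Noise spectrum estimator: spectrum of the high-pass residual,
        # assuming noise dominates the fine detail in this frame.
        residual = frame - uniform_filter(frame, size=3)
        return np.abs(np.fft.fft2(residual))

    def valid(self, spectrum):
        # Noise spectrum validator: reject near-empty estimates.
        return spectrum.mean() > self.energy_floor

    def denoise(self, frame):
        # Noise reducer: Wiener-style gain from the estimated noise spectrum.
        F = np.fft.fft2(frame)
        power = np.abs(F) ** 2
        gain = power / (power + self.noise_spectrum ** 2 + 1e-12)
        return np.real(np.fft.ifft2(gain * F))

    def process(self, frames):
        out, prev = [], None
        for frame in frames:
            frame = frame.astype(np.float64)
            # Re-estimate the noise spectrum only when a shot change triggers it.
            if self.shot_change(prev, frame):
                spectrum = self.estimate_spectrum(frame)
                if self.valid(spectrum):
                    self.noise_spectrum = spectrum
            out.append(self.denoise(frame) if self.noise_spectrum is not None else frame)
            prev = frame
        return out
```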
Abstract:
Methods and systems are provided for using a model of human speech quality perception to provide an objective measure for predicting subjective quality assessments. A Virtual Speech Quality Objective Listener (ViSQOL) model is a signal-based full-reference metric that uses a spectro-temporal measure of similarity between a reference signal and a test speech signal. Specifically, the model provides the ability to detect and predict the level of clock drift and to determine whether such clock drift will impact a listener's quality of experience.
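The drift-detection idea can be sketched as follows: align short spectrogram patches of the reference against the test signal and look for a trend in the best-match offsets. The patch size, similarity score, and linear fit below are illustrative assumptions, not the published ViSQOL internals.

```python
# Sketch: clock drift shows up as a systematic trend in patch alignment offsets.
import numpy as np
from scipy.signal import spectrogram

def drift_estimate(ref, test, fs, patch_frames=30):
    _, _, S_ref = spectrogram(ref, fs=fs, nperseg=256, noverlap=128)
    _, _, S_test = spectrogram(test, fs=fs, nperseg=256, noverlap=128)
    S_ref, S_test = np.log1p(S_ref), np.log1p(S_test)

    centers, offsets = [], []
    for start in range(0, S_ref.shape[1] - patch_frames, patch_frames):
        patch = S_ref[:, start:start + patch_frames]
        # Slide the reference patch over the test spectrogram and keep the
        # position with the highest correlation (the best-matching offset).
        best, best_off = -np.inf, 0
        for off in range(S_test.shape[1] - patch_frames):
            cand = S_test[:, off:off + patch_frames]
            score = np.corrcoef(patch.ravel(), cand.ravel())[0, 1]
            if score > best:
                best, best_off = score, off
        centers.append(start)
        offsets.append(best_off - start)

    # A slope near zero means the alignment error stays constant (no drift);
    # a nonzero slope indicates the test clock running fast or slow.
    slope = np.polyfit(centers, offsets, 1)[0]
    return slope  # offset frames per frame of reference time
```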
Abstract:
Methods and systems are provided for detecting chop in an audio signal. A time-frequency representation, such as a spectrogram, is created for an audio signal and used to calculate a gradient of mean power per frame of the audio signal. Positive and negative gradients are defined for the signal based on the gradient of mean power, and a maximum overlap offset between the positive and negative gradients is determined by calculating a value that maximizes the cross-correlation of the positive and negative gradients. The negative gradient values may be combined (e.g., summed) with the overlap offset, and the combined values then compared with a threshold to estimate the amount of chop present in the audio signal. The chop detection model provided is low-complexity and is applicable to narrowband, wideband, and superwideband speech.
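Because the abstract enumerates the steps, a near-direct sketch is possible. The window sizes, the alignment convention, and the decision threshold below are illustrative assumptions.

```python
# Sketch of the chop-detection steps: mean-power gradient, positive/negative
# split, max-cross-correlation offset, combined score versus a threshold.
import numpy as np
from scipy.signal import spectrogram

def chop_score(signal, fs, threshold=-1.0):
    # Time-frequency representation and log mean power per frame.
    _, _, S = spectrogram(signal, fs=fs, nperseg=256, noverlap=128)
    mean_power = np.log10(S.mean(axis=0) + 1e-12)

    # Gradient of mean power, split into positive and negative parts.
    grad = np.diff(mean_power)
    pos = np.where(grad > 0, grad, 0.0)
    neg = np.where(grad < 0, grad, 0.0)

    # Offset maximizing the cross-correlation of the two gradients: chop is a
    # sharp power drop followed by a sharp rise a few frames later.
    xcorr = np.correlate(pos, -neg, mode="full")
    offset = int(np.argmax(xcorr)) - (len(neg) - 1)

    # Combine (sum) the negative gradients, shifted by the overlap offset so
    # drops line up with their matching rises, then compare to a threshold.
    combined = np.sum(np.roll(neg, offset) * (pos > 0))
    return combined, combined < threshold  # more negative => more chop
```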
Abstract:
Implementations disclose mutual noise estimation for videos. A method includes determining an optimal frame noise variance for intensity values of each frame of frames of a video, the optimal frame noise variance based on a determined relationship between spatial variance and temporal variance of the intensity values of homogeneous blocks in the frame, identifying an optimal video noise variance for the video based on optimal frame noise variances of the frames of the video, selecting, for each frame of the video, one or more of the blocks having a spatial variance that is less than the optimal video noise variance, the selected one or more blocks serving as the homogeneous blocks, and utilizing the selected homogeneous blocks to estimate a noise signal of the video.
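A rough sketch of the block-selection logic follows, simplified to a single video-level pass over grayscale frames with 8x8 blocks; the per-frame iteration of the actual method is collapsed into one aggregate here, and the median aggregation is an assumption.

```python
# Sketch: homogeneous blocks are where spatial and temporal variance agree,
# and both approach the noise variance.
import numpy as np

def block_variances(frames, bs=8):
    """Spatial and temporal intensity variances of co-located bs x bs blocks."""
    T, H, W = frames.shape
    spatial, temporal = [], []
    for y in range(0, H - bs + 1, bs):
        for x in range(0, W - bs + 1, bs):
            block = frames[:, y:y + bs, x:x + bs]
            spatial.append(block.reshape(T, -1).var(axis=1).mean())  # within frames
            temporal.append(block.var(axis=0).mean())                # across frames
    return np.array(spatial), np.array(temporal)

def estimate_noise(frames, bs=8):
    spatial, temporal = block_variances(frames.astype(np.float64), bs)
    # In homogeneous regions both variances approach the noise variance, so a
    # conservative per-block noise proxy is where the two measures agree.
    noise_proxy = np.minimum(spatial, temporal)
    video_noise = np.median(noise_proxy)  # simplified video-level aggregate
    # Blocks whose spatial variance falls below the video-level estimate are
    # treated as homogeneous and drive the final noise estimate.
    homogeneous = spatial < video_noise
    return temporal[homogeneous].mean() if homogeneous.any() else video_noise
```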
Abstract:
Implementations generally relate to enhancing content appearance. In some implementations, a method includes receiving an image and selecting a reference object in the image. The method also includes determining one or more image parameter adjustments based on the selected reference object, and applying the one or more image parameter adjustments to the entire image.
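One plausible reading, sketched below, derives an exposure gain and a white-balance correction from the reference region and applies both globally. The gray-world style adjustments are assumptions for illustration; the abstract does not specify which image parameters are adjusted.

```python
# Sketch: compute adjustments from a reference region, apply to the whole image.
import numpy as np

def enhance_with_reference(image, ref_box, target_luma=0.5):
    """image: float RGB in [0, 1]; ref_box: (y0, y1, x0, x1) reference region."""
    y0, y1, x0, x1 = ref_box
    ref = image[y0:y1, x0:x1]

    # Exposure adjustment: scale so the reference region hits a target luma.
    gain = target_luma / max(ref.mean(), 1e-6)

    # White-balance adjustment: equalize the reference region's channel means.
    channel_means = ref.reshape(-1, 3).mean(axis=0)
    wb = channel_means.mean() / np.maximum(channel_means, 1e-6)

    # Apply the reference-derived adjustments to the entire image.
    return np.clip(image * wb * gain, 0.0, 1.0)
```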
Abstract:
A method for determining the position of multiple cameras relative to each other includes at a processor, receiving video data from at least one video recording taken by each camera; selecting a subset of frames of each video recording, including determining relative blurriness of each frame of each video recording, selecting frames having a lowest relative blurriness, counting feature points in each of the lowest relative blurriness frames, and selecting for further analysis, lowest relative blurriness frames having a highest count of feature points; and processing each selected subset of frames from each video recording to estimate the location and orientation of each camera.
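The frame-selection stage maps naturally onto standard tools, as in the sketch below, which assumes OpenCV: blurriness is scored by the variance of the Laplacian, and feature points are counted with ORB. The frame counts kept at each stage are illustrative.

```python
# Sketch of the frame-selection stage: sharpest frames first, then the
# sharp frames with the most detected feature points.
import cv2

def select_frames(video_path, n_sharp=30, n_keep=10):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()

    # Lower Laplacian variance means blurrier; keep the n_sharp sharpest.
    sharpness = [cv2.Laplacian(f, cv2.CV_64F).var() for f in frames]
    sharpest = sorted(range(len(frames)), key=lambda i: -sharpness[i])[:n_sharp]

    # Among the sharp frames, keep those with the most feature points.
    orb = cv2.ORB_create()
    counts = {i: len(orb.detect(frames[i], None)) for i in sharpest}
    return sorted(counts, key=counts.get, reverse=True)[:n_keep]
```

The retained frame indices from each recording would then feed a structure-from-motion style solver to recover each camera's location and orientation.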
Abstract:
A system for video stabilization is provided. The system includes a media component, a transformation component, an offset component and a zoom component. The media component receives a video sequence including at least a first video frame and a second video frame. The transformation component calculates at least a first motion parameter associated with translational motion for the first video frame and at least a second motion parameter associated with the translational motion for the second video frame. The offset component subtracts an offset value generated as a function of a maximum motion parameter and a minimum motion parameter from the first motion parameter and the second motion parameter to generate a set of modified motion parameters. The zoom component determines a zoom value for the video sequence based at least in part on the set of modified motion parameters.
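The abstract only constrains the offset to be a function of the maximum and minimum motion parameters; the sketch below assumes the midpoint of the motion range, and then picks the smallest zoom that keeps the largest residual shift inside the cropped frame.

```python
# Sketch of the offset and zoom steps for one translational axis.
import numpy as np

def stabilize_zoom(translations, frame_dim):
    """translations: per-frame translational motion (pixels) along one axis."""
    t = np.asarray(translations, dtype=np.float64)

    # Offset generated as a function of the max and min motion parameters
    # (midpoint assumed here), subtracted to center the residual motion.
    offset = (t.max() + t.min()) / 2.0
    modified = t - offset

    # Zoom just enough that the largest residual shift stays inside the
    # frame after cropping: scale = dim / (dim - 2 * max residual).
    margin = np.abs(modified).max()
    zoom = frame_dim / max(frame_dim - 2.0 * margin, 1.0)
    return modified, zoom
```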
Abstract:
Implementations disclose bitrate optimization for multi-representation encoding using playback statistics. A method includes generating multiple versions of a segment of a source video, the versions comprising encodings of the segment at different encoding bitrates for each resolution of the segment, measuring a quality metric for each version of the segment, generating rate-quality models for each resolution of the segment based on the measured quality metrics corresponding to the resolutions, generating a probability model to predict requesting probabilities that representations of the segment are requested, the probability model based on a joint probability distribution of network speed and viewport size that is generated from client-side feedback statistics associated with prior playbacks of other videos, determining an encoding bitrate for each of the representations of the segment based on the rate-quality models and the probability model, and assigning determined encoding bitrates to corresponding representations of the segment.
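A simplified sketch of the assignment step: fit a logarithmic rate-quality model per resolution from the measured points, weight each representation by its predicted request probability, and choose the candidate bitrate with the best penalized expected quality. The objective, the cost penalty lam, and the candidate grid are assumptions.

```python
# Sketch: per-resolution rate-quality models plus request probabilities
# drive the bitrate chosen for each representation.
import numpy as np

def fit_rate_quality(bitrates, qualities):
    """Fit quality ~ a * log(bitrate) + b from measured encodings."""
    a, b = np.polyfit(np.log(bitrates), qualities, 1)
    return lambda r: a * np.log(r) + b

def assign_bitrates(measured, request_prob, candidates, lam=1e-6):
    """measured: {resolution: (bitrates, qualities)};
    request_prob: {resolution: probability from the network/viewport model}."""
    assignment = {}
    for res, (rates, quals) in measured.items():
        q = fit_rate_quality(np.array(rates), np.array(quals))
        p = request_prob[res]
        # Expected quality gain, penalized by a per-bit storage/egress cost.
        scores = [p * q(r) - lam * r for r in candidates]
        assignment[res] = candidates[int(np.argmax(scores))]
    return assignment

# Example: measured (bitrate, quality) points per resolution, with request
# probabilities derived from prior playback statistics.
measured = {"1080p": ([2e6, 4e6, 8e6], [80, 88, 93]),
            "480p": ([5e5, 1e6, 2e6], [60, 70, 78])}
probs = {"1080p": 0.7, "480p": 0.3}
print(assign_bitrates(measured, probs, candidates=[5e5, 1e6, 2e6, 4e6, 8e6]))
```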
Abstract:
A method for pull frame interpolation includes receiving an encoded bitstream including information representing a plurality of video frames, decoding the plurality of video frames, including identifying a plurality of motion vectors indicating motion from a first frame of the plurality of video frames to a second frame of the plurality of video frames, identifying an interpolation point between the first frame and the second frame, identifying a plurality of candidate interpolation motion vectors indicating motion from the first frame to the interpolation point and from the second frame to the interpolation point based on the plurality of motion vectors, selecting an interpolation motion vector from the plurality of candidate interpolation motion vectors based on a metric, and generating an interpolated frame at the interpolation point based on the selected interpolation motion vector, which may include correcting an artifact in the interpolated frame by blending the interpolated frame.
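The selection-and-blend step can be sketched for a single global motion vector, as below; the actual method operates on per-block candidate vectors. Choosing the candidate by minimum warp error and blending the two warped frames are illustrative stand-ins for the metric and artifact correction the abstract mentions.

```python
# Sketch: split each candidate MV at the midpoint, warp both frames toward
# it, select the candidate with the smallest disagreement, then blend.
import numpy as np

def shift(frame, dy, dx):
    # np.roll wraps at the borders; a simplification acceptable for a sketch.
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def interpolate_midpoint(f1, f2, candidate_mvs):
    best_err, best = np.inf, None
    for dy, dx in candidate_mvs:
        # Half the vector applied forward from f1, half backward from f2.
        fwd = shift(f1, dy // 2, dx // 2)
        bwd = shift(f2, -(dy - dy // 2), -(dx - dx // 2))
        err = np.mean((fwd - bwd) ** 2)  # metric for selecting the MV
        if err < best_err:
            best_err, best = err, (fwd, bwd)
    fwd, bwd = best
    # Blending the two warped frames also suppresses interpolation artifacts.
    return 0.5 * (fwd + bwd)
```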