Abstract:
Clustering algorithms such as the k-means algorithm are used in applications that process entities with spatial and/or temporal characteristics, for example, media objects representing audio, video, or graphical data. Feature vectors representing characteristics of the entities are partitioned using clustering methods whose results are sensitive to the initial set of cluster seeds. The set of initial cluster seeds is generated using principal component analysis of either the complete feature vector set or a subset thereof. The feature vector set is divided into a desired number of initial clusters, and a seed is determined from each initial cluster.
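One possible reading of the seeding step described above can be sketched as follows. The abstract does not say how the principal component analysis is used to divide the feature vectors, so this sketch assumes one simple scheme: project every vector onto the first principal component, sort by projection, cut the sorted list into k equally sized initial clusters, and take the centroid of each as its seed. The function names (`first_principal_component`, `pca_seeds`) and the use of power iteration are illustrative assumptions, not details from the abstract.

```python
import math
import random


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def first_principal_component(vectors, iterations=200, seed=0):
    """Approximate the leading principal direction by power iteration."""
    mu = mean_vector(vectors)
    centered = [[x - m for x, m in zip(v, mu)] for v in vectors]
    dim = len(mu)
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # Apply X^T X to w without forming the covariance matrix explicitly.
        proj = [sum(c[d] * w[d] for d in range(dim)) for c in centered]
        w_new = [sum(p * c[d] for p, c in zip(proj, centered)) for d in range(dim)]
        norm = math.sqrt(sum(x * x for x in w_new))
        if norm == 0.0:
            break  # degenerate data (all vectors identical); keep current w
        w = [x / norm for x in w_new]
    return mu, w


def pca_seeds(vectors, k):
    """Assumed scheme: split along the first principal component into k
    equal-size initial clusters and return the centroid of each as a seed.
    Requires 1 <= k <= len(vectors)."""
    mu, w = first_principal_component(vectors)
    ordered = sorted(
        vectors,
        key=lambda v: sum((x - m) * c for x, m, c in zip(v, mu, w)),
    )
    n = len(ordered)
    return [mean_vector(ordered[i * n // k:(i + 1) * n // k]) for i in range(k)]
```

The returned seeds would then be handed to an ordinary k-means loop in place of randomly chosen starting centers. Because the sign of a principal direction is arbitrary, the order of the seeds may be reversed between runs with different random starts.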
Abstract:
A camera captures an image of a user wearing a head mounted device (HMD) that occludes a portion of the user's face. A three-dimensional (3-D) pose that indicates an orientation and a location of the user's face in a camera coordinate system is determined. A representation of the occluded portion of the user's face is determined based on a 3-D model of the user's face. The representation replaces a portion of the HMD in the image based on the 3-D pose of the user's face in the camera coordinate system. In some cases, the 3-D model of the user's face is selected from 3-D models of the user's face stored in a database that is indexed by eye gaze direction. Mixed reality images can be generated by combining virtual reality images, unoccluded portions of the user's face, and representations of an occluded portion of the user's face.
Abstract:
A highlight learning technique is provided to detect and identify highlights in sports videos. A set of event models is calculated from low-level frame information of the sports videos to identify recurring events within the videos. The event models are used to characterize videos by detecting events within the videos and using the detected events to generate an event vector. The event vector is used to train a classifier to identify the videos as highlight or non-highlight.
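The last two steps above (event vector, then classifier) can be illustrated with a small sketch. It assumes event detection has already produced a sequence of event labels per video, represents the event vector as a normalized histogram over a fixed event vocabulary, and uses a nearest-centroid classifier as a stand-in. The abstract names neither the vector layout nor the classifier type, so both are assumptions, as are the event names in the usage example.

```python
from collections import Counter


def event_vector(detected_events, vocabulary):
    """Normalized histogram of detected event labels (assumed representation)."""
    counts = Counter(detected_events)
    total = max(len(detected_events), 1)  # avoid division by zero for empty videos
    return [counts[event] / total for event in vocabulary]


def train_centroids(vectors, labels):
    """Nearest-centroid training: average the event vectors of each class."""
    sums, counts = {}, Counter()
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for d, x in enumerate(vec):
            acc[d] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(vector, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(vector, centroids[label]))


# Hypothetical vocabulary and training data, for illustration only.
vocabulary = ["crowd_cheer", "replay", "wide_shot"]
training = [
    (["replay", "crowd_cheer", "replay"], "highlight"),
    (["crowd_cheer", "replay", "wide_shot"], "highlight"),
    (["wide_shot", "wide_shot", "wide_shot"], "non-highlight"),
    (["wide_shot", "wide_shot", "crowd_cheer"], "non-highlight"),
]
centroids = train_centroids(
    [event_vector(events, vocabulary) for events, _ in training],
    [label for _, label in training],
)
```

A new video would be classified by building its event vector with the same vocabulary and passing it to `classify`.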
Abstract:
A method, computer program product, and computer system are described for identifying a first portion of a facial image in a first image, wherein the first portion includes noise. A corresponding portion of the facial image is identified in a second image, wherein the corresponding portion includes less noise than the first portion. One or more filter parameters of the first portion are determined based, at least in part, upon the first portion and the corresponding portion. At least a portion of the noise in the first portion is smoothed based, at least in part, upon the one or more filter parameters. At least a portion of the face-specific details from the corresponding portion is added to the first portion.
Abstract:
Implementations generally relate to generating compositional media content. In some implementations, a method includes receiving a plurality of photos from a user, and determining one or more composition types from the photos. The method also includes generating compositions from the selected photos based on the one or more determined composition types. The method also includes providing the one or more generated compositions to the user.
Abstract:
An easy-to-use online video stabilization system and methods for its use are described. Videos are stabilized after capture, and therefore the stabilization works on all forms of video footage including both legacy video and freshly captured video. In one implementation, the video stabilization system is fully automatic, requiring no input or parameter settings by the user other than the video itself. The video stabilization system uses a cascaded motion model to choose the correction that is applied to different frames of a video. In various implementations, the video stabilization system is capable of detecting and correcting high frequency jitter artifacts, low frequency shake artifacts, rolling shutter artifacts, significant foreground motion, poor lighting, scene cuts, and both long and short videos.
Abstract:
In some instances, an image may have dimensions that do not correspond to a slot to display the image. For example, an image content item may have dimensions that do not correspond to a content item slot. The image may be resized using seam carving to add or remove pixels of the image. A saliency map having a saliency score for each pixel of the image may be used. Evaluation metrics may be used before, during, and after seam carving to determine whether salient content is affected by the seam carving. In some instances, a seam cost threshold value may be used for adaptive step size during the seam carving. The resized image may then be outputted, such as for an image content item to be served with a resource.
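As a rough illustration of saliency-guided seam carving, the sketch below removes vertical seams whose summed saliency is lowest, found with the standard dynamic-programming formulation. Images and saliency maps are plain lists of rows. The `max_seam_cost` parameter is a simplification: it stops carving once the cheapest seam becomes too costly, which is not the adaptive step size mentioned above. All function names are hypothetical.

```python
def min_cost_vertical_seam(saliency):
    """Return (cost, seam), where seam[r] is the column to remove in row r."""
    rows, cols = len(saliency), len(saliency[0])
    cost = [list(saliency[0])]  # cost[r][c]: cheapest seam ending at (r, c)
    back = []                   # back[r - 1][c]: predecessor column in row r - 1
    for r in range(1, rows):
        row_cost, row_back = [], []
        for c in range(cols):
            # A seam may continue from the up-left, up, or up-right neighbour.
            candidates = [cc for cc in (c - 1, c, c + 1) if 0 <= cc < cols]
            best = min(candidates, key=lambda cc: cost[r - 1][cc])
            row_cost.append(saliency[r][c] + cost[r - 1][best])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)
    end = min(range(cols), key=lambda c: cost[-1][c])
    seam = [end]
    for row_back in reversed(back):
        seam.append(row_back[seam[-1]])
    seam.reverse()
    return cost[-1][end], seam


def remove_seam(grid, seam):
    """Drop one element per row at the column given by the seam."""
    return [row[:c] + row[c + 1:] for row, c in zip(grid, seam)]


def shrink_width(image, saliency, target_width, max_seam_cost):
    """Remove low-saliency seams until the target width is reached or the
    cheapest remaining seam exceeds max_seam_cost (assumed stopping rule)."""
    while len(image[0]) > target_width:
        cost, seam = min_cost_vertical_seam(saliency)
        if cost > max_seam_cost:
            break  # the cheapest seam would cut through salient content
        image = remove_seam(image, seam)
        saliency = remove_seam(saliency, seam)
    return image, saliency
```

Returning the seam cost alongside the seam lets a caller check, before and during carving, whether salient pixels are about to be removed. Horizontal seams can be handled by transposing the image and saliency map first.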
Abstract:
Methods and systems for video retargeting and view selection using motion saliency are described. Salient features in multiple videos may be extracted. Each video may be retargeted by estimating a crop path and applying it to the video, generating a modified video that preserves the salient features. An action score may be assigned to portions or frames of each modified video to represent motion content in the modified video. Selecting a view from one of the given modified videos may be formulated as an optimization subject to constraints. An objective function for the optimization may include maximizing the action score. This optimization may also be subject to constraints that take into consideration, for example, optimal transitioning from a view from a given video to another view from another given video.
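The view-selection step can be sketched as a small dynamic program. The abstract does not give the exact objective or constraints, so this sketch assumes one action score per view per time step and replaces the transition constraints with a fixed penalty charged each time the selected view changes. The names `select_views` and `switch_penalty` are illustrative.

```python
def select_views(action_scores, switch_penalty):
    """Choose one view per time step.

    action_scores[v][t] is the action score of view v at time step t.
    Returns (total, path), where path[t] is the chosen view index and total
    is the summed action score minus switch_penalty per change of view.
    """
    num_views, num_steps = len(action_scores), len(action_scores[0])
    best = [action_scores[v][0] for v in range(num_views)]
    back = []  # back[t - 1][v]: best predecessor view when view v is used at t
    for t in range(1, num_steps):
        new_best, pointers = [], []
        for v in range(num_views):
            def value(u):
                # Score of arriving at view v from view u at the previous step.
                return best[u] - (switch_penalty if u != v else 0.0)
            prev = max(range(num_views), key=value)
            new_best.append(value(prev) + action_scores[v][t])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    last = max(range(num_views), key=lambda v: best[v])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path
```

A larger `switch_penalty` yields fewer cuts between views, while a penalty of zero simply picks the highest-scoring view at every step. Hard constraints, such as a minimum shot length, would need a richer state than this sketch keeps.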