Abstract:
Techniques are disclosed for adding interactive features to videos to enable users to create new media using a dynamic blend of motion and still imagery. The interactive techniques can include allowing a user to change the starting time of one or more subjects in a given video frame, or to animate/play only a portion of a given frame scene. The techniques may include segmenting each frame of a video to identify one or more subjects within each frame, selecting (or receiving selections of) one or more subjects within the given frame scene, tracking the selected subject(s) from frame to frame, and alpha-matting to play/animate only the selected subject(s). In some instances, segmentation, selection, and/or tracking may be improved and/or enhanced using pixel depth information (e.g., using a depth map).
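To make the selective-animation step concrete, the following is a minimal Python sketch of mask-based compositing, assuming per-frame subject mattes are already available from the segmentation and tracking stages described above; all names are illustrative and not taken from the disclosure:

```python
# A minimal sketch of mask-based selective animation ("cinemagraph"-style
# compositing). Assumes per-frame subject mattes already exist, e.g. from
# the segmentation/tracking stages; all names here are illustrative.
import cv2
import numpy as np

def composite_selected_subject(still_frame, frames, masks, feather=5):
    """Animate only the matted subject over a frozen background frame.

    still_frame: HxWx3 uint8 background (the paused frame).
    frames:      iterable of HxWx3 uint8 frames with the moving subject.
    masks:       iterable of HxW float32 mattes in [0, 1] for the subject.
    """
    out = []
    for frame, mask in zip(frames, masks):
        # Feather the matte edge so the subject blends into the still frame,
        # a cheap stand-in for a full alpha-matting step.
        alpha = cv2.GaussianBlur(mask, (2 * feather + 1, 2 * feather + 1), 0)
        alpha = alpha[..., None]  # HxWx1, broadcast over color channels
        blended = (alpha * frame.astype(np.float32)
                   + (1.0 - alpha) * still_frame.astype(np.float32))
        out.append(blended.astype(np.uint8))
    return out
```

Feathering the matte edge approximates the alpha-matting step, so the animated subject blends smoothly into the frozen background.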
Abstract:
A mechanism is described for facilitating cinematic space-time view synthesis in computing environments according to one embodiment. A method of embodiments, as described herein, includes capturing, by one or more cameras, multiple images at multiple positions or multiple points in time, where the multiple images represent multiple views of an object or a scene, and where the one or more cameras are coupled to one or more processors of a computing device. The method further includes synthesizing, by a neural network, the multiple images into a single image comprising a middle image of the multiple images and representing an intermediary view of the multiple views.
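As a rough illustration only, the synthesis stage could be sketched as a toy convolutional model in PyTorch; the abstract does not specify the network architecture, so the design, its width, and all names below are assumptions:

```python
# A toy convolutional model illustrating learned middle-view synthesis in
# PyTorch. The abstract does not specify the network architecture, so this
# design and all names are assumptions for illustration only.
import torch
import torch.nn as nn

class MiddleViewNet(nn.Module):
    def __init__(self, channels=3, width=32):
        super().__init__()
        # The two input views are stacked along the channel axis.
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, view_a, view_b):
        # Predict the intermediary (middle) view between the two inputs.
        return self.net(torch.cat([view_a, view_b], dim=1))

# Usage: synthesize a middle view from two neighboring captures.
model = MiddleViewNet()
a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
middle = model(a, b)  # 1x3x64x64 predicted intermediary view
```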
Abstract:
An example system for generating light field content is described herein. The system includes a receiver to receive a plurality of images and a calibrator to intrinsically calibrate a camera. The system also includes a corrector and a projector to undistort the images and project the undistorted images to generate undistorted rectilinear images. An extrinsic calibrator may rectify and align the undistorted rectilinear images. Finally, the system includes a view interpolator to perform intermediate view interpolation on the rectified and aligned undistorted rectilinear images.
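A hedged sketch of the undistort-and-project stages using OpenCV follows, assuming the intrinsics (K, dist) come from a prior calibration step such as cv2.calibrateCamera; the cross-fade is a deliberately naive stand-in for the view interpolator:

```python
# A sketch of the undistort-and-project stages with OpenCV, assuming the
# intrinsics (K, dist) come from a prior calibration step such as
# cv2.calibrateCamera. The cross-fade is a deliberately naive stand-in
# for the intermediate view interpolator.
import cv2

def undistort_to_rectilinear(image, K, dist):
    # Remove lens distortion and reproject to an ideal pinhole
    # (rectilinear) view of the same scene.
    h, w = image.shape[:2]
    new_K, _ = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), alpha=0)
    return cv2.undistort(image, K, dist, None, new_K)

def naive_view_interpolation(left, right, t=0.5):
    # Placeholder interpolator: cross-fade two rectified, aligned views.
    # A real interpolator would warp along pixel correspondences instead.
    return cv2.addWeighted(left, 1.0 - t, right, t, 0.0)
```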
Abstract:
Techniques related to feature detection and matching for fisheye images are discussed. Such techniques include determining a geometric constraint for the feature matching using match results from a first image-based feature matching, and generating resultant matches by applying the geometric constraint together with a second image-based feature matching that uses a looser matching requirement than the first.
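One plausible reading of this two-pass scheme, sketched below with ORB features and an epipolar-geometry constraint (which assumes the fisheye keypoints have first been undistorted to a rectilinear projection), is the following; the ratio thresholds and pixel tolerance are assumptions, not disclosed values:

```python
# A sketch of two-pass matching with a geometric constraint, using ORB
# features. Assumes keypoints come from undistorted (rectilinear-projected)
# fisheye images so that epipolar lines are straight; ratio thresholds and
# the 3-pixel tolerance are assumptions.
import cv2
import numpy as np

def two_pass_match(kp1, des1, kp2, des2):
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    # Pass 1: strict ratio test yields high-confidence seed matches.
    seeds = [m for m, n in bf.knnMatch(des1, des2, k=2)
             if m.distance < 0.6 * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in seeds])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in seeds])
    # Fit the geometric constraint (a fundamental matrix) to the seeds;
    # needs at least 8 seed matches for RANSAC to succeed.
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
    # Pass 2: looser ratio test, then keep only matches that land close
    # to their epipolar lines under the fitted constraint.
    kept = []
    for m, n in bf.knnMatch(des1, des2, k=2):
        if m.distance >= 0.8 * n.distance:
            continue
        p1 = np.array([*kp1[m.queryIdx].pt, 1.0])
        p2 = np.array([*kp2[m.trainIdx].pt, 1.0])
        a, b, c = F @ p1  # epipolar line of p1 in the second image
        if abs(p2 @ np.array([a, b, c])) / np.hypot(a, b) < 3.0:
            kept.append(m)
    return kept
```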
Abstract:
A plurality of frames of a video recorded by a video camera and depth maps of the plurality of frames are stored in a data storage. One or more target video camera positions are determined. Each frame of the plurality of frames is associated with one or more of the target video camera positions. For each frame, one or more synthesized frames from the viewpoint of the one or more target camera positions associated with that frame are generated by applying a view interpolation algorithm to that frame using the color pixels of that frame and the depth map of that frame. Users can provide input on the target camera positions and other camera parameters through multiple input modalities. The synthesized frames are concatenated to create a modified video. Other embodiments are also described and claimed.
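A minimal sketch of the per-frame synthesis step follows, assuming a pinhole intrinsic matrix K and a target pose (R, t) relative to the recording camera; this naive forward warp (point splatting) stands in for the view interpolation algorithm, which the abstract does not specify:

```python
# A minimal sketch of depth-based synthesis for a single frame, assuming
# a pinhole intrinsic matrix K and a target pose (R, t). Point splatting
# is a naive stand-in for the view interpolation algorithm.
import numpy as np

def synthesize_view(frame, depth, K, R, t):
    h, w = depth.shape
    # Back-project every pixel into 3D camera space using its depth value.
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    rays = np.linalg.inv(K) @ pix.astype(np.float64)
    pts = rays * depth.reshape(1, -1)            # 3xN points, camera frame
    # Move the points into the target camera and project back to pixels;
    # assumes all points end up in front of the target camera (z > 0).
    proj = K @ (R @ pts + t.reshape(3, 1))
    uv = np.round(proj[:2] / proj[2]).astype(int)
    out = np.zeros_like(frame)
    ok = (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h)
    out[uv[1, ok], uv[0, ok]] = frame.reshape(-1, frame.shape[-1])[ok]
    return out
```

Repeating this per frame for its associated target position(s) and concatenating the results yields the modified video.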
Abstract:
Disclosed herein are display stacks for virtual reality (VR) displays arranged to present an image where a focal point of the image is presented in higher resolution than a periphery around the focal point. The disclosure provides systems and methods for presenting an image comprising a high-resolution region that has a smaller field of view than the lower-resolution image presented around its periphery.
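The resolution arrangement can be illustrated with a simple software composite, assuming the periphery is approximated by downsampling and re-upsampling a full-resolution frame; an actual display stack would realize this optically rather than in software, and the names and scale factor below are illustrative:

```python
# A simple software illustration of the foveated resolution arrangement,
# assuming the periphery is approximated by downsampling and re-upsampling
# a frame; an actual display stack realizes this optically.
import cv2

def foveated_composite(full_res, focus_center, inset_size, periphery_scale=4):
    h, w = full_res.shape[:2]
    # Render the periphery cheaply: shrink the frame, then blow it back up.
    small = cv2.resize(full_res, (w // periphery_scale, h // periphery_scale))
    periphery = cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)
    # Paste the sharp full-resolution inset back over the focal point,
    # giving a small high-resolution field of view inside a blurry surround.
    cx, cy = focus_center
    half = inset_size // 2
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    periphery[y0:y1, x0:x1] = full_res[y0:y1, x0:x1]
    return periphery
```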
Abstract:
An example method is described herein. The method includes executing dense feature matching on an image pair that is downsampled to obtain a first set of feature correspondences for each pixel of the downsampled image pair. The method also includes calculating a neighborhood correspondence based on the first set of feature correspondences for each pixel in a first image of the image pair. Further, the method includes executing sparse feature matching on stereoscopic patch pairs from the image pair based on the neighborhood correspondence for each pixel to obtain correspondence estimates for each stereoscopic patch pair. Finally, the method includes refining the correspondence estimates for each stereoscopic patch pair to obtain a semi-dense set of feature correspondences by applying a geometric constraint to the correspondence estimates and retaining correspondences that satisfy the geometric constraint.
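A rough Python skeleton of the coarse stages, assuming grayscale uint8 inputs and approximating the dense matcher with Farneback optical flow (the abstract does not name a specific dense matcher), might look like this:

```python
# A rough skeleton of the coarse stages, approximating dense matching
# with Farneback optical flow. Assumes grayscale uint8 inputs; the stride
# and downsampling factor are assumptions.
import cv2

def coarse_correspondence_seeds(img1, img2, scale=4, stride=8):
    # Stage 1: dense matching on the downsampled image pair.
    small1 = cv2.resize(img1, None, fx=1.0 / scale, fy=1.0 / scale)
    small2 = cv2.resize(img2, None, fx=1.0 / scale, fy=1.0 / scale)
    flow = cv2.calcOpticalFlowFarneback(small1, small2, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Stage 2: lift the coarse flow to full-resolution neighborhood
    # estimates that will seed the per-patch sparse search windows.
    h, w = small1.shape[:2]
    seeds = []
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            dx, dy = flow[y, x]
            seeds.append(((x * scale, y * scale),
                          ((x + dx) * scale, (y + dy) * scale)))
    return seeds  # (point in img1, estimated correspondence in img2)
```

The returned seeds would then drive the per-patch sparse matching and the geometric-constraint refinement at full resolution.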