Abstract:
Various embodiments of the present disclosure relate generally to systems and methods for automatic tagging of objects on a multi-view interactive digital media representation of a dynamic entity. According to particular embodiments, the spatial relationship between multiple images and video is analyzed together with location information data, for purposes of creating a representation referred to herein as a multi-view interactive digital media representation for presentation on a device. Multi-view interactive digital media representations correspond to multi-view interactive digital media representations of the dynamic objects in backgrounds. A first multi-view interactive digital media representation of a dynamic object is obtained. Next, the dynamic object is tagged. Then, a second multi-view interactive digital media representation of the dynamic object is generated. Finally, the dynamic object in the second multi-view interactive digital media representation is automatically identified and tagged.
Abstract:
Various embodiments of the present invention relate generally to systems and methods for artificially rendering images using viewpoint interpolation and extrapolation. According to particular embodiments, a method includes moving a set of control points perpendicular to a trajectory between a first frame and a second frame, where the first frame includes a first image captured from a first location and the second frame includes a second image captured from a second location. The set of control points is associated with a layer and each control point is moved based on an associated depth of the control point. The method also includes generating an artificially rendered image corresponding to a third location outside of the trajectory by extrapolating individual control points using the set of control points for the third location and extrapolating pixel locations using the individual control points.
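The control-point step described above can be sketched in a few lines of Python. This is a minimal illustration, not the claimed method: the abstract does not give the formula that relates the shift to depth, so the sketch assumes a simple parallax model in which each point moves by `offset / depth` along the direction perpendicular to the trajectory. The function names and sample values are hypothetical.

```python
from math import hypot


def perpendicular_unit(trajectory):
    """Return a unit vector perpendicular to a 2D trajectory direction."""
    tx, ty = trajectory
    length = hypot(tx, ty)
    if length == 0:
        raise ValueError("trajectory direction must be non-zero")
    return (-ty / length, tx / length)


def extrapolate_control_points(points, depths, trajectory, offset):
    """Shift each control point of a layer perpendicular to the trajectory.

    Assumption: the shift is offset / depth, so nearer points move more.
    """
    nx, ny = perpendicular_unit(trajectory)
    return [
        (x + nx * offset / depth, y + ny * offset / depth)
        for (x, y), depth in zip(points, depths)
    ]


# Two control points on one layer; the second is twice as far away.
layer_points = [(0.0, 0.0), (4.0, 0.0)]
layer_depths = [1.0, 2.0]
moved = extrapolate_control_points(layer_points, layer_depths, (1.0, 0.0), 2.0)
```

In this example the trajectory runs along the x axis, so both points move along y, and the nearer point moves twice as far as the farther one. Extrapolating pixel locations from the moved control points would be a separate step that is not shown here.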
Abstract:
Provided are mechanisms and processes for augmenting multi-view image data with synthetic objects using inertial measurement unit (IMU) and image data. In one example, a process includes receiving a selection of an anchor location in a reference image for a synthetic object to be placed within a multi-view image. Movements between the reference image and a target image are computed using visual tracking information associated with the multi-view image, device orientation corresponding to the multi-view image, and an estimate of the camera's intrinsic parameters. A first synthetic image is then generated by placing the synthetic object at the anchor location using visual tracking information in the multi-view image, orienting the synthetic object using the inverse of the movements computed between the reference image and the target image, and projecting the synthetic object along a ray into a target view associated with the target image. The first synthetic image is overlaid on the target image to generate an augmented image from the target view.
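One way to picture the projection step is the following Python sketch. It assumes a pinhole camera with a single focal length and principal point as the intrinsic estimate, and it models the movement between the reference and target images as a pure rotation about the vertical axis, as might be reported by an IMU. These simplifications, the sign convention, and all function names are assumptions for illustration; the abstract does not specify them.

```python
from math import cos, sin, radians


def pixel_to_ray(pixel, focal, cx, cy):
    """Back-project a pixel to a camera-space ray under a pinhole model."""
    u, v = pixel
    return ((u - cx) / focal, (v - cy) / focal, 1.0)


def rotate_about_y(ray, angle_deg):
    """Rotate a ray about the camera's vertical (y) axis."""
    x, y, z = ray
    a = radians(angle_deg)
    return (cos(a) * x + sin(a) * z, y, -sin(a) * x + cos(a) * z)


def ray_to_pixel(ray, focal, cx, cy):
    """Project a camera-space ray back onto the image plane."""
    x, y, z = ray
    if z <= 0:
        raise ValueError("ray points behind the target camera")
    return (focal * x / z + cx, focal * y / z + cy)


def anchor_in_target_view(anchor_pixel, camera_yaw_deg, focal, cx, cy):
    """Locate the anchor in the target image.

    The camera rotated by camera_yaw_deg between the reference and target
    images, so the anchor ray is rotated by the inverse of that movement.
    """
    ray = pixel_to_ray(anchor_pixel, focal, cx, cy)
    ray_in_target = rotate_about_y(ray, -camera_yaw_deg)
    return ray_to_pixel(ray_in_target, focal, cx, cy)


# Anchor at the image centre of a 640x480 reference image, focal length 500 px.
same_view = anchor_in_target_view((320.0, 240.0), 0.0, 500.0, 320.0, 240.0)
rotated_view = anchor_in_target_view((320.0, 240.0), 45.0, 500.0, 320.0, 240.0)
```

Under a pure rotation the anchor's depth along the ray does not affect where it lands in the target image, which is why the sketch needs no depth estimate. If the device also translated, a depth along the ray would have to be assumed or estimated.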
Abstract:
Various embodiments of the present disclosure relate generally to systems and methods for generating multi-view interactive digital media representations in a virtual reality environment. According to particular embodiments, a plurality of images is fused into a first content model and a first context model, both of which include multi-view interactive digital media representations of objects. Next, a virtual reality environment is generated using the first content model and the first context model. The virtual reality environment includes a first layer and a second layer. The user can navigate through and within the virtual reality environment to switch between multiple viewpoints of the content model via corresponding physical movements. The first layer includes the first content model and the second layer includes a second content model, and selection of the first layer provides access to the second layer with the second content model.
Abstract:
Various embodiments of the present invention relate generally to systems and methods for analyzing and manipulating images and video. According to particular embodiments, the spatial relationship between multiple images and video is analyzed together with location information data, for purposes of creating a representation referred to herein as a multi-view interactive digital media representation for presentation on a device. Once a multi-view interactive digital media representation is generated, a user can provide navigational inputs, such as via tilting of the device, which alter the presentation state of the multi-view interactive digital media representation. The navigational inputs can be analyzed to determine metrics which indicate a user's interest in the multi-view interactive digital media representation.
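As a hedged illustration of how navigational inputs might be turned into interest metrics, the Python sketch below summarizes a log of tilt samples. The abstract does not say which metrics are computed, so the three shown here (viewing duration, total angular travel, and range of angles covered) and the log format are assumptions.

```python
def interest_metrics(samples):
    """Summarize time-ordered (timestamp_seconds, view_angle_degrees) samples.

    The metric names are hypothetical; they are one plausible way to
    quantify how much a user explored a representation.
    """
    if len(samples) < 2:
        return {"duration_s": 0.0, "angular_travel_deg": 0.0, "angle_range_deg": 0.0}
    angles = [angle for _, angle in samples]
    travel = sum(abs(b - a) for a, b in zip(angles, angles[1:]))
    return {
        "duration_s": samples[-1][0] - samples[0][0],
        "angular_travel_deg": travel,
        "angle_range_deg": max(angles) - min(angles),
    }


# A user tilts right, back past the start, then right again over four seconds.
tilt_log = [(0.0, 0.0), (1.0, 10.0), (2.0, -5.0), (4.0, 5.0)]
metrics = interest_metrics(tilt_log)
```

A long duration combined with a large angular travel would suggest that the user explored the representation actively rather than glancing at it once.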
Abstract:
Various embodiments of the present invention relate generally to systems and processes for artificially rendering images using interpolation of tracked control points. According to particular embodiments, a set of control points is tracked between a first frame and a second frame, where the first frame includes a first image captured from a first location and the second frame includes a second image captured from a second location. An artificially rendered image corresponding to a third location is then generated by interpolating individual control points for the third location using the set of control points and interpolating pixel locations using the individual control points. The individual control points are used to transform image data.
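The two interpolation steps can be sketched as follows in Python. The abstract does not state how control points or pixels are interpolated, so this sketch assumes a linear blend for the control points and an inverse-distance-weighted average of control-point displacements for the pixels. Function names and sample coordinates are hypothetical.

```python
from math import hypot


def interpolate_control_points(points_a, points_b, t):
    """Linearly interpolate tracked control points; t=0 is frame 1, t=1 is frame 2."""
    return [
        ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
        for (xa, ya), (xb, yb) in zip(points_a, points_b)
    ]


def interpolate_pixel(pixel, points_a, points_t, eps=1e-9):
    """Move a frame-1 pixel by a weighted average of control-point displacements.

    Assumption: weights fall off with squared distance to each control point.
    """
    px, py = pixel
    weight_sum = dx = dy = 0.0
    for (xa, ya), (xt, yt) in zip(points_a, points_t):
        weight = 1.0 / (hypot(px - xa, py - ya) ** 2 + eps)
        weight_sum += weight
        dx += weight * (xt - xa)
        dy += weight * (yt - ya)
    return (px + dx / weight_sum, py + dy / weight_sum)


# Two control points tracked from frame 1 to frame 2.
points_frame1 = [(0.0, 0.0), (10.0, 0.0)]
points_frame2 = [(2.0, 0.0), (14.0, 0.0)]

# Control points and one pixel at the halfway (third) location.
points_mid = interpolate_control_points(points_frame1, points_frame2, 0.5)
pixel_mid = interpolate_pixel((5.0, 0.0), points_frame1, points_mid)
```

The pixel at (5, 0) sits halfway between the two control points, so it inherits the average of their displacements and moves 1.5 pixels to the right.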
Abstract:
Various embodiments of the present invention relate generally to systems and methods for analyzing and manipulating images and video. According to particular embodiments, the spatial relationship between multiple images and video is analyzed together with location information data, for purposes of creating a representation referred to herein as a surround view for presentation on a device. A visual guide can be provided for capturing the multiple images used in the surround view. The visual guide can be a synthetic object that is rendered in real-time into the images output to a display of an image capture device. The visual guide can help a user keep the image capture device moving along a desired trajectory.
Abstract:
Various embodiments of the present invention relate generally to mechanisms and processes relating to artificially rendering images using viewpoint interpolation and extrapolation. According to particular embodiments, a method includes applying a transform to estimate a path outside the trajectory between a first frame and a second frame, where the first frame includes a first image captured from a first location and the second frame includes a second image captured from a second location. The process also includes generating an artificially rendered image corresponding to a third location positioned on the path. The artificially rendered image is generated by interpolating a transformation from the first location to the third location and from the third location to the second location, gathering image information from the first frame and the second frame by transferring first image information from the first frame to the third frame and second image information from the second frame to the third frame, and combining the first image information and the second image information.
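The three rendering steps (interpolating the transformation, transferring image information from both frames, and combining it) can be illustrated with a deliberately simplified Python sketch. It assumes the transformation is a horizontal translation, that images are single rows of pixel values, and that the two transferred rows are combined by a cross-fade weighted by the position of the third location. None of these choices comes from the abstract, and the function names are hypothetical.

```python
def interpolate_translation(translation, t):
    """Split a frame-1-to-frame-2 translation at fraction t of the path."""
    tx, ty = translation
    first_to_third = (t * tx, t * ty)
    third_to_second = ((1 - t) * tx, (1 - t) * ty)
    return first_to_third, third_to_second


def shift_row(row, shift, fill=0.0):
    """Shift a row of pixel values right by an integer number of pixels."""
    n = len(row)
    return [row[i - shift] if 0 <= i - shift < n else fill for i in range(n)]


def render_third_row(row1, row2, shift_1_to_2, t):
    """Transfer both rows into the third view and cross-fade them."""
    (s13, _), (s32, _) = interpolate_translation((shift_1_to_2, 0), t)
    from_first = shift_row(row1, round(s13))    # move frame 1 forward
    from_second = shift_row(row2, -round(s32))  # move frame 2 backward
    return [(1 - t) * a + t * b for a, b in zip(from_first, from_second)]


# A bright pixel moves two positions to the right between the frames.
row_frame1 = [1.0, 0.0, 0.0, 0.0, 0.0]
row_frame2 = [0.0, 0.0, 1.0, 0.0, 0.0]
row_third = render_third_row(row_frame1, row_frame2, 2, 0.5)
```

At the halfway point both transferred rows place the bright pixel at index 1, so the cross-fade keeps it at full intensity instead of producing two half-bright ghosts.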
Abstract:
Various examples of the present disclosure include techniques and mechanisms for generating a surround view. According to various examples, a surround view is constructed from multiple images that are captured from different locations. A computer processor is used to create a three dimensional model that includes the content and context of the surround view. In some examples, the content and context can be segmented such that separate three dimensional models can be provided for each of the content of the surround view and the context of the surround view.