Abstract:
A system and method are provided for detecting user manipulation of an inanimate object and interpreting that manipulation as input. In one aspect, the manipulation may be detected by an image capturing component of a computing device and interpreted as an instruction to execute a command, such as opening a drawing application in response to a user picking up a pen. The manipulation may also be detected with the aid of an audio capturing device, e.g., a microphone on the computing device.
Abstract:
An exemplary method includes prompting a user to capture video data at a location. The location is associated with navigation directions for the user. Information representing visual orientation and positioning information associated with the captured video data is received by one or more computing devices, and a stored data model representing a 3D geometry depicting objects associated with the location is accessed. One or more candidate change regions are detected between corresponding images from the captured video data and projections of the 3D geometry. Each candidate change region indicates an area of visual difference between the captured video data and the projections. When it is detected that a count of the one or more candidate change regions is below a threshold, the stored data model is updated with at least part of the captured video data based on the visual orientation and positioning information associated with the captured video data.
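The count-versus-threshold decision described above can be illustrated with a minimal sketch. The abstract does not say how candidate change regions are found, so block-wise image differencing is used here purely as a stand-in, and the names `candidate_change_regions`, `should_update_model`, `block`, `diff_threshold`, and `max_regions` are hypothetical.

```python
import numpy as np

def candidate_change_regions(captured, projected, block=4, diff_threshold=0.2):
    """Return (row, col) block indices where the two grayscale images differ.

    Assumption: a "candidate change region" is a block whose mean absolute
    pixel difference exceeds diff_threshold. The source does not specify this.
    """
    captured = np.asarray(captured, dtype=float)
    projected = np.asarray(projected, dtype=float)
    if captured.shape != projected.shape:
        raise ValueError("images must have the same shape")
    height, width = captured.shape
    regions = []
    for r in range(0, height, block):
        for c in range(0, width, block):
            diff = np.abs(captured[r:r + block, c:c + block]
                          - projected[r:r + block, c:c + block])
            if diff.mean() > diff_threshold:
                regions.append((r // block, c // block))
    return regions

def should_update_model(image_pairs, max_regions=3, **kwargs):
    """Update only when the total count of change regions is below the threshold."""
    total = sum(len(candidate_change_regions(a, b, **kwargs))
                for a, b in image_pairs)
    return total < max_regions

# Usage: one captured frame differs from its projection in a single block.
projected = np.zeros((8, 8))
captured = projected.copy()
captured[0:4, 0:4] = 1.0
print(should_update_model([(captured, projected)], max_regions=3))
```

A low count suggests the captured video still matches the stored geometry closely enough to be merged into it; a high count would leave the model untouched.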
Abstract:
Aspects of the disclosure relate generally to generating depth data from a video. As an example, one or more computing devices may receive an initialization request for a still image capture mode. After receiving the request to initialize the still image capture mode, the one or more computing devices may automatically begin to capture a video including a plurality of image frames. The one or more computing devices track features between a first image frame of the video and each of the other image frames of the video. Points corresponding to the tracked features may be generated by the one or more computing devices using a set of assumptions. The assumptions may include a first assumption that there is no rotation and a second assumption that there is no translation. The one or more computing devices then generate a depth map based at least in part on the points.
Abstract:
Systems and methods relate to a camera rig and to generating stereoscopic panoramas from captured images for display in a virtual reality (VR) environment.
Abstract:
Systems and methods are described for defining a set of images based on captured images, receiving a viewing direction associated with a user of a virtual reality (VR) head mounted display, and receiving an indication of a change in the viewing direction. The methods further include configuring a re-projection of a portion of the set of images, the re-projection based at least in part on the changed viewing direction and a field of view associated with the captured images, and converting the portion from a spherical perspective projection into a planar perspective projection. The methods also include rendering, by a computing device and for display in the VR head mounted display, an updated view based on the re-projection, the updated view configured to correct distortion and provide stereo parallax in the portion, and providing, to the head mounted display, the updated view including a stereo panoramic scene corresponding to the changed viewing direction.
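The spherical-to-planar conversion step can be sketched as follows. This is an illustration only: it assumes the panorama is stored in equirectangular form, uses nearest-neighbour sampling, and handles a single eye, so it does not produce the stereo parallax the abstract mentions. The function name `equirect_to_planar` and all of its parameters are hypothetical.

```python
import numpy as np

def equirect_to_planar(pano, yaw, pitch, fov, out_w, out_h):
    """Sample a planar perspective view from an equirectangular panorama.

    yaw, pitch and fov are in radians; fov is the horizontal field of view.
    """
    pano = np.asarray(pano)
    pano_h, pano_w = pano.shape[:2]
    # Focal length in pixels for the requested horizontal field of view.
    f = (out_w / 2.0) / np.tan(fov / 2.0)
    # Ray through each output pixel: x right, y up, z forward.
    xs = np.arange(out_w) - (out_w - 1) / 2.0
    ys = (out_h - 1) / 2.0 - np.arange(out_h)
    x, y = np.meshgrid(xs, ys)
    z = np.full_like(x, f)
    # Rotate the rays: pitch about the x axis, then yaw about the vertical axis.
    cp, sp = np.cos(pitch), np.sin(pitch)
    y, z = y * cp + z * sp, -y * sp + z * cp
    cy, sy = np.cos(yaw), np.sin(yaw)
    x, z = x * cy + z * sy, -x * sy + z * cy
    # Convert each ray to longitude/latitude, then to panorama pixel indices.
    lon = np.arctan2(x, z)
    lat = np.arctan2(y, np.hypot(x, z))
    cols = np.floor((lon + np.pi) / (2 * np.pi) * pano_w).astype(int) % pano_w
    rows = np.floor((np.pi / 2 - lat) / np.pi * pano_h).astype(int)
    rows = np.clip(rows, 0, pano_h - 1)
    return pano[rows, cols]

# Usage: a tiny panorama whose pixel value is its column index.
pano = np.tile(np.arange(4), (2, 1))
view = equirect_to_planar(pano, yaw=np.pi / 4, pitch=0.0,
                          fov=np.pi / 3, out_w=3, out_h=3)
print(view.shape)
```

Re-running the function with a new `yaw` and `pitch` whenever the head pose changes corresponds to the "updated view" in the abstract; a stereo version would need one such view per eye from suitably offset source images.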
Abstract:
Systems and methods for the generation of depth data for a scene using images captured by a camera-enabled mobile device are provided. According to a particular implementation of the present disclosure, a reference image can be captured of a scene with an image capture device, such as an image capture device integrated with a camera-enabled mobile device. A short video or sequence of images can then be captured from multiple different poses relative to the reference scene. The captured image and video can then be processed using computer vision techniques to produce an image with associated depth data, such as an RGBZ image.
Abstract:
Aspects of the disclosure relate to providing users with sequences of images of physical locations over time or time-lapses. In order to do so, a set of images of a physical location may be identified. From the set of images, a representative image may be selected. The set may then be filtered by comparing the other images in the set to the representative image. The images in the filtered set may then be aligned to the representative image. From this set, a time-lapsed sequence of images may be generated, and the amount of change in the time-lapsed sequence of images may be determined. At the request of a user device for a time-lapsed image representation of the specified physical location, the generated time-lapsed sequence of images may be provided.
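The select, filter, and measure steps above can be sketched roughly as follows. The abstract does not define how the representative image is chosen or how images are compared, so this sketch assumes a medoid under mean absolute pixel difference, and it omits the alignment step entirely. `build_time_lapse`, `timestamps`, and `max_distance` are hypothetical names.

```python
import numpy as np

def build_time_lapse(images, timestamps, max_distance=0.25):
    """Return (representative index, ordered kept indices, amount of change).

    Assumptions: all images share one shape, and similarity is the mean
    absolute pixel difference. Alignment to the representative is skipped.
    """
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    n = len(stack)
    flat = stack.reshape(n, -1)
    # Pairwise mean absolute difference between every two images.
    dist = np.abs(flat[:, None, :] - flat[None, :, :]).mean(axis=2)
    # Representative image: the one closest to all others (the medoid).
    rep = int(np.argmin(dist.sum(axis=1)))
    # Filter: keep only images sufficiently similar to the representative.
    keep = [i for i in range(n) if dist[rep, i] <= max_distance]
    # Order the surviving images by capture time to form the sequence.
    keep.sort(key=lambda i: timestamps[i])
    frames = stack[keep]
    # Amount of change: mean absolute difference between consecutive frames.
    change = float(np.abs(np.diff(frames, axis=0)).mean()) if len(keep) > 1 else 0.0
    return rep, keep, change

# Usage: the third image is an outlier and is filtered out.
images = [np.zeros((2, 2)), np.full((2, 2), 0.1), np.ones((2, 2))]
rep, keep, change = build_time_lapse(images, timestamps=[3, 1, 2])
print(rep, keep, change)
```

The pairwise distance matrix grows quadratically with the number of images, so a real pipeline would likely compare compact descriptors rather than raw pixels.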
Abstract:
Aspects of the disclosure relate to capturing panoramic images using a computing device. For example, the computing device may record a set of video frames, and tracking features, each including one or more features that appear in two or more video frames of the set, may be determined. A set of frame-based features based on the displacement of the tracking features between two or more video frames of the set may be determined by the computing device. A set of historical feature values based on the set of frame-based features may also be determined by the computing device. The computing device may then determine whether a user is attempting to capture a panoramic image based on the set of historical feature values. In response, the computing device may capture a panoramic image.
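The chain from tracking features to frame-based features to historical values can be sketched as below. The abstract does not name the specific features or the decision rule, so this sketch assumes the frame-based feature is the median displacement of tracked points and the historical value is its mean over a sliding window. `PanoramaDetector`, `frame_feature`, `window`, `min_dx`, and `max_dy` are hypothetical.

```python
from collections import deque
import numpy as np

def frame_feature(prev_pts, curr_pts):
    """Median (dx, dy) displacement of tracked points between two frames."""
    delta = np.asarray(curr_pts, dtype=float) - np.asarray(prev_pts, dtype=float)
    return np.median(delta, axis=0)

class PanoramaDetector:
    """Flags a likely panorama attempt after sustained horizontal motion."""

    def __init__(self, window=5, min_dx=2.0, max_dy=1.0):
        self.history = deque(maxlen=window)  # recent frame-based features
        self.min_dx = min_dx                 # assumed minimum sideways motion
        self.max_dy = max_dy                 # assumed maximum vertical drift

    def update(self, prev_pts, curr_pts):
        self.history.append(frame_feature(prev_pts, curr_pts))
        if len(self.history) < self.history.maxlen:
            return False  # not enough history yet
        # Historical feature values: mean displacement over the window.
        mean_dx, mean_dy = np.mean(list(self.history), axis=0)
        return bool(abs(mean_dx) >= self.min_dx and abs(mean_dy) <= self.max_dy)

# Usage: three tracked points sliding 3 pixels sideways per frame.
detector = PanoramaDetector(window=3)
pts = np.array([[0.0, 0.0], [10.0, 5.0], [3.0, 7.0]])
for _ in range(3):
    moved = pts + np.array([3.0, 0.0])
    print(detector.update(pts, moved))
    pts = moved
```

Requiring a full window before deciding is a design choice that trades latency for fewer false triggers from a single jittery frame.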
Abstract:
An exemplary method for navigating among photos includes determining, using one or more computing devices, visual characteristics of a person depicted in a first image associated with a first location. These visual characteristics of the person are detected in a second image associated with a second location. Using the one or more computing devices, a series of intermediate images is identified based on the first location and the second location. Each intermediate image is associated with a location. The series of intermediate images and the second image are provided. Images of an intermediate destination from the series of intermediate images are selected based on a density of images at the intermediate destination. A 3D reconstruction of the intermediate destination is then generated based on the selected images. Thereafter, a visual presentation of images traversing through the 3D reconstruction of the intermediate destination to the second image is prepared for display.
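The density-based selection of an intermediate destination can be illustrated with a small sketch. The abstract does not define how density is measured, so this assumes image locations are binned into a regular grid and the most populated cell is chosen. `densest_destination`, `image_locations`, and `cell_size` are hypothetical.

```python
import math
from collections import defaultdict

def densest_destination(image_locations, cell_size=1.0):
    """Return the sorted ids of the images in the most populated grid cell.

    image_locations maps an image id to a (lat, lon) pair. Binning into square
    cells is an assumed stand-in for whatever density measure is really used.
    """
    cells = defaultdict(list)
    for image_id, (lat, lon) in image_locations.items():
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[key].append(image_id)
    if not cells:
        return []
    return sorted(max(cells.values(), key=len))

# Usage: three images cluster in one cell, one image lies elsewhere.
locations = {
    "a": (10.2, 20.3),
    "b": (10.4, 20.5),
    "c": (10.9, 20.1),
    "d": (50.5, 60.5),
}
print(densest_destination(locations))
```

The selected images would then feed the 3D reconstruction step, which is outside the scope of this sketch.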