Abstract:
Disclosed embodiments pertain to apparatus, systems, and methods for mixed reality. In some embodiments, a camera pose relative to a tracked object in a live image may be determined and used to render synthetic images from keyframes in a 3D model without the tracked object. Optical flow magnitudes for pixels in a first mask region relative to a subset of the synthetic images may be determined and the optical flow magnitudes may be used to determine pixels in each of the subset of synthetic images that correspond to pixels in the first mask. For each pixel in the first mask, a corresponding replacement pixel may be determined as a function of pixels in the subset of synthetic images that correspond to the corresponding pixel in the first mask.
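The abstract does not name a specific flow algorithm or combining function. As a minimal sketch, the following assumes OpenCV's Farneback dense optical flow for the live-to-synthetic correspondences and a per-pixel median as the replacement function; the function name, the mean-magnitude cutoff, and the median choice are illustrative assumptions, not the claimed method.

```python
import numpy as np
import cv2

def replace_masked_pixels(live_bgr, mask, synthetic_views, max_flow=20.0):
    """Fill masked pixels of the live image from synthetic views.

    live_bgr: HxWx3 uint8 live frame; mask: HxW bool (region to replace);
    synthetic_views: list of HxWx3 uint8 images rendered without the object.
    """
    live_gray = cv2.cvtColor(live_bgr, cv2.COLOR_BGR2GRAY)
    ys, xs = np.nonzero(mask)
    candidates = []
    for synth in synthetic_views:
        synth_gray = cv2.cvtColor(synth, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            live_gray, synth_gray, None, pyr_scale=0.5, levels=3,
            winsize=15, iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        # Flow magnitudes in the mask gate which synthetic views are used.
        mag = np.hypot(flow[ys, xs, 0], flow[ys, xs, 1])
        if mag.mean() > max_flow:      # skip poorly aligned synthetic views
            continue
        # Follow each masked pixel's flow vector into the synthetic view.
        sx = np.clip(np.rint(xs + flow[ys, xs, 0]).astype(int), 0, synth.shape[1] - 1)
        sy = np.clip(np.rint(ys + flow[ys, xs, 1]).astype(int), 0, synth.shape[0] - 1)
        candidates.append(synth[sy, sx])
    out = live_bgr.copy()
    if candidates:
        # Replacement pixel = per-pixel median over all candidate sources.
        out[ys, xs] = np.median(np.stack(candidates), axis=0).astype(np.uint8)
    return out
```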
Abstract:
Methods, systems, computer-readable media, and apparatuses for hierarchical clustering for view management in augmented reality are presented. For example, one disclosed method includes the steps of accessing point of interest (POI) metadata for a plurality of points of interest associated with a scene; generating a hierarchical cluster tree for at least a portion of the POIs; establishing a plurality of subdivisions associated with the scene; selecting a plurality of POIs from the hierarchical cluster tree for display based on an augmented reality (AR) viewpoint of the scene, the plurality of subdivisions, and a traversal of at least a portion of the hierarchical cluster tree; and displaying labels comprising POI metadata associated with the selected plurality of POIs, the displaying based on placements determined using image-based saliency.
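As an illustration of the selection step, the sketch below builds the hierarchical cluster tree with SciPy's agglomerative linkage and keeps at most one representative POI per screen subdivision; the cluster count, grid size, and representative choice are assumptions, and the saliency-based label placement is elided.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def select_poi_labels(poi_xy, poi_names, n_clusters=12, grid=3):
    """Cluster POIs hierarchically, then keep at most one representative
    per screen subdivision so labels stay legible.

    poi_xy: (N, 2) screen-space POI positions in [0, 1); poi_names: list.
    """
    clusters = fcluster(linkage(poi_xy, method="average"),
                        t=n_clusters, criterion="maxclust")
    taken, selected = set(), []
    for cluster in np.unique(clusters):
        idx = np.nonzero(clusters == cluster)[0][0]     # cluster representative
        cell = tuple((poi_xy[idx] * grid).astype(int))  # screen subdivision
        if cell not in taken:                           # one label per cell
            taken.add(cell)
            selected.append(poi_names[idx])
    return selected
```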
Abstract:
Exemplary methods, apparatuses, and systems for performing wide area localization from simultaneous localization and mapping (SLAM) maps are disclosed. A mobile device can select a first keyframe from a keyframe-based SLAM map of the local environment built with one or more received images. A localization of the mobile device within the local environment can be determined based on the keyframe-based SLAM map. The mobile device can send the first keyframe to a server and receive a first global localization response representing a correction to a local map on the mobile device. The first global localization response can include rotation, translation, and scale information. A server can receive keyframes from a mobile device, and localize the keyframes within a server map by matching keyframe features received from the mobile device to server map features.
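The abstract states that the global localization response carries rotation, translation, and scale. One standard way to compute such a correction from matched 3D points is the Umeyama closed-form similarity alignment, sketched below under the assumption that the server has already matched keyframe features to server-map points; the function name and correspondence step are illustrative.

```python
import numpy as np

def umeyama_sim3(local_pts, server_pts):
    """Closed-form similarity alignment (Umeyama, 1991): returns scale s,
    rotation R, and translation t such that server ~= s * R @ local + t.

    local_pts, server_pts: (N, 3) arrays of corresponding 3D points.
    """
    mu_l = local_pts.mean(axis=0)
    mu_s = server_pts.mean(axis=0)
    # Cross-covariance between centered server and local point clouds.
    cov = (server_pts - mu_s).T @ (local_pts - mu_l) / len(local_pts)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                  # guard against reflections
    R = U @ S @ Vt
    var_l = ((local_pts - mu_l) ** 2).sum() / len(local_pts)
    s = np.trace(np.diag(D) @ S) / var_l
    t = mu_s - s * R @ mu_l
    return s, R, t
```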
Abstract:
The present disclosure relates to methods and apparatus for graphics processing. The apparatus can determine geometry information for each of a plurality of primitives associated with a viewpoint in a scene. The apparatus can also calculate at least one of surface information and disocclusion information based on the geometry information for each of the plurality of primitives, where the surface information and the disocclusion information may be associated with a volumetric grid based on a viewing area corresponding to the viewpoint. Also, the apparatus can calculate visibility information for each of the plurality of primitives based on at least one of the surface information and the disocclusion information, where the visibility information may be associated with the volumetric grid. The apparatus can also determine whether each of the plurality of primitives is visible based on the visibility information for each of the plurality of primitives.
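A toy version of the grid-based visibility test might look as follows: it bins primitive centers into a 2D grid over the viewing area and keeps the nearest depth per cell as "surface information". Real implementations would rasterize full primitives into a volumetric grid and track disocclusion regions, which this sketch omits; all names and the tolerance are assumptions.

```python
import numpy as np

def visibility_from_grid(tri_centers, tri_depths, grid_res=64, eps=1e-3):
    """Coarse per-primitive visibility against a depth grid.

    tri_centers: (N, 2) normalized [0, 1) screen positions of primitives;
    tri_depths: (N,) view-space depths. A primitive is called visible if
    its depth is within eps of the nearest depth stored in its cell.
    """
    cells = (tri_centers * grid_res).astype(int).clip(0, grid_res - 1)
    nearest = np.full((grid_res, grid_res), np.inf)
    # First pass: record the nearest surface depth per grid cell.
    for (cx, cy), d in zip(cells, tri_depths):
        nearest[cy, cx] = min(nearest[cy, cx], d)
    # Second pass: test each primitive against the stored surface info.
    return np.array([d <= nearest[cy, cx] + eps
                     for (cx, cy), d in zip(cells, tri_depths)])
```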
Abstract:
An example system includes a first computing device comprising a first graphics processing unit (GPU) implemented in circuitry, and a second computing device comprising a second GPU implemented in circuitry. The first GPU is configured to perform a first portion of an image rendering process to generate intermediate graphics data and send the intermediate graphics data to the second computing device. The second GPU is configured to perform a second portion of the image rendering process to render an image from the intermediate graphics data. The first computing device may be a video game console, and the second computing device may be a virtual reality (VR) headset that warps the rendered image to produce a stereoscopic image pair.
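As a sketch of the headset-side second portion, the code below reprojects a received frame toward each eye with a planar homography, a common rotation-only approximation of time warp; the split point, the homography model, and all names are assumptions rather than the disclosed implementation.

```python
import numpy as np
import cv2

def warp_to_stereo(rendered, K, rot_delta_left, rot_delta_right):
    """Warp a received frame into a stereoscopic pair.

    rendered: HxWx3 frame produced by the first GPU; K: 3x3 camera
    intrinsics; rot_delta_*: 3x3 rotations from the render pose to each
    eye's display pose. Rotation-only warp: H = K @ R @ inv(K).
    """
    K_inv = np.linalg.inv(K)
    H_left = K @ rot_delta_left @ K_inv
    H_right = K @ rot_delta_right @ K_inv
    h, w = rendered.shape[:2]
    left = cv2.warpPerspective(rendered, H_left, (w, h))
    right = cv2.warpPerspective(rendered, H_right, (w, h))
    return left, right
```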
Abstract:
Methods, systems, computer-readable media, and apparatuses for radiance transfer sampling for augmented reality are presented. In some embodiments, a method includes receiving at least one video frame of an environment. The method further includes generating a surface reconstruction of the environment. The method additionally includes projecting a plurality of rays within the surface reconstruction of the environment. Upon projecting the rays, the method includes generating illumination data of the environment from the at least one video frame. The method also includes determining a subset of rays from the plurality of rays in the environment based on areas within the environment needing refinement. The method further includes rendering a virtual object over the at least one video frame based on the plurality of rays excluding the subset of rays.
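One plausible reading of the ray-subset step is a variance test: rays whose illumination estimates disagree with their spatial neighbors are flagged as needing refinement and excluded from the current render. The threshold and neighbor structure below are assumptions beyond what the abstract states.

```python
import numpy as np

def partition_rays(ray_radiance, neighbor_idx, var_threshold=0.05):
    """Split sampled rays into a stable set (usable for shading now) and a
    needs-refinement subset.

    ray_radiance: (N,) current radiance estimate per ray;
    neighbor_idx: (N, k) indices of each ray's k spatial neighbors.
    A ray is flagged where its neighborhood radiance variance is high.
    """
    local_var = np.var(ray_radiance[neighbor_idx], axis=1)
    needs_refinement = local_var > var_threshold
    return ~needs_refinement, needs_refinement
```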
Abstract:
Techniques are presented for constructing a digital representation of a physical environment. In some embodiments, a method includes obtaining image data indicative of the physical environment; receiving gesture input data from a user corresponding to at least one location in the physical environment, based on the obtained image data; detecting at least one discontinuity in the physical environment near the at least one location corresponding to the received gesture input data; and generating a digital surface corresponding to a surface in the physical environment, based on the received gesture input data and the at least one discontinuity.
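A minimal sketch of the surface-generation step, assuming a depth map is available from the image data: pixels near the gestured location whose depth gradient exceeds a threshold are treated as discontinuities and excluded from a least-squares plane fit. The window size, threshold, and planar surface model are illustrative assumptions.

```python
import numpy as np

def surface_from_gesture(depth, point, win=25, edge_thresh=0.05):
    """Fit a planar digital surface around a user-indicated pixel.

    depth: HxW depth map; point: (x, y) pixel from the gesture input,
    assumed at least `win` pixels from the image border.
    Returns plane coefficients (a, b, c) for z = a*x + b*y + c.
    """
    x0, y0 = point
    ys, xs = np.mgrid[y0 - win:y0 + win + 1, x0 - win:x0 + win + 1]
    patch = depth[ys, xs]
    # Depth discontinuities show up as large local gradients.
    gy, gx = np.gradient(patch)
    smooth = np.hypot(gx, gy) < edge_thresh    # keep only continuous pixels
    A = np.column_stack([xs[smooth], ys[smooth], np.ones(smooth.sum())])
    coeffs, *_ = np.linalg.lstsq(A, patch[smooth], rcond=None)
    return coeffs
```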
Abstract:
A mobile device uses an image-driven view management approach for annotating images in real-time. An image-based layout process used by the mobile device computes a saliency map and generates an edge map from a frame of a video stream. The saliency map may be further processed by applying thresholds to reduce the number of saliency levels. The saliency map and edge map are used together to determine a layout position of labels to be rendered over the video stream. The labels are displayed in the layout position until a change of orientation of the camera that exceeds a threshold is detected. Additionally, the representation of the label may be adjusted, e.g., based on a plurality of pixels bounded by an area that is coincident with a layout position for a label in the video frame.
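To make the layout scoring concrete, the sketch below combines a Canny edge map with a crude local-contrast saliency proxy (the disclosure's actual saliency computation is not specified here) and picks the candidate label rectangle with the lowest summed cost via an integral image; thresholds and names are assumptions.

```python
import numpy as np
import cv2

def best_label_position(frame_gray, label_w, label_h, candidates):
    """Choose the cheapest label rectangle over a saliency+edge cost map.

    frame_gray: HxW uint8 video frame; candidates: list of (x, y) top-left
    corners, assumed to keep the label rectangle inside the frame.
    """
    edges = cv2.Canny(frame_gray, 50, 150).astype(np.float32) / 255.0
    # Local-contrast stand-in for a saliency map.
    blur = cv2.GaussianBlur(frame_gray.astype(np.float32), (21, 21), 0)
    saliency = np.abs(frame_gray.astype(np.float32) - blur) / 255.0
    cost_map = edges + saliency
    integral = cv2.integral(cost_map)    # (H+1, W+1) summed-area table

    def rect_cost(x, y):
        # Sum of cost_map over the label rectangle, via the integral image.
        return (integral[y + label_h, x + label_w] - integral[y, x + label_w]
                - integral[y + label_h, x] + integral[y, x])

    return min(candidates, key=lambda p: rect_cost(*p))
```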
Abstract:
Methods for determination of AR lighting with dynamic geometry are disclosed. A camera pose for a first image comprising a plurality of pixels may be determined, where each pixel in the first image comprises a depth value and a color value. The first image may correspond to a portion of a 3D model. A second image may be obtained by projecting the portion of the 3D model into a camera field of view based on the camera pose. A composite image comprising a plurality of composite pixels may be obtained based, in part, on the first image and the second image, where each composite pixel in a subset of the plurality of composite pixels is obtained, based, in part, on a corresponding absolute difference between a depth value of a corresponding pixel in the first image and a depth value of a corresponding pixel in the second image.
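A minimal sketch of the compositing rule, assuming the absolute depth difference selects between live and projected-model pixels; the threshold tau and the keep-live-where-dynamic policy are assumptions beyond what the abstract states.

```python
import numpy as np

def composite(live_rgb, live_depth, model_rgb, model_depth, tau=0.05):
    """Build the composite image from a live RGB-D frame and a projection
    of the 3D model rendered at the same camera pose.

    Where live and projected-model depths agree (static geometry), keep the
    model pixel; where they differ by more than tau (dynamic geometry),
    keep the live pixel. tau is an assumed threshold in depth units.
    """
    dynamic = np.abs(live_depth - model_depth) > tau
    out = model_rgb.copy()
    out[dynamic] = live_rgb[dynamic]
    return out
```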