THREE-DIMENSIONAL OBJECT RECONSTRUCTION FROM A VIDEO

    公开(公告)号:US20220270318A1

    公开(公告)日:2022-08-25

    申请号:US17734244

    申请日:2022-05-02

    Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object construction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.

    Estimating depth for a video stream captured with a monocular rgb camera

    公开(公告)号:US10984545B2

    公开(公告)日:2021-04-20

    申请号:US16439539

    申请日:2019-06-12

    Abstract: Techniques for estimating depth for a video stream captured by a monocular image sensor are disclosed. A sequence of image frames are captured by the monocular image sensor. A first neural network is configured to process at least a portion of the sequence of image frames to generate a depth probability volume. The depth probability volume includes a plurality of probability maps corresponding to a number of discrete depth candidate locations over a range of depths defined for the scene. The depth probability volume can be updated using a second neural network that is configured to generate adaptive gain parameters to integrate the DPVs over time. A third neural network is configured to refine the updated depth probability volume from a lower resolution to a higher resolution that matches the original resolution of the sequence of image frames. A depth map can be calculated based on the depth probability volume.

    Guided hallucination for missing image content using a neural network

    公开(公告)号:US10922793B2

    公开(公告)日:2021-02-16

    申请号:US16353195

    申请日:2019-03-14

    Abstract: Missing image content is generated using a neural network. In an embodiment, a high resolution image and associated high resolution semantic label map are generated from a low resolution image and associated low resolution semantic label map. The input image/map pair (low resolution image and associated low resolution semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, data missing in the input image/map pair is improvised or hallucinated by a neural network, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed in portion of an image. Missing content is hallucinated to generate different variations of an image, such as different seasons or weather conditions for a driving video.

    FAST MULTI-SCALE POINT CLOUD REGISTRATION WITH A HIERARCHICAL GAUSSIAN MIXTURE

    公开(公告)号:US20190319851A1

    公开(公告)日:2019-10-17

    申请号:US16351312

    申请日:2019-03-12

    Abstract: Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, object/scene recognition, and augmented reality (AR). A new registration algorithm is presented that achieves speed and accuracy by registering a point cloud to a representation of a reference point cloud. A target point cloud is registered to the reference point cloud by iterating through a number of cycles of an EM algorithm where, during an Expectation step, each point in the target point cloud is associated with a node of a hierarchical tree data structure and, during a Maximization step, an estimated transformation is determined based on the association of the points with corresponding nodes of the hierarchical tree data structure. The estimated transformation is determined by solving a minimization problem associated with a sum, over a number of mixture components, over terms related to a Mahalanobis distance.

    Learning-Based Camera Pose Estimation From Images of an Environment

    公开(公告)号:US20190108651A1

    公开(公告)日:2019-04-11

    申请号:US16137064

    申请日:2018-09-20

    Abstract: A deep neural network (DNN) system learns a map representation for estimating a camera position and orientation (pose). The DNN is trained to learn a map representation corresponding to the environment, defining positions and attributes of structures, trees, walls, vehicles, walls, etc. The DNN system learns a map representation that is versatile and performs well for many different environments (indoor, outdoor, natural, synthetic, etc.). The DNN system receives images of an environment captured by a camera (observations) and outputs an estimated camera pose within the environment. The estimated camera pose is used to perform camera localization, i.e., recover the three-dimensional (3D) position and orientation of a moving camera, which is a fundamental task in computer vision with a wide variety of applications in robot navigation, car localization for autonomous driving, device localization for mobile navigation, and augmented/virtual reality.

    View synthesis for dynamic scenes

    公开(公告)号:US11546568B1

    公开(公告)日:2023-01-03

    申请号:US16811356

    申请日:2020-03-06

    Abstract: Apparatuses, systems, and techniques are presented to perform monocular view synthesis of a dynamic scene. Single and multi-view depth information can be determined for a collection of images of a dynamic scene, and a blender network can be used to combine image features for foreground, background, and missing image regions using fused depth maps inferred form the single and multi-view depth information.

    Three-dimensional object reconstruction from a video

    公开(公告)号:US11354847B2

    公开(公告)日:2022-06-07

    申请号:US16945455

    申请日:2020-07-31

    Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object construction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.

Patent Agency Ranking