-
Publication No.: US12046002B1
Publication Date: 2024-07-23
Application No.: US17684197
Application Date: 2022-03-01
Applicant: Amazon Technologies, Inc.
Inventor: Xiaohan Nie, Michael Thomas Pecchia, Leo Chan, Ahmed Aly Saad Ahmed, Muhammad Raffay Hamid, Sheng Liu
Abstract: Systems, devices, and methods are provided for depth-guided structure from motion. A system may obtain a plurality of image frames from a digital content item that corresponds to a scene and determine, based at least in part on a correspondence search, a set of 2-D keypoints for the plurality of image frames. A depth estimator may be used to determine a plurality of dense depth maps for the plurality of image frames. The set of 2-D keypoints and the plurality of dense depth maps may be used to determine a corresponding set of depth priors. Initialization and/or depth-regularized optimization may be performed using the keypoints and depth priors.
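One way the step from 2-D keypoints and dense depth maps to depth priors could look is a per-keypoint lookup into the depth map. The sketch below is illustrative only and is not taken from the patent: the function name `sample_depth_priors`, the nearest-pixel sampling, and the rule that non-positive depths are treated as invalid are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Keypoint = Tuple[float, float]  # (x, y) in pixel coordinates (assumed convention)


def sample_depth_priors(
    keypoints: Sequence[Keypoint],
    depth_map: Sequence[Sequence[float]],
) -> List[Optional[float]]:
    """Return one depth prior per keypoint using nearest-pixel sampling.

    depth_map is indexed as depth_map[row][col]. Keypoints that fall outside
    the map, or land on a non-positive depth, get None (assumed policy).
    """
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    priors: List[Optional[float]] = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        inside = 0 <= row < height and 0 <= col < width
        if inside and depth_map[row][col] > 0:
            priors.append(depth_map[row][col])
        else:
            priors.append(None)
    return priors
```

A real pipeline would more likely use bilinear interpolation and a per-frame scale alignment of the estimated depth; nearest-pixel sampling is used here only to keep the sketch short.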
-
Publication No.: US20240346686A1
Publication Date: 2024-10-17
Application No.: US18749025
Application Date: 2024-06-20
Applicant: Amazon Technologies, Inc.
Inventor: Xiaohan Nie, Michael Thomas Pecchia, Leo Chan, Ahmed Aly Saad Ahmed, Muhammad Raffay Hamid, Sheng Liu
Abstract: Systems, devices, and methods are provided for depth-guided structure from motion. A system may obtain a plurality of image frames from a digital content item that corresponds to a scene and determine, based at least in part on a correspondence search, a set of 2-D keypoints for the plurality of image frames. A depth estimator may be used to determine a plurality of dense depth maps for the plurality of image frames. The set of 2-D keypoints and the plurality of dense depth maps may be used to determine a corresponding set of depth priors. Initialization and/or depth-regularized optimization may be performed using the keypoints and depth priors.
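The abstract does not state the form of the depth-regularized objective. A common way to express such a term, shown here purely as an assumed sketch, is to add a weighted squared difference between each point's depth and its prior to the usual squared reprojection error. The pinhole model, the function name `depth_regularized_cost`, and the `weight` parameter are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def depth_regularized_cost(
    points_cam: Sequence[Tuple[float, float, float]],  # (X, Y, Z) in camera frame
    observations: Sequence[Tuple[float, float]],       # observed keypoints (u, v)
    depth_priors: Sequence[Optional[float]],           # prior depth, or None
    focal: float,
    principal_point: Tuple[float, float],
    weight: float = 0.1,                                # assumed regularization weight
) -> float:
    """Sum of squared reprojection errors plus a weighted depth-prior penalty."""
    cx, cy = principal_point
    total = 0.0
    for (X, Y, Z), (u, v), prior in zip(points_cam, observations, depth_priors):
        # Pinhole projection of the 3-D point into the image.
        u_hat = focal * X / Z + cx
        v_hat = focal * Y / Z + cy
        total += (u_hat - u) ** 2 + (v_hat - v) ** 2
        # Penalize disagreement with the depth prior when one is available.
        if prior is not None:
            total += weight * (Z - prior) ** 2
    return total
```

In an actual structure-from-motion system this cost would be minimized over camera poses and point positions by a nonlinear least-squares solver; the function above only evaluates it for fixed values.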
-
Publication No.: US10586369B1
Publication Date: 2020-03-10
Application No.: US15885369
Application Date: 2018-01-31
Applicant: Amazon Technologies, Inc.
Inventor: Kyle Michael Roche, David Chiapperino, Christine Morten, Kathleen Alison Curry, Leo Chan
Abstract: One or more services may generate audio data and animations of an avatar based on input text. A speech input ingestion (SII) service may identify tags of objects in a virtual environment and associate tags of those objects with words in the input text, which may be stored as metadata in speech markup data. This association may enable an animation service to generate gestures toward objects while animating an avatar, or may be used to create animations or effects of the object. The SII service may analyze input text to identify dialog including multiple speakers associated with the text. The SII service may create metadata to associate certain words with respective speakers (avatars) of those words, which may be processed by the animation service to animate multiple avatars speaking the dialog.
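The association of object tags with words in the input text can be pictured as a simple matching pass that emits metadata records. The sketch below is a hypothetical illustration, not the SII service described in the patent: the function name `tag_words`, the exact-match rule, and the record fields (`word_index`, `word`, `tag`) are assumptions.

```python
import re
from typing import Dict, Iterable, List, Union


def tag_words(text: str, object_tags: Iterable[str]) -> List[Dict[str, Union[int, str]]]:
    """Return one metadata record for each word that matches an object tag.

    Matching is case-insensitive and exact (assumed policy); word_index counts
    words from zero in reading order.
    """
    tags = {tag.lower() for tag in object_tags}
    records: List[Dict[str, Union[int, str]]] = []
    for index, match in enumerate(re.finditer(r"[A-Za-z']+", text)):
        word = match.group(0)
        if word.lower() in tags:
            records.append({"word_index": index, "word": word, "tag": word.lower()})
    return records
```

An animation step could then read these records to time a gesture toward the tagged object when the matching word is spoken.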
-