-
Publication No.: US20240253217A1
Publication date: 2024-08-01
Application No.: US18538248
Filing date: 2023-12-13
Applicant: NVIDIA Corporation
Inventor: Arash Vahdat , Hongxu Yin , Jan Kautz , Jiaming Song , Ming-Yu Liu , Morteza Mardani , Qinsheng Zhang
IPC: B25J9/16
CPC classification number: B25J9/163 , B25J9/1664 , B25J9/1697
Abstract: Apparatuses, systems, and techniques to calculate a combined loss value by applying one or more loss functions to a plurality of samples generated by a diffusion model, and to update the samples based on that value to determine synthesized motions of one or more objects.
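The combined-loss update described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the diffusion model is omitted (a plain list of floats stands in for one generated motion sample), the two loss functions and their weights are hypothetical, and the gradient is approximated by finite differences.

```python
# Hypothetical sketch: a weighted sum of losses guides one motion sample.
# The sample is a 1D trajectory standing in for a diffusion model's output.

def smoothness_loss(motion):
    # Penalize large jumps between consecutive positions (assumed loss).
    return sum((b - a) ** 2 for a, b in zip(motion, motion[1:]))

def target_loss(motion, target=1.0):
    # Penalize ending far from a goal position (assumed loss).
    return (motion[-1] - target) ** 2

def combined_loss(motion, weights=(1.0, 5.0)):
    # Combined loss value: weighted sum of the individual losses.
    return weights[0] * smoothness_loss(motion) + weights[1] * target_loss(motion)

def guide_sample(motion, step=0.05, eps=1e-4):
    # One gradient-descent update of the sample on the combined loss,
    # using a forward finite-difference estimate of the gradient.
    base = combined_loss(motion)
    grad = []
    for i in range(len(motion)):
        bumped = list(motion)
        bumped[i] += eps
        grad.append((combined_loss(bumped) - base) / eps)
    return [x - step * g for x, g in zip(motion, grad)]

sample = [0.0, 0.5, 0.2, 0.9]
updated = guide_sample(sample)
print(combined_loss(sample), combined_loss(updated))
```

In a real sampler an update of this kind would presumably be applied during or after denoising; the abstract does not specify where, so the sketch applies it to a finished sample only.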
-
Publication No.: US11941719B2
Publication date: 2024-03-26
Application No.: US16255038
Filing date: 2019-01-23
Applicant: Nvidia Corporation
Inventor: Jonathan Tremblay , Stan Birchfield , Stephen Tyree , Thang To , Jan Kautz , Artem Molchanov
CPC classification number: G06T1/0014 , B25J9/161 , B25J9/1661 , B25J9/1697 , G05B13/00 , G06N3/08 , G06T7/73 , G05D1/0088 , G05D1/0221 , G05D2201/0213 , G06T2207/20081 , G06T2207/20084
Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
-
Publication No.: US20240070874A1
Publication date: 2024-02-29
Application No.: US18135654
Filing date: 2023-04-17
Applicant: NVIDIA Corporation
Inventor: Muhammed Kocabas , Ye Yuan , Umar Iqbal , Pavlo Molchanov , Jan Kautz
CPC classification number: G06T7/20 , G06T7/70 , G06T2207/20084 , G06T2207/30196 , G06T2207/30252 , G06T2210/12
Abstract: Estimating motion of a human or other object in video is a common computer task with applications in robotics, sports, mixed reality, etc. However, motion estimation becomes difficult when the camera capturing the video is moving, because the observed object and camera motions are entangled. The present disclosure provides for joint estimation of the motion of a camera and the motion of articulated objects captured in video by the camera.
-
Publication No.: US20240054720A1
Publication date: 2024-02-15
Application No.: US17886081
Filing date: 2022-08-11
Applicant: Nvidia Corporation
Inventor: Sanja Fidler , Zian Wang , Jan Kautz , Wenzheng Chen
CPC classification number: G06T15/506 , G06T5/009 , G06T7/586 , G06T2207/20081 , G06T2207/20208
Abstract: Systems and methods generate a hybrid lighting model for rendering objects within an image. The hybrid lighting model includes lighting effects attributed to a first source, such as the sun, and to a second source, such as spatially-varying effects of objects within the image. The hybrid lighting model may be generated for an input image and then one or more virtual objects may be rendered to appear as if part of the input image, where the hybrid lighting model is used to apply one or more lighting effects to the one or more virtual objects.
-
Publication No.: US20230368501A1
Publication date: 2023-11-16
Application No.: US18114177
Filing date: 2023-02-24
Applicant: NVIDIA Corporation
Inventor: Seonwook Park , Shalini De Mello , Pavlo Molchanov , Umar Iqbal , Jan Kautz
IPC: G06V10/772 , G06F7/57 , G06F17/18 , G06N3/088 , G06N3/045 , G06N3/047 , G06V10/774 , G06V10/82
CPC classification number: G06V10/772 , G06F7/57 , G06F17/18 , G06N3/088 , G06N3/045 , G06N3/047 , G06V10/774 , G06V10/82
Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
-
Publication No.: US11748887B2
Publication date: 2023-09-05
Application No.: US16378464
Filing date: 2019-04-08
Applicant: NVIDIA Corporation
Inventor: Varun Jampani , Wei-Chih Hung , Sifei Liu , Pavlo Molchanov , Jan Kautz
IPC: G06V10/00 , G06T7/11 , G06T7/143 , G06F17/15 , G06N3/088 , G06F18/40 , G06N3/045 , G06N3/047 , G06V10/764 , G06V10/82 , G06V10/94 , G06V20/40
CPC classification number: G06T7/11 , G06F17/15 , G06F18/40 , G06N3/045 , G06N3/047 , G06N3/088 , G06T7/143 , G06V10/764 , G06V10/82 , G06V10/945 , G06V20/41
Abstract: Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
-
Publication No.: US11594006B2
Publication date: 2023-02-28
Application No.: US16998914
Filing date: 2020-08-20
Applicant: NVIDIA Corporation
Inventor: Xiaodong Yang , Xitong Yang , Sifei Liu , Jan Kautz
Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.
-
Publication No.: US11496773B2
Publication date: 2022-11-08
Application No.: US17352064
Filing date: 2021-06-18
Applicant: NVIDIA Corporation
Inventor: Yi-Hsuan Tsai , Ming-Yu Liu , Deqing Sun , Ming-Hsuan Yang , Jan Kautz
IPC: H04N19/85 , H04N19/91 , H04N19/436 , H04N19/46
Abstract: A method, computer readable medium, and system are disclosed for identifying residual video data. This data describes data that is lost during a compression of original video data. For example, the original video data may be compressed and then decompressed, and this result may be compared to the original video data to determine the residual video data. This residual video data is transformed into a smaller format by means of encoding, binarizing, and compressing, and is sent to a destination. At the destination, the residual video data is transformed back into its original format and is used during the decompression of the compressed original video data to improve a quality of the decompressed original video data.
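The residual pipeline in this abstract (compress, decompress, compare, binarize, compress the residual, then correct at the destination) can be sketched as follows. This is a toy illustration under stated assumptions, not the patented method: coarse quantization stands in for the lossy video codec, the residual is binarized to its sign only, `zlib` stands in for the residual compressor, and `STEP` and `MAGNITUDE` are hypothetical constants.

```python
import zlib

STEP = 16      # quantization step of the stand-in lossy codec (assumption)
MAGNITUDE = 4  # correction applied per binarized residual value (assumption)

def lossy_compress(frame):
    # Stand-in for a lossy codec: keep only the quantization bucket.
    return bytes(v // STEP for v in frame)

def lossy_decompress(data):
    # Reconstruct each pixel as the centre of its bucket.
    return [q * STEP + STEP // 2 for q in data]

def pack_bits(bits):
    out = bytearray((len(bits) + 7) // 8)
    for i, b in enumerate(bits):
        if b:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def unpack_bits(data, count):
    return [(data[i // 8] >> (i % 8)) & 1 for i in range(count)]

def encode_residual(original, decoded):
    # Residual = what the lossy round trip lost; binarize it to its sign,
    # then compress the packed bits for transmission.
    residual = [o - d for o, d in zip(original, decoded)]
    bits = [1 if r >= 0 else 0 for r in residual]
    return zlib.compress(pack_bits(bits))

def apply_residual(decoded, payload):
    # At the destination: recover the sign bits and nudge each pixel.
    bits = unpack_bits(zlib.decompress(payload), len(decoded))
    return [min(255, max(0, d + (MAGNITUDE if b else -MAGNITUDE)))
            for d, b in zip(decoded, bits)]

def total_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

frame = list(range(256))  # one row of 8-bit pixel values
decoded = lossy_decompress(lossy_compress(frame))
payload = encode_residual(frame, decoded)
restored = apply_residual(decoded, payload)
print(total_error(frame, decoded), total_error(frame, restored))
```

The one-bit sign residual costs far less than the raw residual, and in this toy setup it still reduces the reconstruction error of the decompressed frame.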
-
Publication No.: US11270161B2
Publication date: 2022-03-08
Application No.: US16924005
Filing date: 2020-07-08
Applicant: NVIDIA Corporation
Inventor: Orazio Gallo , Jinwei Gu , Jan Kautz , Patrick Wieschollek
Abstract: When a computer image is generated from a real-world scene having a semi-reflective surface (e.g., a window), the computer image will create, at the semi-reflective surface from the viewpoint of the camera, both a reflection of a scene in front of the semi-reflective surface and a transmission of a scene located behind the semi-reflective surface. Just as when a person views the real-world scene from different locations, angles, etc., the reflection and transmission may change, and also move relative to each other, as the viewpoint of the camera changes. Unfortunately, the dynamic nature of the reflection and transmission negatively impacts the performance of many computer applications, but performance can generally be improved if the reflection and transmission are separated. The present disclosure uses deep learning to separate reflection and transmission at a semi-reflective surface of a computer image generated from a real-world scene.
-
Publication No.: US20220036635A1
Publication date: 2022-02-03
Application No.: US16945455
Filing date: 2020-07-31
Applicant: NVIDIA Corporation
Inventor: Xueting Li , Sifei Liu , Kihwan Kim , Shalini De Mello , Jan Kautz
Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object construction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.