-
Publication No.: US11941719B2
Publication date: 2024-03-26
Application No.: US16255038
Application date: 2019-01-23
Applicant: Nvidia Corporation
Inventor: Jonathan Tremblay, Stan Birchfield, Stephen Tyree, Thang To, Jan Kautz, Artem Molchanov
CPC classification number: G06T1/0014, B25J9/161, B25J9/1661, B25J9/1697, G05B13/00, G06N3/08, G06T7/73, G05D1/0088, G05D1/0221, G05D2201/0213, G06T2207/20081, G06T2207/20084
Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
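The data flow in this abstract (percepts as observed object relationships, a plan with one action per relationship, and a review step before execution) can be illustrated with a minimal sketch. The patent infers percepts and actions with neural networks; the rule-based mapping below is only a stand-in, and `Percept`, `Action`, `RELATION_TO_VERB`, `generate_plan`, and `apply_corrections` are hypothetical names that do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Percept:
    """An observed relationship between two objects (hypothetical structure)."""
    subject: str
    relation: str
    target: str


@dataclass(frozen=True)
class Action:
    """One plan step derived from one observed relationship."""
    verb: str
    subject: str
    target: str


# Assumed lookup table standing in for the learned plan generation network.
RELATION_TO_VERB = {"on": "place_on", "in": "put_in"}


def generate_plan(percepts):
    """Map each recognized relationship to exactly one action."""
    plan = []
    for p in percepts:
        verb = RELATION_TO_VERB.get(p.relation)
        if verb is None:
            continue  # relationship with no known action is skipped
        plan.append(Action(verb, p.subject, p.target))
    return plan


def apply_corrections(plan, corrections):
    """Replace plan steps by index, modeling the manual review step."""
    return [corrections.get(i, action) for i, action in enumerate(plan)]


percepts = [Percept("red_cube", "on", "blue_cube")]
plan = generate_plan(percepts)
reviewed = apply_corrections(plan, {})  # no corrections needed here
```

Only the reviewed plan would be handed to the execution stage, which is left out of this sketch.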
-
Publication No.: US20210326678A1
Publication date: 2021-10-21
Application No.: US17356140
Application date: 2021-06-23
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy, Alexey Kamenev, Stan Birchfield
Abstract: Various examples of the present disclosure include a stereoscopic deep neural network (DNN) that produces accurate and reliable results in real-time. Both LIDAR data (supervised training) and photometric error (unsupervised training) may be used to train the DNN in a semi-supervised manner. The stereoscopic DNN may use an exponential linear unit (ELU) activation function to increase processing speeds, as well as a machine learned argmax function that may include a plurality of convolutional layers having trainable parameters to account for context. The stereoscopic DNN may further include layers having an encoder/decoder architecture, where the encoder portion of the layers may include a combination of three-dimensional convolutional layers followed by two-dimensional convolutional layers.
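As a rough illustration of two ideas named in this abstract, the sketch below shows the standard ELU activation and one possible way to combine a LIDAR-supervised term with a photometric-error term. The abstract does not give the loss formula; the mean-absolute-error form, the weights `w_sup` and `w_photo`, and the function names are assumptions made for this example.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha*(exp(x)-1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def semi_supervised_loss(pred_depth, lidar_depth, left, warped_right,
                         w_sup=1.0, w_photo=1.0):
    """Weighted sum of a supervised and an unsupervised term (assumed form).

    pred_depth   : predicted depth per pixel
    lidar_depth  : LIDAR depth per pixel, None where there is no return
    left         : left-image intensities
    warped_right : right-image intensities warped into the left view
    """
    # Supervised term: only pixels with a LIDAR measurement contribute.
    sup_terms = [abs(p - g) for p, g in zip(pred_depth, lidar_depth)
                 if g is not None]
    supervised = sum(sup_terms) / len(sup_terms) if sup_terms else 0.0

    # Unsupervised term: photometric error between left and warped right image.
    photometric = sum(abs(a - b) for a, b in zip(left, warped_right)) / len(left)

    return w_sup * supervised + w_photo * photometric


loss = semi_supervised_loss([2.0, 4.0], [1.0, None], [0.5, 0.5], [0.5, 0.0])
```

Because LIDAR returns are sparse, the supervised term covers only some pixels, while the photometric term covers every pixel; this is why the two signals complement each other.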
-
Publication No.: US20210118166A1
Publication date: 2021-04-22
Application No.: US16657220
Application date: 2019-10-18
Applicant: Nvidia Corporation
Inventor: Jonathan Tremblay, Stan Birchfield, Timothy Lee
Abstract: Apparatuses, systems, and techniques are presented to determine a pose of an object. In at least one embodiment, a network is trained to predict a pose of an autonomous object based, at least in part, on only one image of the autonomous object.
-
Publication No.: US12039436B2
Publication date: 2024-07-16
Application No.: US18160694
Application date: 2023-01-27
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy, Alexey Kamenev, Stan Birchfield
IPC: G06T7/50, G01S17/86, G01S17/89, G06F18/22, G06N3/02, G06N3/045, G06N3/048, G06N3/063, G06N3/084, G06N3/088, G06T1/20, G06T7/593
CPC classification number: G06N3/063, G01S17/86, G01S17/89, G06F18/22, G06N3/045, G06N3/048, G06N3/084, G06N3/088, G06T1/20, G06T7/593, G06T2207/10012, G06T2207/10052, G06T2207/20084
Abstract: Various examples of the present disclosure include a stereoscopic deep neural network (DNN) that produces accurate and reliable results in real-time. Both LIDAR data (supervised training) and photometric error (unsupervised training) may be used to train the DNN in a semi-supervised manner. The stereoscopic DNN may use an exponential linear unit (ELU) activation function to increase processing speeds, as well as a machine learned argmax function that may include a plurality of convolutional layers having trainable parameters to account for context. The stereoscopic DNN may further include layers having an encoder/decoder architecture, where the encoder portion of the layers may include a combination of three-dimensional convolutional layers followed by two-dimensional convolutional layers.
-
Publication No.: US20230281847A1
Publication date: 2023-09-07
Application No.: US17592096
Application date: 2022-02-03
Applicant: NVIDIA Corporation
Inventor: Yiran Zhong, Charles Loop, Nikolai Smolyanskiy, Ke Chen, Stan Birchfield, Alexander Popov
CPC classification number: G06T7/55, G06T7/70, G06V10/462, G06T2207/20081, G06T2207/30252
Abstract: In various examples, methods and systems are provided for estimating depth values for images (e.g., from a monocular sequence). Disclosed approaches may define a search space of potential pixel matches between two images using one or more depth hypothesis planes based at least on a camera pose associated with one or more cameras used to generate the images. A machine learning model(s) may use this search space to predict likelihoods of correspondence between one or more pixels in the images. The predicted likelihoods may be used to compute depth values for one or more of the images. The predicted depth values may be transmitted and used by a machine to perform one or more operations.
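The search space described in this abstract can be sketched with a pinhole camera model: each depth hypothesis for a pixel in the first image, combined with the relative camera pose, gives one candidate match location in the second image. The code below is a minimal sketch under that assumption. The intrinsics (`fx`, `fy`, `cx`, `cy`), the pose (`R`, `t`), and all function names are hypothetical, and the likelihood-weighted average in `depth_from_likelihoods` is only one plausible way to turn predicted likelihoods into a depth value.

```python
def project_with_depth(u, v, depth, fx, fy, cx, cy, R, t):
    """Project pixel (u, v) at a hypothesized depth into the second camera."""
    # Back-project the pixel to a 3D point in the first camera's frame.
    X = (u - cx) / fx * depth
    Y = (v - cy) / fy * depth
    Z = depth
    # Move the point into the second camera's frame using the relative pose.
    Xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
    Yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
    Zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
    # Perspective projection onto the second image plane.
    return (fx * Xc / Zc + cx, fy * Yc / Zc + cy)


def candidate_matches(u, v, depths, fx, fy, cx, cy, R, t):
    """One candidate location in the second image per depth hypothesis."""
    return [project_with_depth(u, v, d, fx, fy, cx, cy, R, t) for d in depths]


def depth_from_likelihoods(depths, likelihoods):
    """Likelihood-weighted average of the depth hypotheses (assumed rule)."""
    total = sum(likelihoods)
    return sum(d * p for d, p in zip(depths, likelihoods)) / total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matches = candidate_matches(50.0, 50.0, [1.0, 2.0],
                            fx=100.0, fy=100.0, cx=50.0, cy=50.0,
                            R=identity, t=[-0.1, 0.0, 0.0])
```

With a sideways camera translation, nearer hypotheses shift the candidate location further than distant ones, which is what lets a model score the hypotheses against each other.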
-
Publication No.: US20210326694A1
Publication date: 2021-10-21
Application No.: US16852944
Application date: 2020-04-20
Applicant: Nvidia Corporation
Inventor: Jialiang Wang, Varun Jampani, Stan Birchfield, Charles Loop, Jan Kautz
Abstract: Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
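The abstract names a gradient loss term without defining it. One common reading, assumed here purely for illustration, is to penalize differences between the spatial gradients of the predicted and reference disparity, which encourages sharp edges at object boundaries. The function names and the one-dimensional scanline setup are hypothetical; the occlusion loss term is not sketched because the abstract gives no detail on it.

```python
def horizontal_gradient(row):
    """Forward differences along one scanline of a disparity map."""
    return [b - a for a, b in zip(row, row[1:])]


def gradient_loss(pred_row, ref_row):
    """Mean absolute difference between disparity gradients (assumed form)."""
    pred_grad = horizontal_gradient(pred_row)
    ref_grad = horizontal_gradient(ref_row)
    diffs = [abs(p - r) for p, r in zip(pred_grad, ref_grad)]
    return sum(diffs) / len(diffs)


# Predicted and reference disparities along one scanline.
loss = gradient_loss([0.0, 1.0, 3.0], [0.0, 1.0, 2.0])
```

A term like this would be added to the main disparity error rather than replace it, since a constant offset in disparity leaves the gradients unchanged.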
-
Publication No.: US11080590B2
Publication date: 2021-08-03
Application No.: US16356439
Application date: 2019-03-18
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy, Alexey Kamenev, Stan Birchfield
IPC: G06N3/04, G06N3/08, G06N3/10, G01S17/00, G06T7/00, G06T7/593, G06T1/20, G06K9/62, G06N3/063, G01S17/86, G01S17/89
Abstract: Various examples of the present disclosure include a stereoscopic deep neural network (DNN) that produces accurate and reliable results in real-time. Both LIDAR data (supervised training) and photometric error (unsupervised training) may be used to train the DNN in a semi-supervised manner. The stereoscopic DNN may use an exponential linear unit (ELU) activation function to increase processing speeds, as well as a machine learned argmax function that may include a plurality of convolutional layers having trainable parameters to account for context. The stereoscopic DNN may further include layers having an encoder/decoder architecture, where the encoder portion of the layers may include a combination of three-dimensional convolutional layers followed by two-dimensional convolutional layers.
-
Publication No.: US20190228495A1
Publication date: 2019-07-25
Application No.: US16255038
Application date: 2019-01-23
Applicant: Nvidia Corporation
Inventor: Jonathan Tremblay, Stan Birchfield, Stephen Tyree, Thang To, Jan Kautz, Artem Molchanov
Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
-
Publication No.: US20230169321A1
Publication date: 2023-06-01
Application No.: US18160694
Application date: 2023-01-27
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy, Alexey Kamenev, Stan Birchfield
IPC: G06N3/063, G06T7/593, G06N3/084, G06N3/088, G06T1/20, G01S17/86, G01S17/89, G06F18/22, G06N3/045, G06N3/048
CPC classification number: G06N3/063, G06T7/593, G06N3/084, G06N3/088, G06T1/20, G01S17/86, G01S17/89, G06F18/22, G06N3/045, G06N3/048, G06T2207/10052, G06T2207/20084, G06T2207/10012
Abstract: Various examples of the present disclosure include a stereoscopic deep neural network (DNN) that produces accurate and reliable results in real-time. Both LIDAR data (supervised training) and photometric error (unsupervised training) may be used to train the DNN in a semi-supervised manner. The stereoscopic DNN may use an exponential linear unit (ELU) activation function to increase processing speeds, as well as a machine learned argmax function that may include a plurality of convolutional layers having trainable parameters to account for context. The stereoscopic DNN may further include layers having an encoder/decoder architecture, where the encoder portion of the layers may include a combination of three-dimensional convolutional layers followed by two-dimensional convolutional layers.
-
Publication No.: US11604967B2
Publication date: 2023-03-14
Application No.: US17356140
Application date: 2021-06-23
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy, Alexey Kamenev, Stan Birchfield
IPC: G01S17/88, G01S17/894, G06N3/02, G06N3/084, G06T7/50, G06T7/80, G06N3/04, G06T7/593, G06N3/088, G06T1/20, G06K9/62, G06N3/063, G01S17/86, G01S17/89
Abstract: Various examples of the present disclosure include a stereoscopic deep neural network (DNN) that produces accurate and reliable results in real-time. Both LIDAR data (supervised training) and photometric error (unsupervised training) may be used to train the DNN in a semi-supervised manner. The stereoscopic DNN may use an exponential linear unit (ELU) activation function to increase processing speeds, as well as a machine learned argmax function that may include a plurality of convolutional layers having trainable parameters to account for context. The stereoscopic DNN may further include layers having an encoder/decoder architecture, where the encoder portion of the layers may include a combination of three-dimensional convolutional layers followed by two-dimensional convolutional layers.