-
Publication No.: US12141986B2
Publication date: 2024-11-12
Application No.: US18333166
Filing date: 2023-06-12
Applicant: Nvidia Corporation
Inventor: David Jesus Acuna Marrero , Towaki Takikawa , Varun Jampani , Sanja Fidler
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher-level activations to gate lower-level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduce any additional weight of the shape stream.
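The gating idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the scalar per-position features, the weights `w_shape` and `w_primary`, the `bias`, and the choice of a sigmoid attention value are all assumptions made for illustration. The sketch only shows how a higher-level activation from the primary stream could scale a lower-level activation in the shape stream.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_shape_features(shape_feats, primary_feats, w_shape, w_primary, bias):
    """Scale each shape-stream activation by an attention value in (0, 1).

    The attention value is computed from both streams. All names and the
    scalar weights are hypothetical stand-ins for learned layers.
    """
    gated = []
    for s, p in zip(shape_feats, primary_feats):
        alpha = sigmoid(w_shape * s + w_primary * p + bias)
        gated.append(alpha * s)
    return gated


# Three positions with similar shape activations but different
# primary-stream evidence (strong, neutral, strongly negative).
shape_feats = [0.9, 0.8, 0.7]
primary_feats = [4.0, 0.0, -4.0]
gated = gate_shape_features(
    shape_feats, primary_feats, w_shape=0.0, w_primary=1.0, bias=0.0
)
```

In this toy setup the first position passes almost unchanged, the second is halved, and the third is suppressed, which is the sense in which a gate can focus the shape stream on relevant information.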
-
Publication No.: US11715251B2
Publication date: 2023-08-01
Application No.: US17507620
Filing date: 2021-10-21
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Aayush Prakash , Mark A. Brophy , Varun Jampani , Cem Anil , Stanley Thomas Birchfield , Thang Hong To , David Jesus Acuna Marrero
IPC: G06T15/00 , G06T15/04 , G06T15/50 , G06T15/20 , G06F18/214 , G06F18/211 , G06V10/774 , G06V10/82 , G06N3/04 , G06N3/084
CPC classification number: G06T15/00 , G06F18/211 , G06F18/2148 , G06T15/04 , G06T15/20 , G06T15/50 , G06V10/7747 , G06V10/82 , G06N3/04 , G06N3/084 , G06T2210/12 , G06V2201/07
Abstract: Training deep neural networks requires a large amount of labeled training data. Conventionally, labeled training data is generated by gathering real images that are manually labelled, which is very time-consuming. Instead of manually labelling a training dataset, a domain randomization technique is used to generate training data that is automatically labeled. The generated training data may be used to train neural networks for object detection and segmentation (labelling) tasks. In an embodiment, the generated training data includes synthetic input images generated by rendering three-dimensional (3D) objects of interest in a 3D scene. In an embodiment, the generated training data includes synthetic input images generated by rendering 3D objects of interest on a 2D background image. The 3D objects of interest are objects that a neural network is trained to detect and/or label.
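A minimal sketch of why domain randomization yields labels for free: because the generator chooses where the object is placed, the bounding box is known without manual annotation. The function name, parameter ranges, and dictionary keys below are hypothetical, and no rendering is performed; a real pipeline would pass the sampled scene parameters to a renderer.

```python
import random


def sample_randomized_example(rng, image_size=(640, 480), num_backgrounds=10):
    """Sample random scene parameters and the label that follows from them."""
    width, height = image_size
    # Random object extent and placement, kept inside the image bounds.
    obj_w = rng.randint(20, width // 2)
    obj_h = rng.randint(20, height // 2)
    x0 = rng.randint(0, width - obj_w)
    y0 = rng.randint(0, height - obj_h)
    # Randomized nuisance factors (illustrative choices only).
    scene = {
        "background_id": rng.randrange(num_backgrounds),
        "light_intensity": rng.uniform(0.2, 2.0),
        "object_rotation_deg": rng.uniform(0.0, 360.0),
    }
    # The label is derived from the placement, not from human annotation.
    label = {
        "bbox": (x0, y0, x0 + obj_w, y0 + obj_h),
        "class": "object_of_interest",
    }
    return scene, label


rng = random.Random(0)
dataset = [sample_randomized_example(rng) for _ in range(100)]
```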
-
Publication No.: US20220391766A1
Publication date: 2022-12-08
Application No.: US17827390
Filing date: 2022-05-27
Applicant: NVIDIA Corporation
Inventor: David Jesus Acuna Marrero , Sanja Fidler , Jonah Philion
IPC: G06N20/00
Abstract: In various examples, systems and methods are disclosed that use a domain-adaptation theory to minimize the reality gap between simulated and real-world domains for training machine learning models. For example, sampling of spatial priors may be used to generate synthetic data that more closely matches the diversity of data from the real world. To train models using this synthetic data that still perform well in the real world, the systems and methods of the present disclosure may use a discriminator that allows a model to learn domain-invariant representations to minimize the divergence between the virtual world and the real world in a latent space. As such, the techniques described herein allow for a principled approach to learn neural-invariant representations and a theoretically inspired approach to sampling data from a simulator that, in combination, allow for training of machine learning models using synthetic data.
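The role of the discriminator can be illustrated with a small sketch, under the assumption of a logistic domain discriminator on one-dimensional latent features (the abstract does not specify the discriminator's form). When simulated and real features are easy to tell apart, the discriminator's cross-entropy loss is low; when the two feature sets coincide, no discriminator can do better than chance, so the loss is at least log 2. The function name and the values of `w` and `b` are hypothetical.

```python
import math


def discriminator_loss(sim_feats, real_feats, w, b):
    """Mean binary cross-entropy of a logistic domain discriminator.

    Simulated features carry label 0 and real features label 1.
    """
    def p_real(z):
        return 1.0 / (1.0 + math.exp(-(w * z + b)))

    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for z in sim_feats:
        loss -= math.log(1.0 - p_real(z) + eps)
    for z in real_feats:
        loss -= math.log(p_real(z) + eps)
    return loss / (len(sim_feats) + len(real_feats))


# Domains that are well separated in latent space: easy to discriminate.
separable = discriminator_loss([-2.0, -1.5], [1.5, 2.0], w=3.0, b=0.0)
# Identical latent features in both domains: the discriminator is at chance.
invariant = discriminator_loss([0.1, -0.1], [0.1, -0.1], w=3.0, b=0.0)
```

A feature extractor trained adversarially would push its representations toward the second case. That training loop is outside the scope of this sketch.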
-
Publication No.: US20230342941A1
Publication date: 2023-10-26
Application No.: US18333166
Filing date: 2023-06-12
Applicant: Nvidia Corporation
Inventor: David Jesus Acuna Marrero , Towaki Takikawa , Varun Jampani , Sanja Fidler
CPC classification number: G06T7/12 , G06V20/56 , G06F18/253 , G06V10/764 , G06V10/806 , G06V10/82 , G06V10/454 , G06V10/255 , G06T2207/30252 , G06T2207/20084 , G06T2207/20081
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher-level activations to gate lower-level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduce any additional weight of the shape stream.
-
Publication No.: US20220391781A1
Publication date: 2022-12-08
Application No.: US17827446
Filing date: 2022-05-27
Applicant: NVIDIA Corporation
Inventor: Or Litany , Haggai Maron , David Jesus Acuna Marrero , Jan Kautz , Sanja Fidler , Gal Chechik
Abstract: A method performed by a server is provided. The method comprises sending copies of a set of parameters of a hyper network (HN) to at least one client device, receiving from each client device in the at least one client device, a corresponding set of updated parameters of the HN, and determining a next set of parameters of the HN based on the corresponding sets of updated parameters received from the at least one client device. Each client device generates the corresponding set of updated parameters based on a local model architecture of the client device.
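One server round of the method described in this abstract might look like the following sketch. The abstract says only that the next parameters are determined "based on" the clients' updates; the element-wise average used here is an assumption, as are the names `server_round` and `make_client` and the toy client update.

```python
def server_round(hn_params, clients):
    """Send a copy of the hypernetwork parameters to each client,
    collect the updated parameters, and combine them.

    Averaging is an illustrative choice, not taken from the source.
    """
    updates = [client(list(hn_params)) for client in clients]
    return [sum(values) / len(updates) for values in zip(*updates)]


def make_client(shift):
    """Hypothetical client whose local training adds a constant shift.

    A real client would update the parameters using its own local
    model architecture and data.
    """
    def local_update(params):
        return [p + shift for p in params]
    return local_update


params = [0.0, 1.0]
next_params = server_round(params, [make_client(0.5), make_client(1.5)])
```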
-
Publication No.: US20210097346A1
Publication date: 2021-04-01
Application No.: US17119971
Filing date: 2020-12-11
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Aayush Prakash , Mark A. Brophy , Varun Jampani , Cem Anil , Stanley Thomas Birchfield , Thang Hong To , David Jesus Acuna Marrero
Abstract: Training deep neural networks requires a large amount of labeled training data. Conventionally, labeled training data is generated by gathering real images that are manually labelled, which is very time-consuming. Instead of manually labelling a training dataset, a domain randomization technique is used to generate training data that is automatically labeled. The generated training data may be used to train neural networks for object detection and segmentation (labelling) tasks. In an embodiment, the generated training data includes synthetic input images generated by rendering three-dimensional (3D) objects of interest in a 3D scene. In an embodiment, the generated training data includes synthetic input images generated by rendering 3D objects of interest on a 2D background image. The 3D objects of interest are objects that a neural network is trained to detect and/or label.
-
Publication No.: US20200160178A1
Publication date: 2020-05-21
Application No.: US16685795
Filing date: 2019-11-15
Applicant: NVIDIA Corporation
Inventor: Amlan Kar , Aayush Prakash , Ming-Yu Liu , David Jesus Acuna Marrero , Antonio Torralba Barriuso , Sanja Fidler
IPC: G06N3/08 , G06F16/901 , G06N3/04 , G06T11/60
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
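The first step in this abstract, sampling a scene graph from a probabilistic grammar, can be sketched as follows. The grammar, its symbols, probabilities, and attribute ranges are all invented for illustration. The later steps (refining the graph with a generative model and validating a downstream model on real data) are not shown.

```python
import random

# Hypothetical probabilistic scene grammar: each non-terminal maps to a
# list of (probability, expansion) rules. Symbols without rules are
# terminals (objects) that receive sampled attributes.
GRAMMAR = {
    "Scene": [(1.0, ["Road", "Sidewalk"])],
    "Road": [(0.7, ["Car", "Car"]), (0.3, ["Car"])],
    "Sidewalk": [(0.5, ["Pedestrian"]), (0.5, [])],
}


def sample_scene_graph(symbol, rng):
    """Recursively expand a symbol into a scene-graph node."""
    node = {"type": symbol, "children": []}
    rules = GRAMMAR.get(symbol)
    if rules is None:
        # Terminal object: sample its attributes from fixed priors.
        node["attributes"] = {
            "x": rng.uniform(-10.0, 10.0),
            "heading_deg": rng.uniform(0.0, 360.0),
        }
        return node
    weights = [weight for weight, _ in rules]
    _, expansion = rng.choices(rules, weights=weights, k=1)[0]
    node["children"] = [sample_scene_graph(child, rng) for child in expansion]
    return node


graph = sample_scene_graph("Scene", random.Random(0))
```

In the approach the abstract describes, the attributes of such a sampled graph would then be updated by the generative model so that they better match real-world attribute distributions.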
-
Publication No.: US20250139783A1
Publication date: 2025-05-01
Application No.: US18943472
Filing date: 2024-11-11
Applicant: Nvidia Corporation
Inventor: David Jesus Acuna Marrero , Towaki Takikawa , Varun Jampani , Sanja Fidler
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher-level activations to gate lower-level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduce any additional weight of the shape stream.
-
Publication No.: US20230229919A1
Publication date: 2023-07-20
Application No.: US18186696
Filing date: 2023-03-20
Applicant: NVIDIA Corporation
Inventor: Amlan Kar , Aayush Prakash , Ming-Yu Liu , David Jesus Acuna Marrero , Antonio Torralba Barriuso , Sanja Fidler
IPC: G06N3/08 , G06F16/901 , G06T11/60 , G06N3/045 , G06V10/764 , G06V10/774 , G06V10/82 , G06V10/426
CPC classification number: G06N3/08 , G06F16/9024 , G06T11/60 , G06N3/045 , G06V10/764 , G06V10/774 , G06V10/82 , G06V10/426 , G06T2210/61
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar, such as a probabilistic grammar, and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
-
Publication No.: US11610115B2
Publication date: 2023-03-21
Application No.: US16685795
Filing date: 2019-11-15
Applicant: NVIDIA Corporation
Inventor: Amlan Kar , Aayush Prakash , Ming-Yu Liu , David Jesus Acuna Marrero , Antonio Torralba Barriuso , Sanja Fidler
IPC: G06N3/08 , G06F16/901 , G06T11/60 , G06N3/04
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.