-
Publication No.: US12141986B2
Publication date: 2024-11-12
Application No.: US18333166
Application date: 2023-06-12
Applicant: Nvidia Corporation
Inventor: David Jesus Acuna Marrero , Towaki Takikawa , Varun Jampani , Sanja Fidler
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher level activations to gate lower level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduces any additional weight of the shape stream.
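The gating described in this abstract can be illustrated with a minimal sketch. It assumes an element-wise sigmoid gate computed from both streams. The function name `gate_shape_features`, the scalar weights, and the flat feature lists are hypothetical and are not taken from the patent.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_shape_features(shape_feats, primary_feats,
                        w_shape=1.0, w_primary=1.0, bias=0.0):
    """Scale each shape-stream activation by a gate in (0, 1).

    Assumption: the gate is a sigmoid of a weighted sum of the shape
    activation and the higher-level primary-stream activation at the
    same position. A real network would learn these weights per channel.
    """
    if len(shape_feats) != len(primary_feats):
        raise ValueError("both streams must have the same number of features")
    gated = []
    for s, r in zip(shape_feats, primary_feats):
        alpha = sigmoid(w_shape * s + w_primary * r + bias)
        gated.append(s * alpha)  # low alpha suppresses the shape activation
    return gated
```

In this sketch, a strongly negative primary-stream activation drives the gate toward zero, so the corresponding shape activation is suppressed, which is one way to read "focus the shape stream on the relevant information".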
-
Publication No.: US11715251B2
Publication date: 2023-08-01
Application No.: US17507620
Application date: 2021-10-21
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Aayush Prakash , Mark A. Brophy , Varun Jampani , Cem Anil , Stanley Thomas Birchfield , Thang Hong To , David Jesus Acuna Marrero
IPC: G06T15/00 , G06T15/04 , G06T15/50 , G06T15/20 , G06F18/214 , G06F18/211 , G06V10/774 , G06V10/82 , G06N3/04 , G06N3/084
CPC classification number: G06T15/00 , G06F18/211 , G06F18/2148 , G06T15/04 , G06T15/20 , G06T15/50 , G06V10/7747 , G06V10/82 , G06N3/04 , G06N3/084 , G06T2210/12 , G06V2201/07
Abstract: Training deep neural networks requires a large amount of labeled training data. Conventionally, labeled training data is generated by gathering real images that are manually labelled, which is very time-consuming. Instead of manually labelling a training dataset, a domain randomization technique is used to generate training data that is automatically labeled. The generated training data may be used to train neural networks for object detection and segmentation (labelling) tasks. In an embodiment, the generated training data includes synthetic input images generated by rendering three-dimensional (3D) objects of interest in a 3D scene. In an embodiment, the generated training data includes synthetic input images generated by rendering 3D objects of interest on a 2D background image. The 3D objects of interest are objects that a neural network is trained to detect and/or label.
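A toy sketch of the idea follows: because the generator places the object itself, the label comes for free. It assumes a 2D grey-level image and a rectangular stand-in for the rendered object; `make_sample` and all of its parameters are hypothetical, and no real 3D rendering is performed.

```python
import random


def make_sample(rng, width=32, height=24, obj_w=6, obj_h=4):
    """Return (image, label) for one randomized synthetic sample.

    Assumption: the background is random noise in 0..100 and the object
    is a flat rectangle in 200..255, standing in for a rendered 3D object.
    """
    # Randomized background: one random grey level per pixel.
    image = [[rng.randint(0, 100) for _ in range(width)] for _ in range(height)]

    # Randomized object position and appearance.
    x0 = rng.randint(0, width - obj_w)
    y0 = rng.randint(0, height - obj_h)
    obj_value = rng.randint(200, 255)
    for y in range(y0, y0 + obj_h):
        for x in range(x0, x0 + obj_w):
            image[y][x] = obj_value

    # The bounding box is known exactly, so no manual labelling is needed.
    label = {"class": "object", "bbox": (x0, y0, x0 + obj_w, y0 + obj_h)}
    return image, label
```

Calling `make_sample(random.Random(0))` repeatedly with one generator yields differently placed objects, each paired with an exact bounding box.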
-
Publication No.: US11328173B2
Publication date: 2022-05-10
Application No.: US17081805
Application date: 2020-10-27
Applicant: NVIDIA Corporation
Inventor: Sifei Liu , Shalini De Mello , Jinwei Gu , Varun Jampani , Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color), to another frame that is represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix referred to as a global transformation matrix from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
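The propagation step can be sketched as a matrix applied to the key-frame property data. The sketch below assumes a dense affinity matrix with one row per target pixel and normalizes each row so the result is a weighted average. The function name `propagate` and the row normalization are assumptions for illustration, not the patented module.

```python
def propagate(affinity, key_props):
    """Propagate per-pixel properties from a key-frame to another frame.

    affinity[i][j] is the (assumed non-negative) similarity between
    target pixel i and key-frame pixel j. Each output value is the
    affinity-weighted average of the key-frame properties.
    """
    propagated = []
    for row in affinity:
        total = sum(row)
        if total == 0:
            raise ValueError("each target pixel needs a nonzero affinity")
        weighted = sum(w * p for w, p in zip(row, key_props))
        propagated.append(weighted / total)
    return propagated
```

With `affinity = [[1.0, 0.0], [1.0, 1.0]]` and key-frame values `[10.0, 20.0]`, the first target pixel copies the first key-frame value and the second receives the average of both.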
-
Publication No.: US20200334502A1
Publication date: 2020-10-22
Application No.: US16921012
Application date: 2020-07-06
Applicant: NVIDIA Corporation
Inventor: Wei-Chih Tu , Ming-Yu Liu , Varun Jampani , Deqing Sun , Ming-Hsuan Yang , Jan Kautz
Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
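One simple way to turn horizontal and vertical affinities into superpixels is to merge neighbouring pixels whose affinity exceeds a threshold. The sketch below does this with a union-find structure. The thresholding rule, the function name `superpixels_from_affinity`, and the array layout are assumptions; the abstract does not say how the affinity map is processed.

```python
def superpixels_from_affinity(h_aff, v_aff, height, width, threshold=0.5):
    """Group pixels into superpixels from neighbour affinities.

    h_aff[y][x] is the affinity between (y, x) and (y, x + 1).
    v_aff[y][x] is the affinity between (y, x) and (y + 1, x).
    Returns a height x width grid of integer superpixel labels.
    """
    parent = list(range(height * width))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for y in range(height):
        for x in range(width):
            idx = y * width + x
            if x + 1 < width and h_aff[y][x] >= threshold:
                union(idx, idx + 1)
            if y + 1 < height and v_aff[y][x] >= threshold:
                union(idx, idx + width)

    # Relabel roots with consecutive ids in raster order.
    labels = {}
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            root = find(y * width + x)
            if root not in labels:
                labels[root] = len(labels)
            row.append(labels[root])
        result.append(row)
    return result
```

A low horizontal affinity between two columns acts as a boundary, so the pixels on either side end up with different labels.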
-
Publication No.: US10748036B2
Publication date: 2020-08-18
Application No.: US16188641
Application date: 2018-11-13
Applicant: NVIDIA Corporation
Inventor: Wei-Chih Tu , Ming-Yu Liu , Varun Jampani , Deqing Sun , Ming-Hsuan Yang , Jan Kautz
Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
-
Publication No.: US12266144B2
Publication date: 2025-04-01
Application No.: US16690015
Application date: 2019-11-20
Applicant: NVIDIA Corporation
Inventor: Siva Karthik Mustikovela , Varun Jampani , Shalini De Mello , Sifei Liu , Umar Iqbal , Jan Kautz
IPC: G06V10/24 , G06F18/21 , G06F18/214 , G06N3/045 , G06N3/08 , G06T7/73 , G06V10/44 , G06V10/764 , G06V10/778 , G06V10/82 , G06V20/56
Abstract: Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify orientations of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.
-
Publication No.: US12182940B2
Publication date: 2024-12-31
Application No.: US17578051
Application date: 2022-01-18
Applicant: NVIDIA Corporation
Inventor: Xueting Li , Sifei Liu , Kihwan Kim , Shalini De Mello , Varun Jampani , Jan Kautz
IPC: G06T15/00 , G06F18/21 , G06T7/40 , G06T7/73 , G06T17/20 , G06V10/26 , G06V10/776 , G06V10/82 , G06V20/64
Abstract: Apparatuses, systems, and techniques to identify a shape or camera pose of a three-dimensional object from a two-dimensional image of the object. In at least one embodiment, objects are identified in an image using one or more neural networks that have been trained on objects of a similar category and a three-dimensional mesh template.
-
Publication No.: US20230342941A1
Publication date: 2023-10-26
Application No.: US18333166
Application date: 2023-06-12
Applicant: Nvidia Corporation
Inventor: David Jesus Acuna Marrero , Towaki Takikawa , Varun Jampani , Sanja Fidler
CPC classification number: G06T7/12 , G06V20/56 , G06F18/253 , G06V10/764 , G06V10/806 , G06V10/82 , G06V10/454 , G06V10/255 , G06T2207/30252 , G06T2207/20084 , G06T2207/20081
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher level activations to gate lower level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduces any additional weight of the shape stream.
-
Publication No.: US20210326694A1
Publication date: 2021-10-21
Application No.: US16852944
Application date: 2020-04-20
Applicant: Nvidia Corporation
Inventor: Jialiang Wang , Varun Jampani , Stan Birchfield , Charles Loop , Jan Kautz
Abstract: Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
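The abstract does not say how disparity is turned into distance. For a rectified stereo pair, the standard pinhole relation is depth = focal length × baseline / disparity. The sketch below shows only that conversion, not the patented network or its loss terms; the function name and parameters are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a disparity in pixels to a depth in metres.

    Assumes a rectified stereo pair with focal length in pixels and
    camera baseline in metres. Zero or negative disparity is rejected
    because it corresponds to a point at or beyond infinity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, a fixed disparity error matters far more for distant objects than for near ones.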
-
Publication No.: US10986325B2
Publication date: 2021-04-20
Application No.: US16569104
Application date: 2019-09-12
Applicant: NVIDIA Corporation
Inventor: Deqing Sun , Varun Jampani , Erik Gundersen Learned-Miller , Huaizu Jiang
IPC: H04N13/122 , H04N13/128 , G06N3/08 , H04N13/00
Abstract: Scene flow represents the three-dimensional (3D) structure and movement of objects in a video sequence in three dimensions from frame-to-frame and is used to track objects and estimate speeds for autonomous driving applications. Scene flow is recovered by a neural network system from a video sequence captured from at least two viewpoints (e.g., cameras), such as a left-eye and right-eye of a viewer. An encoder portion of the system extracts features from frames of the video sequence. The features are input to a first decoder to predict optical flow and a second decoder to predict disparity. The optical flow represents pixel movement in (x,y) and the disparity represents pixel movement in z (depth). When combined, the optical flow and disparity represent the scene flow.
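How optical flow and disparity combine into a 3D motion vector can be sketched for a single pixel. The example assumes a rectified stereo rig with a pinhole camera model. The names `backproject` and `scene_flow` and the camera parameters are hypothetical and do not describe the patented decoders.

```python
def backproject(x, y, disparity, focal, baseline, cx=0.0, cy=0.0):
    """Lift pixel (x, y) with a given disparity to a 3D point (X, Y, Z)."""
    z = focal * baseline / disparity
    return ((x - cx) * z / focal, (y - cy) * z / focal, z)


def scene_flow(x, y, flow, disp_t0, disp_t1, focal, baseline, cx=0.0, cy=0.0):
    """Return the 3D motion of the point seen at pixel (x, y).

    flow is the optical flow (u, v) from frame t0 to frame t1.
    disp_t0 is the disparity at (x, y) in frame t0, and disp_t1 is the
    disparity at the flowed position (x + u, y + v) in frame t1.
    """
    u, v = flow
    p0 = backproject(x, y, disp_t0, focal, baseline, cx, cy)
    p1 = backproject(x + u, y + v, disp_t1, focal, baseline, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))
```

In this sketch, a point whose disparity doubles between frames has halved its depth, so the resulting motion vector has a negative Z component (movement toward the camera).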