-
Publication No.: US20230015989A1
Publication Date: 2023-01-19
Application No.: US17365877
Filing Date: 2021-07-01
Applicant: Nvidia Corporation
Inventor: Zhiding Yu , Rui Huang , Wonmin Byeon , Sifei Liu , Guilin Liu , Thomas Breuel , Anima Anandkumar , Jan Kautz
Abstract: The disclosure provides a learning framework that unifies semantic segmentation and semantic edge detection. A learnable recurrent message-passing layer is disclosed in which semantic edges act as explicitly learned gating signals that refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
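A minimal, hypothetical sketch of the edge-gated propagation idea described in this abstract, reduced to a single left-to-right pass over a 1-D row of features (the function name, the gating formula, and all values are illustrative, not taken from the patent):

```python
def edge_gated_propagation(features, affinity, edges):
    """One left-to-right smoothing pass over a 1-D row of features.

    features: list of floats (semantic feature value per pixel)
    affinity: list of floats in [0, 1] (pairwise smoothing strength)
    edges:    list of floats in [0, 1] (1.0 = strong semantic edge)
    """
    refined = list(features)
    for i in range(1, len(refined)):
        # Gate the affinity by the edge signal: a strong edge closes the
        # message path so smoothing does not cross object boundaries.
        gate = affinity[i] * (1.0 - edges[i])
        refined[i] = (1.0 - gate) * refined[i] + gate * refined[i - 1]
    return refined

row      = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
affinity = [0.8] * 6
edges    = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  # edge between pixels 2 and 3

print(edge_gated_propagation(row, affinity, edges))
# -> [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]  (the edge blocks smearing across the boundary)
```

With the edge value at pixel 3 set to 0.0 instead, the message path stays open and pixel 3 would blend toward its neighbour, illustrating why the edge map is needed as a gate.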
-
Publication No.: US20210279841A1
Publication Date: 2021-09-09
Application No.: US16813589
Filing Date: 2020-03-09
Applicant: NVIDIA Corporation
Inventor: Guilin Liu , Andrew Tao , Bryan Christopher Catanzaro , Ting-Chun Wang , Zhiding Yu , Shiqiu Liu , Fitsum Reda , Karan Sapra , Brandon Rowlett
Abstract: Apparatuses, systems, and techniques for texture synthesis from small input textures in images using convolutional neural networks. In at least one embodiment, one or more convolutional layers are used in conjunction with one or more transposed convolution operations to generate a large textured output image from a small input textured image while preserving global features and texture, according to various novel techniques described herein.
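A toy 1-D illustration of the transposed-convolution upsampling operation the abstract mentions as the mechanism for growing a small input into a larger output (kernel values and sizes are ours, purely illustrative):

```python
def transposed_conv1d(signal, kernel, stride=2):
    """Stride-2 transposed convolution: each input sample is spread across
    overlapping output positions, enlarging the spatial size."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, x in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i * stride + j] += x * w
    return out

small = [1.0, 2.0, 3.0]
print(transposed_conv1d(small, [0.5, 1.0, 0.5]))
# -> [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]  (3 samples grown to 7)
```

The output length follows the usual transposed-convolution formula `(n - 1) * stride + kernel_size`; overlapping kernel taps are summed, which is what lets the layer interpolate new texture detail between input samples.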
-
Publication No.: US20210067735A1
Publication Date: 2021-03-04
Application No.: US16559312
Filing Date: 2019-09-03
Applicant: Nvidia Corporation
Inventor: Fitsum Reda , Deqing Sun , Aysegul Dundar , Mohammad Shoeybi , Guilin Liu , Kevin Shih , Andrew Tao , Jan Kautz , Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
-
Publication No.: US20240095880A1
Publication Date: 2024-03-21
Application No.: US17948138
Filing Date: 2022-09-19
Applicant: NVIDIA Corporation
Inventor: Shiqiu Liu , Jussi Rasanen , Michael Ranzinger , Guilin Liu , Andrew Tao , Bryan Christopher Catanzaro
CPC classification number: G06T3/4046 , G06T5/002 , G06T5/50 , G06T2207/20081 , G06T2207/20084 , G06T2207/20212
Abstract: Apparatuses, systems, and techniques to use one or more neural networks to generate an upsampled version of one or more images based, at least in part, on a denoised version of said one or more images. At least one embodiment pertains to generating an upsampled high-resolution image from a noisy version and denoised version of a low-resolution image. At least one embodiment pertains to separating components of a low-resolution image before denoising an image.
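A hypothetical 1-D sketch of the idea of upsampling from both a noisy and a denoised version of an image: the denoised signal is upsampled and the residual detail that denoising removed is re-injected. The nearest-neighbour upsampling, the residual split, and the 0.5 attenuation factor are all our assumptions for illustration, not the patent's method:

```python
def upsample_with_denoised(noisy, denoised, scale=2, detail=0.5):
    """Upsample the denoised signal (nearest neighbour) and add back an
    attenuated copy of the residual the denoiser removed."""
    residual = [n - d for n, d in zip(noisy, denoised)]
    up = []
    for d, r in zip(denoised, residual):
        up.extend([d + detail * r] * scale)
    return up

noisy    = [1.0, 3.0]
denoised = [1.0, 2.0]
print(upsample_with_denoised(noisy, denoised))
# -> [1.0, 1.0, 2.5, 2.5]
```

Separating the signal into a smooth (denoised) component and a residual component before upsampling mirrors the abstract's note about splitting an image into components before denoising.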
-
Publication No.: US20220114700A1
Publication Date: 2022-04-14
Application No.: US17066282
Filing Date: 2020-10-08
Applicant: Nvidia Corporation
Inventor: Shiqiu Liu , Robert Pottorff , Guilin Liu , Karan Sapra , Jon Barker , David Tarjan , Pekka Janis , Edvard Fagerholm , Lei Yang , Kevin Shih , Marco Salvi , Timo Roman , Andrew Tao , Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.
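One standard way pixel weights can be derived from a sub-pixel offset is bilinear weighting of the four surrounding pixels; the sketch below shows that derivation as a plausible illustration of the kind of weighting the abstract refers to (it is not asserted to be the patented method):

```python
def bilinear_weights(dx, dy):
    """Weights for the four pixels surrounding a sample at sub-pixel offset
    (dx, dy), with 0 <= dx, dy < 1 measured from the top-left pixel centre.
    The four weights always sum to 1."""
    return [
        (1 - dx) * (1 - dy),  # top-left
        dx * (1 - dy),        # top-right
        (1 - dx) * dy,        # bottom-left
        dx * dy,              # bottom-right
    ]

print(bilinear_weights(0.25, 0.5))
# -> [0.375, 0.125, 0.375, 0.125]
```

A zero offset puts all the weight on the top-left pixel, while offsets near (0.5, 0.5) spread the contribution evenly, which is why jittered sub-pixel sample positions yield different per-pixel weights from frame to frame.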
-
Publication No.: US20190297326A1
Publication Date: 2019-09-26
Application No.: US16360853
Filing Date: 2019-03-21
Applicant: NVIDIA Corporation
Inventor: Fitsum A. Reda , Guilin Liu , Kevin Shih , Robert Kirby , Jonathan Barker , David Tarjan , Andrew Tao , Bryan Catanzaro
IPC: H04N19/139 , G06N3/08 , G06N20/10 , G06N3/04 , G06N20/20 , H04N19/587 , H04N19/132 , H04N19/172
Abstract: A neural network architecture is disclosed for performing video frame prediction using a sequence of video frames and corresponding pairwise optical flows. The neural network processes the sequence of video frames and optical flows utilizing three-dimensional convolution operations, where time (or multiple video frames in the sequence of video frames) provides the third dimension in addition to the two-dimensional pixel space of the video frames. The neural network generates a set of parameters used to predict a next video frame in the sequence of video frames by sampling a previous video frame utilizing spatially-displaced convolution operations. In one embodiment, the set of parameters includes a displacement vector and at least one convolution kernel per pixel. Generating a pixel value in the next video frame includes applying the convolution kernel to a corresponding patch of pixels in the previous video frame based on the displacement vector.
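An illustrative 1-D reduction of the spatially-displaced convolution step the abstract describes: each output sample applies a small per-pixel kernel to a patch of the previous frame centred at a per-pixel displaced location. Integer displacements, border clamping, and all sizes here are simplifications of ours:

```python
def sdc_1d(prev_frame, displacements, kernels):
    """Predict the next 1-D frame: for each position i, apply the per-pixel
    kernel kernels[i] to a patch of prev_frame centred at i + displacements[i]."""
    n = len(prev_frame)
    out = []
    for i in range(n):
        centre = i + displacements[i]   # integer displacement for simplicity
        k = kernels[i]                  # per-pixel kernel, odd length
        half = len(k) // 2
        val = 0.0
        for j, w in enumerate(k):
            src = min(max(centre + j - half, 0), n - 1)  # clamp at borders
            val += w * prev_frame[src]
        out.append(val)
    return out

# A uniform displacement of +1 with identity kernels simply shifts the frame,
# i.e. it predicts pure leftward motion of the scene content.
print(sdc_1d([1.0, 2.0, 3.0, 4.0, 5.0], [1] * 5, [[1.0]] * 5))
# -> [2.0, 3.0, 4.0, 5.0, 5.0]
```

With wider kernels the same mechanism blurs as well as displaces, matching the abstract's parameterisation of one displacement vector plus at least one convolution kernel per pixel.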
-
Publication No.: US11790633B2
Publication Date: 2023-10-17
Application No.: US17365877
Filing Date: 2021-07-01
Applicant: Nvidia Corporation
Inventor: Zhiding Yu , Rui Huang , Wonmin Byeon , Sifei Liu , Guilin Liu , Thomas Breuel , Anima Anandkumar , Jan Kautz
IPC: G06V10/50 , G06N3/04 , G06T7/13 , G06V10/75 , G06F18/2413
CPC classification number: G06V10/50 , G06F18/2413 , G06N3/04 , G06T7/13 , G06V10/758
Abstract: The disclosure provides a learning framework that unifies semantic segmentation and semantic edge detection. A learnable recurrent message-passing layer is disclosed in which semantic edges act as explicitly learned gating signals that refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
-
Publication No.: US20220114702A1
Publication Date: 2022-04-14
Application No.: US17406902
Filing Date: 2021-08-19
Applicant: Nvidia Corporation
Inventor: Shiqiu Liu , Robert Pottorff , Guilin Liu , Karan Sapra , Jon Barker , David Tarjan , Pekka Janis , Edvard Fagerholm , Lei Yang , Kevin Jonathan Shih , Marco Salvi , Timo Roman , Andrew Tao , Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights.
-
Publication No.: US20220114701A1
Publication Date: 2022-04-14
Application No.: US17172330
Filing Date: 2021-02-10
Applicant: Nvidia Corporation
Inventor: Shiqiu Liu , Robert Pottorff , Guilin Liu , Karan Sapra , Jon Barker , David Tarjan , Pekka Janis , Edvard Fagerholm , Lei Yang , Kevin Shih , Marco Salvi , Timo Roman , Andrew Tao , Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.
-
Publication No.: US11295514B2
Publication Date: 2022-04-05
Application No.: US16685538
Filing Date: 2019-11-15
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu , Kihwan Kim , Jan Kautz , Guilin Liu , Soumyadip Sengupta
Abstract: Inverse rendering estimates physical scene attributes (e.g., reflectance, geometry, and lighting) from image(s) and is used for gaming, virtual reality, augmented reality, and robotics. An inverse rendering network (IRN) receives a single input image of a 3D scene and generates the physical scene attributes for the image. The IRN is trained by using the estimated physical scene attributes generated by the IRN to reproduce the input image and updating parameters of the IRN to reduce differences between the reproduced input image and the input image. A direct renderer and a residual appearance renderer (RAR) reproduce the input image. The RAR predicts a residual image representing complex appearance effects of the real (not synthetic) image based on features extracted from the image and the reflectance and geometry properties. The residual image represents near-field illumination, cast shadows, inter-reflections, and realistic shading that are not provided by the direct renderer.
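A structural sketch of the self-supervised loop this abstract describes, with toy scalars standing in for images and a multiplicative Lambertian-style approximation standing in for the direct renderer (every name and number here is illustrative, not from the patent):

```python
def direct_render(reflectance, shading):
    """Toy direct renderer: reflectance modulated by shading."""
    return reflectance * shading

def reconstruct(attrs, residual):
    """Reproduce the input as direct render + residual appearance term
    (the residual stands in for the RAR's predicted image)."""
    return direct_render(attrs["reflectance"], attrs["shading"]) + residual

def reconstruction_loss(input_pixel, attrs, residual):
    """Error that would drive IRN parameter updates during training."""
    return abs(input_pixel - reconstruct(attrs, residual))

attrs = {"reflectance": 0.5, "shading": 0.5}
print(reconstruction_loss(0.5, attrs, residual=0.25))
# -> 0.0  (direct render 0.25 plus residual 0.25 exactly matches the input)
```

The residual term carries whatever the simple direct renderer cannot express, which is the role the abstract assigns to the RAR for effects such as cast shadows and inter-reflections.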