-
Publication No.: US11631239B2
Publication date: 2023-04-18
Application No.: US17237728
Filing date: 2021-04-22
Applicant: NVIDIA Corporation
Inventor: Xiaodong Yang , Ming-Yu Liu , Jan Kautz , Fanyi Xiao , Xitong Yang
Abstract: Iterative prediction systems and methods for the task of action detection process an inputted sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground-truth.
-
Publication No.: US11610115B2
Publication date: 2023-03-21
Application No.: US16685795
Filing date: 2019-11-15
Applicant: NVIDIA Corporation
Inventor: Amlan Kar , Aayush Prakash , Ming-Yu Liu , David Jesus Acuna Marrero , Antonio Torralba Barriuso , Sanja Fidler
IPC: G06N3/08 , G06F16/901 , G06T11/60 , G06N3/04
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
-
Publication No.: US20230035306A1
Publication date: 2023-02-02
Application No.: US17382027
Filing date: 2021-07-21
Applicant: Nvidia Corporation
Inventor: Ming-Yu Liu , Koki Nagano , Yeongho Seol , Jose Rafael Valle Gomes da Costa , Jaewoo Seo , Ting-Chun Wang , Arun Mallya , Sameh Khamis , Wei Ping , Rohan Badlani , Kevin Jonathan Shih , Bryan Catanzaro , Simon Yuen , Jan Kautz
Abstract: Apparatuses, systems, and techniques are presented to generate media content. In at least one embodiment, a first neural network is used to generate first video information based, at least in part, upon voice information corresponding to one or more users, and a second neural network is used to generate second video information corresponding to the one or more users based, at least in part, upon the first video information and one or more images corresponding to the one or more users.
-
Publication No.: US20220237838A1
Publication date: 2022-07-28
Application No.: US17159977
Filing date: 2021-01-27
Applicant: Nvidia Corporation
Inventor: Ming-Yu Liu , Xun Huang
Abstract: Apparatuses, systems, and techniques are presented to synthesize representations. In at least one embodiment, one or more neural networks are used to generate one or more representations of one or more objects based, at least in part, upon one or more structural features and one or more appearance features for the one or more objects.
-
Publication No.: US20220180602A1
Publication date: 2022-06-09
Application No.: US17111271
Filing date: 2020-12-03
Applicant: Nvidia Corporation
Inventor: Zekun Hao , Ming-Yu Liu , Arun Mohanray Mallya
Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon one or more semantic features projected from a three-dimensional environment.
-
Publication No.: US20220108417A1
Publication date: 2022-04-07
Application No.: US17061041
Filing date: 2020-10-01
Applicant: Nvidia Corporation
Inventor: Ming-Yu Liu , Xun Huang
Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon speech input received from one or more users.
-
Publication No.: US20210150187A1
Publication date: 2021-05-20
Application No.: US17143516
Filing date: 2021-01-07
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras , Samuli Matias Laine , David Patrick Luebke , Jaakko T. Lehtinen , Miika Samuli Aittala , Timo Oskari Aila , Ming-Yu Liu , Arun Mohanray Mallya , Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by the mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by the synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
-
Publication No.: US10922793B2
Publication date: 2021-02-16
Application No.: US16353195
Filing date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Seung-Hwan Baek , Kihwan Kim , Jinwei Gu , Orazio Gallo , Alejandro Jose Troccoli , Ming-Yu Liu , Jan Kautz
Abstract: Missing image content is generated using a neural network. In an embodiment, a high resolution image and associated high resolution semantic label map are generated from a low resolution image and associated low resolution semantic label map. The input image/map pair (low resolution image and associated low resolution semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, data missing in the input image/map pair is improvised or hallucinated by a neural network, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed in portion of an image. Missing content is hallucinated to generate different variations of an image, such as different seasons or weather conditions for a driving video.
-
Publication No.: US20200242771A1
Publication date: 2020-07-30
Application No.: US16258322
Filing date: 2019-01-25
Applicant: Nvidia Corporation
Inventor: Taesung Park , Ming-Yu Liu , Ting-Chun Wang , Junyan Zhu
Abstract: A user can create a basic semantic layout that includes two or more regions identified by the user, each region being associated with a semantic label indicating a type of object(s) to be rendered in that region. The semantic layout can be provided as input to an image synthesis network. The network can be a trained machine learning network, such as a generative adversarial network (GAN), that includes a conditional, spatially-adaptive normalization layer for propagating semantic information from the semantic layout to other layers of the network. The synthesis can involve both normalization and de-normalization, where each region of the layout can utilize different normalization parameter values. An image is inferred from the network, and rendered for display to the user. The user can change labels or regions in order to cause a new or updated image to be generated.
-
Publication No.: US10424069B2
Publication date: 2019-09-24
Application No.: US15942213
Filing date: 2018-03-30
Applicant: NVIDIA Corporation
Inventor: Deqing Sun , Xiaodong Yang , Ming-Yu Liu , Jan Kautz
Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.
-