-
Publication No.: US11593661B2
Publication Date: 2023-02-28
Application No.: US16389832
Filing Date: 2019-04-19
Applicant: NVIDIA Corporation
Inventor: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature in proportion to the amount of rotation.
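The augmentation the abstract describes can be illustrated with a minimal sketch: rotate an embedding within a 2-D subspace of the latent space, then (in the full system) decode each rotated code back to an image. The function name, the chosen subspace, and the toy latent vector below are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def rotate_latent(z, theta, dims=(0, 1)):
    """Rotate latent code z by angle theta within a chosen 2-D subspace.

    Decoding the rotated code would yield an augmented image whose
    feature varies in proportion to theta, per the abstract.
    """
    i, j = dims
    z = z.copy()
    c, s = np.cos(theta), np.sin(theta)
    zi, zj = z[i], z[j]
    z[i] = c * zi - s * zj
    z[j] = s * zi + c * zj
    return z

# Derive several augmented latent codes from one original embedding.
z0 = np.array([1.0, 0.0, 0.5])
augmented = [rotate_latent(z0, t) for t in np.linspace(0.0, np.pi / 2, 5)]
```

Each element of `augmented` stands in for one derived training image after decoding.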
-
Publication No.: US20230004760A1
Publication Date: 2023-01-05
Application No.: US17361202
Filing Date: 2021-06-28
Applicant: NVIDIA Corporation
Inventor: Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Jan Kautz
IPC: G06K9/62
Abstract: Apparatuses, systems, and techniques to identify objects within an image using self-supervised machine learning. In at least one embodiment, a machine learning system is trained to recognize objects by training a first network to recognize objects within images that are generated by a second network. In at least one embodiment, the second network is a controllable network.
-
Publication No.: US20220270318A1
Publication Date: 2022-08-25
Application No.: US17734244
Filing Date: 2022-05-02
Applicant: NVIDIA Corporation
Inventor: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well, particularly for non-rigid objects.
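One of the invariance constraints mentioned above can be sketched as a simple loss: penalize per-frame texture predictions for deviating from their cross-frame mean, reflecting the assumption that the object's texture is consistent across the video. The function name and mean-squared formulation are illustrative assumptions, not the patent's exact constraint.

```python
import numpy as np

def texture_consistency_loss(textures):
    """Penalize deviation of per-frame texture predictions from their
    cross-frame mean (texture is assumed consistent across frames)."""
    textures = np.asarray(textures, dtype=float)
    mean_tex = textures.mean(axis=0)          # consensus texture
    return float(((textures - mean_tex) ** 2).mean())
```

Analogous losses on base shape and part correspondences would fine-tune the network the same way.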
-
Publication No.: US20220222832A1
Publication Date: 2022-07-14
Application No.: US17570254
Filing Date: 2022-01-06
Applicant: NVIDIA Corporation
Inventor: Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Jan Kautz
IPC: G06T7/246, G06T7/73, G06T7/11, G06V10/764, G06V10/774, G06V10/77, G06V10/82
Abstract: A method and system are provided for tracking instances within a sequence of video frames. The method includes the steps of processing an image frame by a backbone network to generate a set of feature maps, processing the set of feature maps by one or more prediction heads, and analyzing the embedding features corresponding to a set of instances in two or more image frames of the sequence of video frames to establish a one-to-one correlation between instances in different image frames. The one or more prediction heads includes an embedding head configured to generate a set of embedding features corresponding to one or more instances of an object identified in the image frame. The method may also include training the one or more prediction heads using a set of annotated image frames and/or a plurality of sequences of unlabeled video frames.
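The one-to-one correlation step can be sketched in a few lines: compare per-instance embeddings from two frames by cosine similarity and match them greedily. The greedy assignment, function name, and array shapes are illustrative assumptions; the patent does not specify this particular matching rule.

```python
import numpy as np

def match_instances(emb_a, emb_b):
    """Greedily establish a one-to-one correspondence between instance
    embeddings from two frames using cosine similarity.

    emb_a: (N, D) embeddings from frame A; emb_b: (M, D) from frame B,
    as produced by a hypothetical embedding head.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                      # pairwise cosine similarity
    pairs, used_a, used_b = [], set(), set()
    # Repeatedly take the highest-similarity unmatched pair.
    for idx in np.argsort(sim, axis=None)[::-1]:
        i, j = divmod(int(idx), sim.shape[1])
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return pairs
```

A production tracker would likely use an optimal assignment (e.g., Hungarian matching) rather than this greedy pass.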
-
Publication No.: US11328169B2
Publication Date: 2022-05-10
Application No.: US16353835
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
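The propagation step the abstract describes is, at its core, a matrix product: each target-frame pixel takes a weighted combination of key-frame properties according to the learned affinities. The row normalization, function name, and toy 3-pixel example below are illustrative assumptions standing in for the guidance network's learned output.

```python
import numpy as np

def propagate(affinity, key_props):
    """Propagate per-pixel properties (e.g., color) from a key-frame to
    another frame via a transformation matrix.

    affinity:  (M, N) similarity scores between the M target-frame
               pixels and the N key-frame pixels (a stand-in for the
               guidance network's output).
    key_props: (N, C) property values (e.g., RGB) of key-frame pixels.
    """
    # Row-normalize so each target pixel receives a convex combination
    # of key-frame properties.
    weights = affinity / affinity.sum(axis=1, keepdims=True)
    return weights @ key_props

# Two target pixels drawing on three colorized key-frame pixels.
key_colors = np.array([[1.0, 0.0, 0.0],   # red
                       [0.0, 1.0, 0.0],   # green
                       [0.0, 0.0, 1.0]])  # blue
A = np.array([[1.0, 0.0, 0.0],    # target pixel 0 matches key pixel 0
              [0.0, 0.5, 0.5]])   # target pixel 1 blends key pixels 1, 2
out = propagate(A, key_colors)
```

In the colorization example from the abstract, `key_colors` would come from the single manually colorized key-frame and `A` from the guidance network.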
-
Publication No.: US20210073575A1
Publication Date: 2021-03-11
Application No.: US17081805
Filing Date: 2020-10-27
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
-
Publication No.: US20200334543A1
Publication Date: 2020-10-22
Application No.: US16389832
Filing Date: 2019-04-19
Applicant: NVIDIA Corporation
Inventor: Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Jan Kautz
Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature in proportion to the amount of rotation.
-
Publication No.: US20190213439A1
Publication Date: 2019-07-11
Application No.: US16353835
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
CPC classification number: G06K9/6215, G06K9/00744, G06K9/6256, G06N3/04, G06N3/08, G06N3/084, G06T5/009, G06T5/50, G06T7/10, G06T7/90, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/20208
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
-
Publication No.: US20190180469A1
Publication Date: 2019-06-13
Application No.: US15836549
Filing Date: 2017-12-08
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu, Xiaodong Yang, Shalini De Mello, Jan Kautz
CPC classification number: G06T7/73, G06N3/08, G06T3/4046, G06T13/40, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/30201, G06T2207/30204
Abstract: A method, computer readable medium, and system are disclosed for dynamic facial analysis. The method includes the steps of receiving video data representing a sequence of image frames including at least one head and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data. The method also includes the step of processing, by a recurrent neural network, the spatial features for two or more image frames in the sequence of image frames to produce head pose estimates for the at least one head.
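The two-stage pipeline (per-frame spatial features, then a recurrent network over the sequence) can be sketched with a minimal vanilla RNN. The weight shapes, random initialization, and function name are illustrative assumptions; the patent's recurrent network is not specified at this level of detail.

```python
import numpy as np

def rnn_head_pose(frame_features, Wx, Wh, Wo):
    """Minimal vanilla RNN over per-frame spatial features (pitch, yaw,
    roll), producing a temporally informed head-pose estimate per frame."""
    h = np.zeros(Wh.shape[0])
    poses = []
    for x in frame_features:
        h = np.tanh(Wx @ x + Wh @ h)   # recurrent state update
        poses.append(Wo @ h)           # per-frame head-pose output
    return np.array(poses)

# Four frames of 3-D spatial features (pitch, yaw, roll) from a
# hypothetical per-frame feature extractor.
rng = np.random.default_rng(0)
T, H = 4, 8
Wx = rng.standard_normal((H, 3)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1
Wo = rng.standard_normal((3, H)) * 0.1
features = rng.standard_normal((T, 3))
poses = rnn_head_pose(features, Wx, Wh, Wo)
```

The recurrent state lets each frame's estimate draw on preceding frames, smoothing per-frame jitter.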
-
Publication No.: US12169882B2
Publication Date: 2024-12-17
Application No.: US17929182
Filing Date: 2022-09-01
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Jiteng Mu, Shalini De Mello, Zhiding Yu, Jan Kautz
Abstract: Embodiments of the present disclosure relate to learning dense correspondences for images. Systems and methods are disclosed that disentangle structure and texture (or style) representations of GAN-synthesized images by learning a dense pixel-level correspondence map for each image during image synthesis. A canonical coordinate frame is defined, and a structure latent code for each generated image is warped to align with the canonical coordinate frame. In this way, the structure associated with the latent code is mapped into a shared coordinate space (canonical coordinate space), thereby establishing correspondences in the shared coordinate space. A correspondence generation system receives the warped coordinate correspondences as an encoded image structure. The encoded image structure and a texture latent code are used to synthesize an image. The shared coordinate space enables propagation of semantic labels from reference images to synthesized images.
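The label-propagation benefit of a shared coordinate space can be sketched directly: once each synthesized pixel carries a coordinate into the canonical frame, labels defined there transfer by lookup. The integer (nearest-neighbor) correspondence map and function name below are illustrative assumptions; the patent's correspondence maps are learned and continuous.

```python
import numpy as np

def propagate_labels(corr_map, canonical_labels):
    """Propagate semantic labels from a canonical coordinate space to a
    synthesized image via its pixel-level correspondence map.

    corr_map:         (H, W, 2) integer (row, col) coordinates into the
                      canonical space for each synthesized-image pixel.
    canonical_labels: (Hc, Wc) semantic label map defined once in the
                      shared canonical coordinate frame.
    """
    ys, xs = corr_map[..., 0], corr_map[..., 1]
    return canonical_labels[ys, xs]

# Labels annotated once in the canonical frame transfer to any image
# whose correspondence map is known.
canonical = np.array([[1, 2],
                      [3, 4]])
corr = np.array([[[0, 0], [1, 1]],
                 [[1, 0], [0, 1]]])
labels = propagate_labels(corr, canonical)
```

Annotating the canonical frame once thus labels every synthesized image for free.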
-