Patent search ap:("NVIDIA Corporation") AND inv:"Kevin Jonathan Shih"

1.

Patent application
UPSAMPLING AN IMAGE USING ONE OR MORE NEURAL NETWORKS (Status: Active)

Publication No.: US20220114702A1

Publication date: 2022-04-14

Application No.: US17406902

Filing date: 2021-08-19

Applicant: NVIDIA Corporation

Inventor: Shiqiu Liu , Robert Pottorff , Guilin Liu , Karan Sapra , Jon Barker , David Tarjan , Pekka Janis , Edvard Fagerholm , Lei Yang , Kevin Jonathan Shih , Marco Salvi , Timo Roman , Andrew Tao , Bryan Catanzaro

IPC: G06T3/40 , G06T5/00 , G06T5/50 , G06T5/20

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights.

2.

Patent application
UPSAMPLING AN IMAGE USING ONE OR MORE NEURAL NETWORKS (Status: Active)

Publication No.: US20220222778A1

Publication date: 2022-07-14

Application No.: US17710643

Filing date: 2022-03-31

Applicant: NVIDIA Corporation

Inventor: Shiqiu Liu , Robert Thomas Pottorff , Guilin Liu , Karan Sapra , Jon Barker , David Tarjan , Pekka Janis , Edvard Olav Valter Fagerholm , Lei Yang , Kevin Jonathan Shih , Marco Salvi , Timo Roman , Andrew Tao , Bryan Christopher Catanzaro

IPC: G06T3/40 , G06T5/20 , G06T5/50 , G06T5/00

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

3.

Patent application
SYNTHESIZING SPEECH IN MULTIPLE LANGUAGES IN CONVERSATIONAL AI SYSTEMS AND APPLICATIONS (Status: Active)

Publication No.: US20250118286A1

Publication date: 2025-04-10

Application No.: US18483342

Filing date: 2023-10-09

Applicant: NVIDIA Corporation

Inventor: Rohan Badlani , José Rafael Valle Gomves da Costa , Kevin Jonathan Shih , Bryan Catanzaro

IPC: G10L13/047 , G10L13/08 , G10L13/10 , G10L17/02 , G10L25/18

Abstract: In various examples, synthesizing speech in multiple languages in conversational AI systems and applications is described herein. Systems and methods are disclosed that use one or more models to synthesize speech from a first language spoken by a speaker to a second, target language selected by the speaker. In some examples, to perform the translation, the model(s) may disentangle one or more attributes associated with speech from speakers, such as speakers' identities, speakers' accents, and text associated with the speech. Additionally, the model(s) may allow for fine-grained control of additional attributes associated with output speech, such as one or more frequencies, one or more energies, and one or more phoneme durations. Furthermore, the model(s) may be configured to use the accent associated with the target language when generating text, such as when aligning text encodings with one or more phonemes.

4.

Patent application
SYNTHESIZING VIDEO FROM AUDIO USING ONE OR MORE NEURAL NETWORKS (Status: Active)

Publication No.: US20230035306A1

Publication date: 2023-02-02

Application No.: US17382027

Filing date: 2021-07-21

Applicant: NVIDIA Corporation

Inventor: Ming-Yu Liu , Koki Nagano , Yeongho Seol , Jose Rafael Valle Gomes da Costa , Jaewoo Seo , Ting-Chun Wang , Arun Mallya , Sameh Khamis , Wei Ping , Rohan Badlani , Kevin Jonathan Shih , Bryan Catanzaro , Simon Yuen , Jan Kautz

IPC: G06T13/40 , H04N19/597 , G06N3/04 , G10L13/04 , G06T9/00 , G06T17/10 , G06T13/20

Abstract: Apparatuses, systems, and techniques are presented to generate media content. In at least one embodiment, a first neural network is used to generate first video information based, at least in part, upon voice information corresponding to one or more users, and a second neural network is used to generate second video information corresponding to the one or more users based, at least in part, upon the first video information and one or more images corresponding to the one or more users.

5.

Patent application
DISENTANGLEMENT OF IMAGE ATTRIBUTES USING A NEURAL NETWORK (Status: Active)

Publication No.: US20220180528A1

Publication date: 2022-06-09

Application No.: US17678666

Filing date: 2022-02-23

Applicant: NVIDIA Corporation

Inventor: Aysegul Dundar , Kevin Jonathan Shih , Animesh Garg , Robert Thomas Pottorff , Andrew Tao , Bryan Christopher Catanzaro

IPC: G06T7/194 , G06N20/00 , G06N5/04 , G06T7/70 , G06N3/08 , G06V20/20

Abstract: Apparatuses, systems, and techniques to perform unsupervised keypoint or landmark learning using one or more neural networks. In at least one embodiment, one or more neural networks use pose and appearance information to construct a foreground and a background, which are then used to reconstruct an input image and determine loss values to train the one or more neural networks.
