-
Publication No.: US20220138903A1
Publication Date: 2022-05-05
Application No.: US17089445
Application Date: 2020-11-04
Applicant: Nvidia Corporation
Inventor: Shiqiu Liu, Robert Pottorff, Andrew Tao, Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques are presented to train one or more neural networks. In at least one embodiment, one or more neural networks are trained based, at least in part, on one or more image sequences, where backpropagation is performed using one or more subsets of images from the one or more image sequences.
-
Publication No.: US20210067735A1
Publication Date: 2021-03-04
Application No.: US16559312
Application Date: 2019-09-03
Applicant: Nvidia Corporation
Inventor: Fitsum Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
-
Publication No.: US20230139623A1
Publication Date: 2023-05-04
Application No.: US17517612
Application Date: 2021-11-02
Applicant: NVIDIA Corporation
Inventor: Rajarshi Roy, Saad Godil, Jonathan Raiman, Neel Kant, Ilyas Elkin, Ming Y. Siu, Robert Kirby, Stuart Oberman, Bryan Catanzaro
IPC: G06F30/394, G06N20/00, G06F30/327
Abstract: Apparatuses, systems, and techniques for designing a data path circuit such as a parallel prefix circuit with reinforcement learning are described. A method can include receiving a first design state of a data path circuit, inputting the first design state of the data path circuit into a machine learning model, and performing reinforcement learning using the machine learning model to output a final design state of the data path circuit, wherein the final design state of the data path circuit has decreased area, power consumption and/or delay as compared to conventionally designed data path circuits.
-
Publication No.: US20220405987A1
Publication Date: 2022-12-22
Application No.: US17351303
Application Date: 2021-06-18
Applicant: Nvidia Corporation
Inventor: Robert Pottorff, Karan Sapra, Andrew Tao, Bryan Catanzaro, Jarmo Lunden
Abstract: Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, two or more pixels from two or more images are blended based, at least in part, on a distance of the two or more pixels from a region of the two or more images, in which pixel colors are substantially similar.
-
Publication No.: US20220067879A1
Publication Date: 2022-03-03
Application No.: US17012000
Application Date: 2020-09-03
Applicant: Nvidia Corporation
Inventor: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.
-
Publication No.: US20190295228A1
Publication Date: 2019-09-26
Application No.: US16360895
Application Date: 2019-03-21
Applicant: NVIDIA Corporation
Inventor: Guilin Liu, Fitsum A. Reda, Kevin Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
Abstract: A neural network architecture is disclosed for performing image in-painting using partial convolution operations. The neural network processes an image and a corresponding mask that identifies holes in the image utilizing partial convolution operations, where the mask is used by the partial convolution operation to zero out coefficients of the convolution kernel corresponding to invalid pixel data for the holes. The mask is updated after each partial convolution operation is performed in an encoder section of the neural network. In one embodiment, the neural network is implemented using an encoder-decoder framework with skip links to forward representations of the features at different sections of the encoder to corresponding sections of the decoder.
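The masking and mask-update steps described in this abstract can be illustrated with a minimal single-channel sketch in plain Python. It is not the patented implementation. The function name `partial_conv2d`, the stride-1 no-padding layout, and the rescaling by the fraction of valid pixels are assumptions made for illustration; the abstract itself only states that the mask zeroes out kernel coefficients for hole pixels and is updated after each operation.

```python
from typing import List, Tuple

Grid = List[List[float]]


def partial_conv2d(
    image: Grid, mask: Grid, kernel: Grid, bias: float = 0.0
) -> Tuple[Grid, Grid]:
    """Single-channel partial convolution, stride 1, no padding (hypothetical helper).

    mask holds 1.0 for valid pixels and 0.0 for hole pixels.
    Returns (output, updated_mask).
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    window_size = float(k_h * k_w)

    output = [[0.0] * out_w for _ in range(out_h)]
    new_mask = [[0.0] * out_w for _ in range(out_h)]

    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            valid = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    m = mask[i + di][j + dj]
                    # Multiplying by the mask zeroes the kernel coefficient
                    # wherever the underlying pixel is a hole.
                    acc += kernel[di][dj] * m * image[i + di][j + dj]
                    valid += m
            if valid > 0:
                # Assumption: rescale by window_size / valid so that windows
                # with few valid pixels are not systematically darker.
                output[i][j] = acc * (window_size / valid) + bias
                # Mask update: a window that saw any valid pixel becomes valid.
                new_mask[i][j] = 1.0
            # Windows with no valid pixels keep output 0.0 and mask 0.0.
    return output, new_mask
```

With a 3x3 image of constant value 2.0 whose center pixel is a hole, a 3x3 averaging kernel returns 2.0 for the single output position and marks it valid, because the hole pixel contributes nothing and the remaining eight pixels are rescaled.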
-
Publication No.: US20250118286A1
Publication Date: 2025-04-10
Application No.: US18483342
Application Date: 2023-10-09
Applicant: NVIDIA Corporation
IPC: G10L13/047, G10L13/08, G10L13/10, G10L17/02, G10L25/18
Abstract: In various examples, synthesizing speech in multiple languages in conversational AI systems and applications is described herein. Systems and methods are disclosed that use one or more models to synthesize speech from a first language spoken by a speaker to a second, target language selected by the speaker. In some examples, to perform the translation, the model(s) may disentangle one or more attributes associated with speech from speakers, such as speakers' identities, speakers' accents, and text associated with the speech. Additionally, the model(s) may allow for fine-grained control of additional attributes associated with output speech, such as one or more frequencies, one or more energies, and one or more phoneme durations. Furthermore, the model(s) may be configured to use the accent associated with the target language when generating text, such as when aligning text encodings with one or more phonemes.
-
Publication No.: US20240095460A1
Publication Date: 2024-03-21
Application No.: US17947491
Application Date: 2022-09-19
Applicant: NVIDIA Corporation
Inventor: Peng Xu, Mostofa Patwary, Rajath Shetty, Niral Lalit Pathak, Ratin Kumar, Bryan Catanzaro, Mohammad Shoeybi
IPC: G06F40/35
CPC classification number: G06F40/35
Abstract: In various examples, systems and methods that use dialogue systems associated with various machine systems and applications are described. For instance, the systems and methods may receive text data representing speech, such as a question associated with a vehicle or other machine type. The systems and methods then use a retrieval system(s) to retrieve a question/answer pair(s) associated with the text data and/or contextual information associated with the text data. In some examples, the contextual information is associated with a knowledge base associated with or corresponding to the vehicle. The systems and methods then generate a prompt using the text data, the question/answer pair(s), and/or the contextual information. Additionally, the systems and methods determine, using a language model(s) and based at least on the prompt, an output associated with the text data. For instance, the output may include information that answers the question associated with the vehicle.
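The retrieve-then-prompt flow in this abstract can be sketched roughly as follows. Everything here is hypothetical: the word-overlap scoring stands in for whatever retrieval system the patent covers, the function names `retrieve_qa_pairs` and `build_prompt` and the prompt layout are invented for illustration, and the language-model call is left out entirely.

```python
from typing import List, Set, Tuple

QAPair = Tuple[str, str]


def _tokens(text: str) -> Set[str]:
    """Lowercased words with basic punctuation stripped (toy tokenizer)."""
    return {w.strip(".,?!").lower() for w in text.split()} - {""}


def retrieve_qa_pairs(
    question: str, qa_pairs: List[QAPair], top_k: int = 1
) -> List[QAPair]:
    """Rank stored question/answer pairs by word overlap with the question.

    Assumption: a real system would likely use a learned retriever instead.
    """
    query = _tokens(question)
    ranked = sorted(
        qa_pairs, key=lambda pair: len(query & _tokens(pair[0])), reverse=True
    )
    return ranked[:top_k]


def build_prompt(question: str, qa_pairs: List[QAPair], context: str = "") -> str:
    """Assemble a prompt from context, retrieved pairs, and the new question."""
    lines: List[str] = []
    if context:
        lines.append("Context: " + context)
    for stored_question, stored_answer in qa_pairs:
        lines.append("Q: " + stored_question)
        lines.append("A: " + stored_answer)
    lines.append("Q: " + question)
    lines.append("A:")  # left open for the language model to complete
    return "\n".join(lines)
```

The resulting string would then be passed to a language model, which the abstract describes as producing the output that answers the question about the vehicle.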
-
Publication No.: US20230419947A1
Publication Date: 2023-12-28
Application No.: US18449969
Application Date: 2023-08-15
Applicant: Nvidia Corporation
Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
IPC: G10L13/047, G10L25/90, G06N3/045, G06N3/08, G10L13/033, G10L13/08
CPC classification number: G10L13/047, G10L25/90, G10L13/08, G06N3/08, G10L13/0335, G06N3/045
Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
-
Publication No.: US11810268B2
Publication Date: 2023-11-07
Application No.: US17665412
Application Date: 2022-02-04
Applicant: Nvidia Corporation
Inventor: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro
CPC classification number: G06T3/4046, G06T5/002, G06T5/003, G06T5/006, G06T5/20, G06T15/503
Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.