Patent search ap:("NVIDIA Corporation") AND inv:"Bryan Catanzaro" Page 3

21.

Granted invention patent
Unsupervised alignment for text to speech synthesis using neural networks

Legal status: Active

Publication number: US11769481B2

Publication date: 2023-09-26

Application number: US17496569

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

IPC: G10L13/00, G10L13/10, G10L13/06, G10L13/07, G10L13/047, G10L25/90, G06N3/045, G06N3/08, G10L13/033, G10L13/08

CPC classification number: G10L13/047, G06N3/045, G06N3/08, G10L13/0335, G10L13/08, G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

22.

Invention application
TRAINING ONE OR MORE NEURAL NETWORKS USING SYNTHETIC DATA

Legal status: Active

Publication number: US20220130013A1

Publication date: 2022-04-28

Application number: US17080503

Application date: 2020-10-26

Applicant: Nvidia Corporation

Inventor: Robert Pottorff, Shiqiu Liu, Andrew Tao, Bryan Catanzaro

IPC: G06T3/40, G06T11/60, G06T5/00, G06T5/50

Abstract: Apparatuses, systems, and techniques are presented to train one or more neural networks. In at least one embodiment, one or more neural networks are trained based, at least in part, on two or more versions of an image, wherein each of the two or more versions of the image are to be synthetically generated independently.

23.

Granted invention patent
Performing multi-convolution operations in a parallel processing system

Legal status: Active

Publication number: US10223333B2

Publication date: 2019-03-05

Application number: US14838291

Application date: 2015-08-27

Applicant: NVIDIA CORPORATION

Inventor: Sharanyan Chetlur, Bryan Catanzaro

IPC: G06F17/15

Abstract: In one embodiment of the present invention a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch. Notably, the source locations reflect the contribution of the image tile to an output tile of an output matrix—the result of the multi-convolution operation. Subsequently, the pipeline copies data from the source locations to the image tile. Similarly, the pipeline copies data from a filter stack to a filter tile. The pipeline then performs matrix multiplication operations between the image tile and the filter tile to generate data included in the corresponding output tile. To optimize both on-chip memory usage and execution time, the pipeline creates each image tile in on-chip memory as-needed.
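
The tile-by-tile flow described in this abstract can be illustrated with a small sketch. It is not the patented implementation: it is a sequential pure-Python toy that assumes stride 1, no padding, and cross-correlation (no kernel flip), and the names `source_location`, `multi_convolution`, `tile_rows`, and `tile_cols` are hypothetical. It shows only the idea of computing source locations for each image tile, copying the data on demand, and multiplying the image tile by the matching filter tile.

```python
def source_location(row, col, R, S, P, Q):
    """Map one entry of the virtual lowered image matrix to (n, c, h, w)
    in the input batch. Assumes stride 1 and no padding."""
    c, rs = divmod(row, R * S)   # input channel and offset inside the filter window
    r, s = divmod(rs, S)
    n, pq = divmod(col, P * Q)   # image index and output position
    p, q = divmod(pq, Q)
    return n, c, p + r, q + s


def multi_convolution(images, filters, tile_rows=4, tile_cols=4):
    """images: [N][C][H][W], filters: [K][C][R][S].
    Returns the output matrix as K rows of N*P*Q columns."""
    N, C = len(images), len(images[0])
    H, W = len(images[0][0]), len(images[0][0][0])
    K, R, S = len(filters), len(filters[0][0]), len(filters[0][0][0])
    P, Q = H - R + 1, W - S + 1
    rows, cols = C * R * S, N * P * Q
    out = [[0.0] * cols for _ in range(K)]

    for c0 in range(0, cols, tile_cols):          # one output tile per column block
        c1 = min(c0 + tile_cols, cols)
        for r0 in range(0, rows, tile_rows):      # walk the reduction dimension
            r1 = min(r0 + tile_rows, rows)
            # Build the image tile on demand; the full lowered matrix never exists.
            image_tile = []
            for row in range(r0, r1):
                line = []
                for col in range(c0, c1):
                    n, c, h, w = source_location(row, col, R, S, P, Q)
                    line.append(images[n][c][h][w])
                image_tile.append(line)
            # Copy the matching slice of the filter stack into a filter tile.
            filter_tile = []
            for k in range(K):
                line = []
                for row in range(r0, r1):
                    c, rs = divmod(row, R * S)
                    r, s = divmod(rs, S)
                    line.append(filters[k][c][r][s])
                filter_tile.append(line)
            # filter_tile (K x t) times image_tile (t x width), accumulated.
            for k in range(K):
                for j in range(c1 - c0):
                    out[k][c0 + j] += sum(
                        filter_tile[k][i] * image_tile[i][j]
                        for i in range(r1 - r0)
                    )
    return out
```

In this sketch each image tile is a short-lived local list, which stands in for the on-chip memory the abstract mentions; a real parallel pipeline would process tiles concurrently rather than in nested loops.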

24.

Granted invention patent
Unsupervised alignment for text to speech synthesis using neural networks

Legal status: Active

Publication number: US11869483B2

Publication date: 2024-01-09

Application number: US17496636

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

IPC: G10L13/00, G10L13/08, G10L13/10, G10L13/047, G10L25/90, G06N3/045, G06N3/08, G10L13/033

CPC classification number: G10L13/047, G06N3/045, G06N3/08, G10L13/0335, G10L13/08, G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

25.

Invention publication
UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS

Legal status: Under examination (published)

Publication number: US20230402028A1

Publication date: 2023-12-14

Application number: US18457221

Application date: 2023-08-28

Applicant: Nvidia Corporation

Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro

IPC: G10L13/047, G10L13/033, G10L13/08, G06N3/08, G06N3/045, G10L25/90

CPC classification number: G10L13/047, G10L13/0335, G10L13/08, G06N3/08, G06N3/045, G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

26.

Invention application
UPSAMPLING AN IMAGE USING ONE OR MORE NEURAL NETWORKS

Legal status: Active

Publication number: US20220114700A1

Publication date: 2022-04-14

Application number: US17066282

Application date: 2020-10-08

Applicant: Nvidia Corporation

Inventor: Shiqiu Liu, Robert Pottorff, Guilin Liu, Karan Sapra, Jon Barker, David Tarjan, Pekka Janis, Edvard Fagerholm, Lei Yang, Kevin Shih, Marco Salvi, Timo Roman, Andrew Tao, Bryan Catanzaro

IPC: G06T3/40, G06T5/50, G06T5/20, G06T5/00

Abstract: Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images using one or more pixel weights determined based, at least in part, on one or more sub-pixel offset values.

27.

Invention application
IMAGE ENHANCEMENT USING ONE OR MORE NEURAL NETWORKS

Legal status: Active

Publication number: US20210383505A1

Publication date: 2021-12-09

Application number: US17406922

Application date: 2021-08-19

Applicant: Nvidia Corporation

Inventor: Robert Pottorff, David Tarjan, Andrew Tao, Bryan Catanzaro

IPC: G06T3/40, G06T5/00, G06T15/50, G06T5/20

Abstract: Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.

28.

Invention application
VIDEO PREDICTION USING ONE OR MORE NEURAL NETWORKS

Legal status: Active

Publication number: US20210064925A1

Publication date: 2021-03-04

Application number: US16558620

Application date: 2019-09-03

Applicant: Nvidia Corporation

Inventor: Kevin Shih, Aysegul Dundar, Animesh Garg, Robert Pottorff, Andrew Tao, Bryan Catanzaro

IPC: G06K9/62, G06N3/04, G06N3/08

Abstract: Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having one or more additional video frames.

29.

Invention application
VIDEO PREDICTION USING SPATIALLY DISPLACED CONVOLUTION

Legal status: Under examination (published)

Publication number: US20190297326A1

Publication date: 2019-09-26

Application number: US16360853

Application date: 2019-03-21

Applicant: NVIDIA Corporation

Inventor: Fitsum A. Reda, Guilin Liu, Kevin Shih, Robert Kirby, Jonathan Barker, David Tarjan, Andrew Tao, Bryan Catanzaro

IPC: H04N19/139, G06N3/08, G06N20/10, G06N3/04, G06N20/20, H04N19/587, H04N19/132, H04N19/172

Abstract: A neural network architecture is disclosed for performing video frame prediction using a sequence of video frames and corresponding pairwise optical flows. The neural network processes the sequence of video frames and optical flows utilizing three-dimensional convolution operations, where time (or multiple video frames in the sequence of video frames) provides the third dimension in addition to the two-dimensional pixel space of the video frames. The neural network generates a set of parameters used to predict a next video frame in the sequence of video frames by sampling a previous video frame utilizing spatially-displaced convolution operations. In one embodiment, the set of parameters includes a displacement vector and at least one convolution kernel per pixel. Generating a pixel value in the next video frame includes applying the convolution kernel to a corresponding patch of pixels in the previous video frame based on the displacement vector.
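
The last step of this abstract, applying a per-pixel kernel to a displaced patch of the previous frame, can be sketched as follows. This is an illustration under simplifying assumptions, not the patented method: the frame is single-channel, displacements are integers (a real system would likely interpolate fractional displacements), out-of-range coordinates are clamped to the frame edge, and the displacement vectors and kernels are passed in as plain lists instead of being produced by a network. The name `sdc_predict` is hypothetical.

```python
def sdc_predict(prev, flow, kernels):
    """Predict the next frame from the previous one.

    prev:    H x W list of pixel values (single channel).
    flow:    H x W list of (dx, dy) integer displacement vectors.
    kernels: H x W list of square kernels (lists of lists), one per pixel.
    """
    H, W = len(prev), len(prev[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            dx, dy = flow[y][x]
            kernel = kernels[y][x]
            half = len(kernel) // 2
            acc = 0.0
            for j, kernel_row in enumerate(kernel):
                for i, weight in enumerate(kernel_row):
                    # Patch is centred on the displaced location; clamp at edges.
                    sy = min(max(y + dy + j - half, 0), H - 1)
                    sx = min(max(x + dx + i - half, 0), W - 1)
                    acc += weight * prev[sy][sx]
            out[y][x] = acc
    return out
```

With a 1x1 kernel of weight 1 this reduces to pure displacement-based sampling, and with zero displacement it reduces to a per-pixel adaptive kernel, which are the two behaviors the combined operation generalizes.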
