Patent search ap:("Google LLC") AND inv:"William Chan" Page 2

11.

Invention application
Image Enhancement via Iterative Refinement based on Machine Learning Models
Status: In force

Publication (announcement) number: US20250061551A1

Publication (announcement) date: 2025-02-20

Application number: US18939994

Application date: 2024-11-07

Applicant: Google LLC

Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi

IPC: G06T5/70, G06N3/045, G06N3/08, G06T3/4007, G06T5/50

Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.
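As a rough illustration of the forward Gaussian diffusion step named in this abstract, the sketch below noises a target image in closed form. The linear noise schedule, the closed-form sampling rule, and every name used here (`linear_beta_schedule`, `cumulative_alphas`, `diffuse`) are assumptions borrowed from common diffusion-model practice; the abstract does not specify them.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the abstract does not specify one."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t): the share of signal variance kept at step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def diffuse(target, alpha_bar_t, rng):
    """Sample a noisy version of `target` (a flat list of pixel values).

    Returns the noisy pixels and the Gaussian noise that was added, which is
    what a denoising network is commonly trained to predict (an assumption here).
    """
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    noisy = [signal_scale * x + noise_scale * n for x, n in zip(target, noise)]
    return noisy, noise


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
target_pixels = [0.1, 0.5, 0.9, 0.3]  # stand-in for a target image
noisy_pixels, added_noise = diffuse(target_pixels, alpha_bars[499], rng)
```

Under these assumptions, a training pair would consist of the noisy target, the conditioning input image, and the added noise; iterative denoising would then run the learned reverse chain from pure noise back toward an enhanced image.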

12.

Invention publication
Image Enhancement via Iterative Refinement based on Machine Learning Models
Status: Under examination (published)

Publication (announcement) number: US20230153959A1

Publication (announcement) date: 2023-05-18

Application number: US18155420

Application date: 2023-01-17

Applicant: Google LLC

Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi

IPC: G06T5/00, G06N3/08, G06N3/045, G06T5/50, G06T3/40

CPC classification number: G06T5/002, G06N3/08, G06N3/045, G06T5/50, G06T3/4007, G06T2207/20081, G06T2207/20016, G06T2207/20084

Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.

13.

Invention application
SEQUENCE MODELING USING IMPUTATION
Status: In force

Publication (announcement) number: US20230075716A1

Publication (announcement) date: 2023-03-09

Application number: US17797872

Application date: 2021-02-08

Applicant: Google LLC

Inventor: William Chan, Chitwan Saharia, Geoffrey E. Hinton, Mohammad Norouzi, Navdeep Jaitly

IPC: G06F40/47, G06F40/284

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for sequence modeling. One of the methods includes receiving an input sequence having a plurality of input positions; determining a plurality of blocks of consecutive input positions; processing the input sequence using a neural network to generate a latent alignment, comprising, at each of a plurality of input time steps: receiving a partial latent alignment from a previous input time step; selecting an input position in each block, wherein the token at the selected input position of the partial latent alignment in each block is a mask token; and processing the partial latent alignment and the input sequence using the neural network to generate a new latent alignment, wherein the new latent alignment comprises, at the selected input position in each block, an output token or a blank token; and generating, using the latent alignment, an output sequence.

14.

Invention application
SPEECH RECOGNITION WITH ATTENTION-BASED RECURRENT NEURAL NETWORKS
Status: In force

Publication (announcement) number: US20220028375A1

Publication (announcement) date: 2022-01-27

Application number: US17450235

Application date: 2021-10-07

Applicant: Google LLC

Inventor: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer

IPC: G10L15/16, G06N3/04, G06F40/12, G06F40/197, G10L15/183, G10L15/26

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

15.

Granted invention patent
Very deep convolutional neural networks for end-to-end speech recognition
Status: In force

Publication (announcement) number: US11080599B2

Publication (announcement) date: 2021-08-03

Application number: US16692538

Application date: 2019-11-22

Applicant: Google LLC

Inventor: Navdeep Jaitly, Yu Zhang, William Chan

IPC: G06N3/08, G10L15/16, G06N3/04, G10L15/02, G10L15/22

Abstract: A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time-reduced time steps, and the number of time-reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network-in-network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.

16.

Invention application
GENERATING NEURAL NETWORK OUTPUTS USING INSERTION COMMANDS
Status: Under examination (published)

Publication (announcement) number: US20200372356A1

Publication (announcement) date: 2020-11-26

Application number: US16883772

Application date: 2020-05-26

Applicant: Google LLC

Inventor: William Chan, Mitchell Thomas Stern, Nikita Kitaev, Kelvin Gu, Jakob D. Uszkoreit

IPC: G06N3/08, G06N3/04, G06F40/237

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing sequence modeling tasks using insertions. One of the methods includes receiving a system input that includes one or more source elements from a source sequence and zero or more target elements from a target sequence, wherein each source element is selected from a vocabulary of source elements and wherein each target element is selected from a vocabulary of target elements; generating a partial concatenated sequence that includes the one or more source elements from the source sequence and the zero or more target elements from the target sequence, wherein the source and target elements are arranged in the partial concatenated sequence according to a combined order; and generating a final concatenated sequence that includes a finalized source sequence and a finalized target sequence, wherein the finalized target sequence includes one or more target elements.

17.

Granted invention patent
Image enhancement via iterative refinement based on machine learning models
Status: In force

Publication (announcement) number: US12165289B2

Publication (announcement) date: 2024-12-10

Application number: US18227120

Application date: 2023-07-27

Applicant: Google LLC

Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi

IPC: G06T5/70, G06N3/045, G06N3/08, G06T3/4007, G06T5/50

Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.

18.

Granted invention patent
Generating neural network outputs using insertion operations
Status: In force

Publication (announcement) number: US12106064B2

Publication (announcement) date: 2024-10-01

Application number: US18082357

Application date: 2022-12-15

Applicant: Google LLC

Inventor: Jakob D. Uszkoreit, Mitchell Thomas Stern, Jamie Ryan Kiros, William Chan

IPC: G10L15/22, G06F40/44, G06N3/044, G06N3/08, G06N5/04

CPC classification number: G06F40/44, G06N3/044, G06N3/08, G06N5/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating network outputs using insertion operations.

19.

Invention publication
GENERATING IMAGES USING SEQUENCES OF GENERATIVE NEURAL NETWORKS
Status: Under examination (published)

Publication (announcement) number: US20240249456A1

Publication (announcement) date: 2024-07-25

Application number: US18624960

Application date: 2024-04-02

Applicant: Google LLC

Inventor: Chitwan Saharia, William Chan, Mohammad Norouzi, Saurabh Saxena, Yi Li, Jay Ha Whang, David James Fleet, Jonathan Ho

IPC: G06T11/60, G06F40/284, G06F40/40, G06N3/08, G06T3/4053, G06T5/70

CPC classification number: G06T11/60, G06F40/284, G06F40/40, G06N3/08, G06T3/4053, G06T5/70

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating images. In one aspect, a method includes: receiving an input text prompt including a sequence of text tokens in a natural language; processing the input text prompt using a text encoder neural network to generate a set of contextual embeddings of the input text prompt; and processing the contextual embeddings through a sequence of generative neural networks to generate a final output image that depicts a scene that is described by the input text prompt.

20.

Invention publication
Augmentation of Audiographic Images for Improved Machine Learning
Status: Under examination (published)

Publication (announcement) number: US20230359898A1

Publication (announcement) date: 2023-11-09

Application number: US18350464

Application date: 2023-07-11

Applicant: Google LLC

Inventor: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu

IPC: G06V10/774, G06N20/00, G10L15/16, G10L15/06, G10L15/12, G10L15/28, G06V10/82

CPC classification number: G06N3/084, G06N20/00, G10L15/16, G10L15/063, G10L15/12, G06V10/7747, G10L15/28, G06V10/82, G06F18/2148

Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
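To make "augmentation performed directly upon the audiographic image" concrete, the sketch below blanks out one band of time frames and one band of frequency bins in a spectrogram. The abstract does not name its augmentation operations, so masking is an assumed example, and the helper names `mask_band` and `augment` are hypothetical.

```python
import random


def mask_band(spectrogram, axis, start, width, fill=0.0):
    """Return a copy of `spectrogram` (a list of time frames, each a list of
    frequency bins) with one band overwritten by `fill`.

    axis="time" masks frames [start, start + width);
    axis="freq" masks bins [start, start + width) in every frame.
    """
    masked = [list(frame) for frame in spectrogram]
    if axis == "time":
        for t in range(start, min(start + width, len(masked))):
            masked[t] = [fill] * len(masked[t])
    elif axis == "freq":
        for frame in masked:
            for f in range(start, min(start + width, len(frame))):
                frame[f] = fill
    else:
        raise ValueError("axis must be 'time' or 'freq'")
    return masked


def augment(spectrogram, rng, max_time_width=2, max_freq_width=2):
    """Apply one random time mask and one random frequency mask (assumed operations)."""
    num_frames, num_bins = len(spectrogram), len(spectrogram[0])
    t_width = rng.randint(0, max_time_width)
    f_width = rng.randint(0, max_freq_width)
    t_start = rng.randint(0, num_frames - t_width)
    f_start = rng.randint(0, num_bins - f_width)
    out = mask_band(spectrogram, "time", t_start, t_width)
    return mask_band(out, "freq", f_start, f_width)


spec = [[1.0] * 6 for _ in range(8)]  # 8 frames x 6 bins, stand-in for a spectrogram
augmented = augment(spec, random.Random(0))
```

The operations act on the image representation only; the raw audio is never touched, which matches the contrast the abstract draws.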
