Patent search ap:("Google LLC") AND inv:"William Chan" Page 4

31.

Granted invention patent
Processing text using neural networks (Status: In force)

Publication No.: US11003856B2

Publication date: 2021-05-11

Application No.: US16283632

Filing date: 2019-02-22

Applicant: Google LLC

Inventors: Jamie Ryan Kiros, William Chan, Geoffrey E. Hinton

IPC: G06F40/40, G06F40/289, G06F16/53, G06K9/72, G06N3/08

Abstract: Methods, systems, and apparatus including computer programs encoded on a computer storage medium, for generating a data set that associates each text segment in a vocabulary of text segments with a respective numeric embedding. In one aspect, a method includes providing, to an image search engine, a search query that includes the text segment; obtaining image search results that have been classified as being responsive to the search query by the image search engine, wherein each image search result identifies a respective image; for each image search result, processing the image identified by the image search result using a convolutional neural network, wherein the convolutional neural network has been trained to process the image to generate an image numeric embedding for the image; and generating a numeric embedding for the text segment from the image numeric embeddings for the images identified by the image search results.
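The pipeline in this abstract (query an image search engine, embed each returned image with a CNN, combine the image embeddings into one text embedding) can be sketched as follows. This is only an illustration: the abstract does not say how the image embeddings are combined, so the element-wise mean is an assumption, and `search_images`, `embed_image`, `FAKE_INDEX`, and `FAKE_CNN` are hypothetical stand-ins for a real search engine and a trained network.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mean_embedding(vectors: Sequence[Vector]) -> Vector:
    """Average equal-length vectors element-wise."""
    if not vectors:
        raise ValueError("need at least one image embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_text_segment(
    segment: str,
    search_images: Callable[[str], List[str]],
    embed_image: Callable[[str], Vector],
) -> Vector:
    """Build a text embedding from the embeddings of its image search results."""
    image_ids = search_images(segment)                   # query the search engine
    image_vectors = [embed_image(i) for i in image_ids]  # one CNN embedding per image
    return mean_embedding(image_vectors)                 # assumed combination: mean


# Toy stand-ins (hypothetical) so the sketch runs without a real engine or CNN.
FAKE_INDEX = {"cat": ["img1", "img2"], "dog": ["img3"]}
FAKE_CNN = {"img1": [1.0, 0.0], "img2": [0.0, 1.0], "img3": [2.0, 2.0]}

vocabulary = ["cat", "dog"]
embedding_table = {
    seg: embed_text_segment(seg, FAKE_INDEX.__getitem__, FAKE_CNN.__getitem__)
    for seg in vocabulary
}
print(embedding_table)  # {'cat': [0.5, 0.5], 'dog': [2.0, 2.0]}
```

The resulting `embedding_table` plays the role of the data set that maps each text segment in the vocabulary to a numeric embedding.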

32.

Granted invention patent
Generating neural network outputs using insertion operations (Status: In force)

Publication No.: US10740571B1

Publication date: 2020-08-11

Application No.: US16751167

Filing date: 2020-01-23

Applicant: Google LLC

Inventors: Jakob D. Uszkoreit, Mitchell Thomas Stern, Jamie Ryan Kiros, William Chan

IPC: G06F40/44, G06N3/08, G06N3/04, G06N5/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating network outputs using insertion operations.

33.

Invention application
TRAINING SEQUENCE GENERATION NEURAL NETWORKS USING QUALITY SCORES (Status: Pending, published)

Publication No.: US20200151567A1

Publication date: 2020-05-14

Application No.: US16746654

Filing date: 2020-01-17

Applicant: Google LLC

Inventors: Mohammad Norouzi, William Chan, Sara Sabour Rouh Aghdam

IPC: G06N3/08, G10L25/30, G10L15/16

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a sequence generation neural network. One of the methods includes obtaining a batch of training examples; for each of the training examples: processing the training network input in the training example using the neural network to generate an output sequence; for each particular output position in the output sequence: identifying a prefix that includes the system outputs at positions before the particular output position in the output sequence, for each possible system output in the vocabulary, determining a highest quality score that can be assigned to any candidate output sequence that includes the prefix followed by the possible system output, and determining an update to the current values of the network parameters that increases a likelihood that the neural network generates a system output at the position that has a high quality score.
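The per-position step in this abstract (for a given prefix, find the highest quality score any completion can reach after each candidate next output) can be sketched as below. The abstract does not name a quality metric; this sketch assumes negative edit distance to a reference sequence. Under that assumption, the best completion after an extended prefix has edit distance equal to the minimum, over all reference prefixes, of the distance between the extended prefix and that reference prefix, because the best continuation copies the rest of the reference. The function names are hypothetical.

```python
from typing import Dict, List, Sequence


def edit_distance_row(prefix: Sequence[str], target: Sequence[str]) -> List[int]:
    """Return d[j] = edit distance between `prefix` and target[:j], for every j."""
    row = list(range(len(target) + 1))  # distances from the empty prefix
    for token in prefix:
        new_row = [row[0] + 1]
        for j in range(1, len(target) + 1):
            substitute = row[j - 1] + (token != target[j - 1])
            new_row.append(min(substitute, row[j] + 1, new_row[j - 1] + 1))
        row = new_row
    return row


def best_scores(
    prefix: Sequence[str], target: Sequence[str], vocabulary: Sequence[str]
) -> Dict[str, int]:
    """Highest achievable score (negative edit distance) per candidate next token."""
    scores = {}
    for token in vocabulary:
        row = edit_distance_row(list(prefix) + [token], target)
        scores[token] = -min(row)  # best completion copies the remaining target
    return scores


def training_targets(scores: Dict[str, int]) -> List[str]:
    """Tokens whose likelihood an update would raise: those with the top score."""
    top = max(scores.values())
    return [token for token, score in scores.items() if score == top]


scores = best_scores(["s"], ["s", "u", "n"], ["s", "u", "n"])
print(scores)                    # {'s': -1, 'u': 0, 'n': -1}
print(training_targets(scores))  # ['u']
```

A training loop would then raise the model's probability of the tokens returned by `training_targets` at that position, which corresponds to the parameter update the abstract describes.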

34.

Invention application
VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR END-TO-END SPEECH RECOGNITION (Status: Pending, published)

Publication No.: US20200090044A1

Publication date: 2020-03-19

Application No.: US16692538

Filing date: 2019-11-22

Applicant: Google LLC

Inventors: Navdeep Jaitly, Yu Zhang, William Chan

IPC: G06N3/08, G06N3/04, G10L15/16, G10L15/02, G10L15/22

Abstract: A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time-reduced time steps, and the number of time-reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network-in-network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.
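The time-reduction idea (fewer encoded time steps than input time steps) can be illustrated with a small sketch. The abstract does not say how the reduction is performed; concatenating each pair of consecutive frames is one common choice and is assumed here, and `time_reduce` is a hypothetical name.

```python
from typing import List

Frame = List[float]


def time_reduce(frames: List[Frame], factor: int = 2) -> List[Frame]:
    """Concatenate each group of `factor` consecutive frames into one longer frame."""
    usable = len(frames) - len(frames) % factor  # drop a ragged tail, if any
    return [
        [x for frame in frames[t:t + factor] for x in frame]
        for t in range(0, usable, factor)
    ]


# Six input time steps with two features each.
acoustic = [[float(t), float(t) + 0.5] for t in range(6)]
reduced = time_reduce(acoustic)
print(len(acoustic), "->", len(reduced))  # 6 -> 3
print(reduced[0])                         # [0.0, 0.5, 1.0, 1.5]
```

Halving the sequence length this way doubles the feature width, so later layers see the same information over fewer steps.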

35.

Invention application
TRAINING SEQUENCE GENERATION NEURAL NETWORKS USING QUALITY SCORES (Status: Pending, published)

Publication No.: US20190362229A1

Publication date: 2019-11-28

Application No.: US16421406

Filing date: 2019-05-23

Applicant: Google LLC

Inventors: Mohammad Norouzi, William Chan, Sara Sabour Rouh Aghdam

IPC: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a sequence generation neural network. One of the methods includes obtaining a batch of training examples; for each of the training examples: processing the training network input in the training example using the neural network to generate an output sequence; for each particular output position in the output sequence: identifying a prefix that includes the system outputs at positions before the particular output position in the output sequence, for each possible system output in the vocabulary, determining a highest quality score that can be assigned to any candidate output sequence that includes the prefix followed by the possible system output, and determining an update to the current values of the network parameters that increases a likelihood that the neural network generates a system output at the position that has a high quality score.

36.

Invention application
Augmentation of Audiographic Images for Improved Machine Learning (Status: Pending, published)

Publication No.: US20190354808A1

Publication date: 2019-11-21

Application No.: US16416888

Filing date: 2019-05-20

Applicant: Google LLC

Inventors: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu

IPC: G06K9/62, G10L15/16, G06N20/00, G10L15/06, G10L15/12, G10L15/28

Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
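As a rough illustration of augmenting the audiographic image directly rather than the raw audio, the sketch below masks a band of frequency bins and a span of time steps in a spectrogram. The abstract does not list the specific operations, so masking is an assumption here, and the function names, fill value, and width limits are hypothetical.

```python
import random
from typing import List

Spectrogram = List[List[float]]  # indexed as [time_step][frequency_bin]


def mask_frequency_band(spec: Spectrogram, start: int, width: int,
                        fill: float = 0.0) -> Spectrogram:
    """Return a copy with bins start..start+width-1 overwritten at every step."""
    return [
        [fill if start <= f < start + width else value
         for f, value in enumerate(row)]
        for row in spec
    ]


def mask_time_span(spec: Spectrogram, start: int, width: int,
                   fill: float = 0.0) -> Spectrogram:
    """Return a copy with time steps start..start+width-1 overwritten."""
    return [
        [fill] * len(row) if start <= t < start + width else list(row)
        for t, row in enumerate(spec)
    ]


def augment(spec: Spectrogram, max_width: int, rng: random.Random) -> Spectrogram:
    """Apply one random frequency mask and one random time mask."""
    n_time, n_freq = len(spec), len(spec[0])
    f_width = rng.randint(0, min(max_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    t_width = rng.randint(0, min(max_width, n_time))
    t_start = rng.randint(0, n_time - t_width)
    masked = mask_frequency_band(spec, f_start, f_width)
    return mask_time_span(masked, t_start, t_width)


spec = [[1.0] * 4 for _ in range(5)]       # 5 time steps, 4 frequency bins
print(mask_frequency_band(spec, 1, 2)[0])  # [1.0, 0.0, 0.0, 1.0]
augmented = augment(spec, max_width=2, rng=random.Random(0))
```

Each call to `augment` yields a new training example with the same shape and label as the original, which is how such operations expand a training set.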

37.

Granted invention patent
Speech recognition with attention-based recurrent neural networks (Status: In force)

Publication No.: US12100391B2

Publication date: 2024-09-24

Application No.: US17450235

Filing date: 2021-10-07

Applicant: Google LLC

Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer

IPC: G10L15/16, G06F40/12, G06F40/197, G06N3/044, G06N3/045, G10L15/183, G10L15/26, G10L25/30

CPC classification number: G10L15/16, G06F40/12, G06F40/197, G06N3/044, G06N3/045, G10L15/183, G10L15/26, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.
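One decoding step of an attention-based decoder that emits substring scores can be sketched as follows. This is a simplified illustration, not the patented network: dot-product attention, a single linear output layer, and the omission of the recurrent state update are all assumptions, and every name below is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend(decoder_state: Vector,
           encoded: Sequence[Vector]) -> Tuple[List[float], Vector]:
    """Weight each encoded time step by its similarity to the decoder state."""
    weights = softmax([dot(decoder_state, h) for h in encoded])
    dim = len(encoded[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoded)) for i in range(dim)]
    return weights, context


def substring_scores(context: Vector, output_rows: Sequence[Vector],
                     substrings: Sequence[str]) -> Dict[str, float]:
    """Map the context vector to one score per substring in the output set."""
    probs = softmax([dot(row, context) for row in output_rows])
    return dict(zip(substrings, probs))


encoded = [[1.0, 0.0], [0.0, 1.0]]  # alternative representation, 2 steps
weights, context = attend([2.0, 0.0], encoded)
scores = substring_scores(
    context, [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], ["a", "b", "<eos>"]
)
print(max(scores, key=scores.get))  # a
```

Repeating this step for each output position, and picking or searching over substrings by score, produces the sequence of substrings that forms the transcription.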

38.

Granted invention patent
Generating neural network outputs using insertion commands (Status: In force)

Publication No.: US12086715B2

Publication date: 2024-09-10

Application No.: US18321696

Filing date: 2023-05-22

Applicant: Google LLC

Inventors: William Chan, Mitchell Thomas Stern, Nikita Kitaev, Kelvin Gu, Jakob D. Uszkoreit

IPC: G06F40/30, G06F40/237, G06N3/04, G06N3/08, G06N3/084

CPC classification number: G06N3/08, G06F40/237, G06N3/04, G06N3/084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing sequence modeling tasks using insertions. One of the methods includes receiving a system input that includes one or more source elements from a source sequence and zero or more target elements from a target sequence, wherein each source element is selected from a vocabulary of source elements and wherein each target element is selected from a vocabulary of target elements; generating a partial concatenated sequence that includes the one or more source elements from the source sequence and the zero or more target elements from the target sequence, wherein the source and target elements are arranged in the partial concatenated sequence according to a combined order; and generating a final concatenated sequence that includes a finalized source sequence and a finalized target sequence, wherein the finalized target sequence includes one or more target elements.

39.

Granted invention patent
Augmentation of audiographic images for improved machine learning (Status: In force)

Publication No.: US11816577B2

Publication date: 2023-11-14

Application No.: US17487548

Filing date: 2021-09-28

Applicant: Google LLC

Inventors: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu

IPC: G10L15/06, G10L15/12, G06N3/084, G10L15/16, G10L15/28, G06N20/00, G06F18/214, G06V10/774, G06V10/82

CPC classification number: G06N3/084, G06F18/2148, G06N20/00, G06V10/7747, G06V10/82, G10L15/063, G10L15/12, G10L15/16, G10L15/28

Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.

40.

Published invention application
END-TO-END SPEECH WAVEFORM GENERATION THROUGH DATA DENSITY GRADIENT ESTIMATION (Status: Pending, published)

Publication No.: US20230252974A1

Publication date: 2023-08-10

Application No.: US18010438

Filing date: 2021-09-02

Applicant: Google LLC

Inventors: Byungha Chun, Mohammad Norouzi, Nanxin Chen, Ron J. Weiss, William Chan, Yu Zhang, Yonghui Wu

IPC: G10L13/08, G10L21/0208

CPC classification number: G10L13/08, G10L21/0208

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating waveforms conditioned on phoneme sequences. In one aspect, a method comprises: obtaining a phoneme sequence; processing the phoneme sequence using an encoder neural network to generate a hidden representation of the phoneme sequence; generating, from the hidden representation, a conditioning input; initializing a current waveform output; and generating a final waveform output that defines an utterance of the phoneme sequence by a speaker by updating the current waveform output at each of a plurality of iterations, wherein each iteration corresponds to a respective noise level, and wherein the updating comprises, at each iteration: processing (i) the current waveform output and (ii) the conditioning input using a noise estimation neural network to generate a noise output; and updating the current waveform output using the noise output and the noise level for the iteration.
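The iterative refinement loop in this abstract can be sketched as below. The abstract only states that each iteration uses a noise output and a noise level to update the current waveform; the Gaussian initialization, the diffusion-style update rule, and the noise schedule `betas` are assumptions borrowed from standard denoising diffusion sampling, and `noise_estimator` is a hypothetical stand-in for the noise estimation neural network.

```python
import math
import random
from typing import Callable, List, Sequence

Waveform = List[float]
NoiseEstimator = Callable[[Waveform, Sequence[float], float], Waveform]


def refine_waveform(noise_estimator: NoiseEstimator,
                    conditioning: Sequence[float],
                    betas: Sequence[float],
                    length: int,
                    rng: random.Random) -> Waveform:
    """Start from noise and denoise once per noise level, highest level first."""
    y = [rng.gauss(0.0, 1.0) for _ in range(length)]  # assumed initialization
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:                                  # cumulative products
        running *= a
        alpha_bars.append(running)

    for n in reversed(range(len(betas))):
        noise_level = math.sqrt(alpha_bars[n])
        eps = noise_estimator(y, conditioning, noise_level)  # predicted noise
        coef = betas[n] / math.sqrt(1.0 - alpha_bars[n])
        y = [(yi - coef * ei) / math.sqrt(alphas[n]) for yi, ei in zip(y, eps)]
        if n > 0:                                     # no fresh noise on last step
            sigma = math.sqrt(betas[n])
            y = [yi + sigma * rng.gauss(0.0, 1.0) for yi in y]
    return y


def zero_noise_estimator(y: Waveform, conditioning: Sequence[float],
                         noise_level: float) -> Waveform:
    """Placeholder for a trained network: predicts no noise at all."""
    return [0.0] * len(y)


waveform = refine_waveform(
    zero_noise_estimator, conditioning=[0.0], betas=[0.01, 0.02, 0.05],
    length=8, rng=random.Random(0),
)
print(len(waveform))  # 8
```

In a real system the conditioning input would come from the encoder's hidden representation of the phoneme sequence, and the estimator would be a trained network rather than the placeholder used here.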
