Patent search ap:("Google LLC") AND inv:"Navdeep Jaitly" Page 1

1.

Patent application
SEQUENCE MODELING USING IMPUTATION (Active)

Publication No.: US20230075716A1

Publication date: 2023-03-09

Application No.: US17797872

Application date: 2021-02-08

Applicant: Google LLC

Inventor: William Chan, Chitwan Saharia, Geoffrey E. Hinton, Mohammad Norouzi, Navdeep Jaitly

IPC: G06F40/47, G06F40/284

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for sequence modeling. One of the methods includes receiving an input sequence having a plurality of input positions; determining a plurality of blocks of consecutive input positions; processing the input sequence using a neural network to generate a latent alignment, comprising, at each of a plurality of input time steps: receiving a partial latent alignment from a previous input time step; selecting an input position in each block, wherein the token at the selected input position of the partial latent alignment in each block is a mask token; and processing the partial latent alignment and the input sequence using the neural network to generate a new latent alignment, wherein the new latent alignment comprises, at the selected input position in each block, an output token or a blank token; and generating, using the latent alignment, an output sequence.
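
The block-wise decoding loop described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `impute`, `collapse`, `make_blocks`, and `toy_predict` are hypothetical, the neural network is replaced by a toy function, the rule for choosing a position in each block (leftmost masked position) is an assumption, and the final step simply drops blank tokens because the abstract does not say how the output sequence is derived from the alignment.

```python
from typing import Callable, List, Sequence

MASK = "<mask>"   # placeholder for positions not yet filled in
BLANK = "_"       # blank token that produces no output


def make_blocks(n: int, block_size: int) -> List[range]:
    """Split positions 0..n-1 into blocks of consecutive positions."""
    return [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]


def impute(
    inputs: Sequence[str],
    block_size: int,
    predict: Callable[[Sequence[str], Sequence[str]], List[str]],
) -> List[str]:
    """Fill a fully masked alignment, one position per block per step."""
    alignment = [MASK] * len(inputs)
    blocks = make_blocks(len(inputs), block_size)
    for _ in range(block_size):
        # Assumption: pick the leftmost still-masked position in each block.
        selected = []
        for block in blocks:
            masked = [i for i in block if alignment[i] == MASK]
            if masked:
                selected.append(masked[0])
        if not selected:
            break
        # The model sees the input and the partial alignment from the
        # previous step and proposes a token (or blank) for every position.
        proposal = predict(inputs, alignment)
        for i in selected:
            alignment[i] = proposal[i]
    return alignment


def collapse(alignment: Sequence[str]) -> List[str]:
    """Assumption: the output sequence is the alignment minus blanks."""
    return [t for t in alignment if t != BLANK]


def toy_predict(inputs: Sequence[str], alignment: Sequence[str]) -> List[str]:
    """Stand-in for the neural network: a token at even positions, else blank."""
    return [c.upper() if i % 2 == 0 else BLANK for i, c in enumerate(inputs)]
```

With a block size of 4 and an 8-position input, the loop runs four steps and fills two positions (one per block) at each step.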

2.

Patent application
SPEECH RECOGNITION WITH ATTENTION-BASED RECURRENT NEURAL NETWORKS (Active)

Publication No.: US20220028375A1

Publication date: 2022-01-27

Application No.: US17450235

Application date: 2021-10-07

Applicant: Google LLC

Inventor: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer

IPC: G10L15/16, G06N3/04, G06F40/12, G06F40/197, G10L15/183, G10L15/26

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

3.

Patent application
SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS (Active)

Publication No.: US20220005465A1

Publication date: 2022-01-06

Application No.: US17448119

Application date: 2021-09-20

Applicant: Google LLC

Inventor: Rohit Prakash Prabhavalkar, Zhifeng Chen, Bo Li, Chung-Cheng Chiu, Kanury Kanishka Rao, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Michiel A.U. Bacchiani, Tara N. Sainath, Jan Kazimierz Chorowski, Anjuli Patricia Kannan, Ekaterina Gonina, Patrick An Phu Nguyen

IPC: G10L15/16, G10L15/22, G10L15/02, G06N3/08, G10L15/06, G10L25/30

Abstract: A method for performing speech recognition using sequence-to-sequence models includes receiving audio data for an utterance and providing features indicative of acoustic characteristics of the utterance as input to an encoder. The method also includes processing an output of the encoder using an attender to generate a context vector, generating speech recognition scores using the context vector and a decoder trained using a training process, and generating a transcription for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the ASR system.
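
The attender step in this abstract, which turns encoder outputs into a context vector, can be illustrated with a minimal sketch. The abstract does not say how the context vector is computed; dot-product scoring followed by a softmax is an assumption here, and `attend`, `query`, and `encoder_outputs` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def attend(
    query: Sequence[float],
    encoder_outputs: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for one decoder step."""
    # Assumption: similarity is a plain dot product between the decoder
    # query and each encoder output.
    scores = [sum(q * h for q, h in zip(query, enc)) for enc in encoder_outputs]

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The context vector is the weighted sum of the encoder outputs.
    dim = len(encoder_outputs[0])
    context = [
        sum(w * enc[d] for w, enc in zip(weights, encoder_outputs))
        for d in range(dim)
    ]
    return weights, context
```

In a full system the decoder would consume `context` together with its own state to produce the speech recognition scores; that part is outside this sketch.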

4.

Granted patent
Very deep convolutional neural networks for end-to-end speech recognition (Active)

Publication No.: US11080599B2

Publication date: 2021-08-03

Application No.: US16692538

Application date: 2019-11-22

Applicant: Google LLC

Inventor: Navdeep Jaitly, Yu Zhang, William Chan

IPC: G06N3/08, G10L15/16, G06N3/04, G10L15/02, G10L15/22

Abstract: A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time-reduced time steps, and the number of time-reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network-in-network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.
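
The idea of producing fewer encoded time steps than input time steps can be shown with a small sketch. The abstract does not describe how the time reduction subnetwork works; concatenating adjacent frames is only one plausible scheme and is an assumption here, as are the names `time_reduce` and `factor`.

```python
from typing import List, Sequence


def time_reduce(
    frames: Sequence[Sequence[float]], factor: int = 2
) -> List[List[float]]:
    """Concatenate every `factor` consecutive frames into one frame.

    Trailing frames that do not fill a complete group are dropped
    (an assumption; padding would be an equally valid choice).
    """
    usable = len(frames) - len(frames) % factor
    return [
        [value for frame in frames[start:start + factor] for value in frame]
        for start in range(0, usable, factor)
    ]
```

Five 2-dimensional frames with `factor=2` become two 4-dimensional frames, so downstream layers process a shorter sequence.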

5.

Granted patent
Recurrent neural networks for online sequence generation (Active)

Publication No.: US10656605B1

Publication date: 2020-05-19

Application No.: US16401791

Application date: 2019-05-02

Applicant: Google LLC

Inventor: Chung-Cheng Chiu, Navdeep Jaitly, Ilya Sutskever, Yuping Luo

IPC: G05B13/02, G10L15/16, G06N3/04, G06F40/44

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive an input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.
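
The emit-or-wait loop in this abstract can be sketched as below. This is a simplified illustration, not the patented method: `generate_online`, `step_fn`, and `toy_step` are hypothetical names, comparing the progress score with a fixed threshold is an assumption (the abstract only says the decision is made "from the progress score"), and picking the highest-scoring output is likewise an assumption.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

StepFn = Callable[[Any, Optional[str], Any], Tuple[float, Dict[str, float], Any]]


def generate_online(
    source: Iterable[Any], step_fn: StepFn, threshold: float = 0.5
) -> List[str]:
    """Consume the source one time step at a time, emitting outputs online."""
    state = None                      # recurrent state carried across steps
    previous: Optional[str] = None    # last emitted output, fed back as input
    outputs: List[str] = []
    for x in source:
        # The network input combines the current source element with the
        # most recently emitted output.
        progress, scores, state = step_fn(x, previous, state)
        # Assumption: emit when the progress score exceeds a threshold.
        if progress > threshold:
            previous = max(scores, key=scores.get)
            outputs.append(previous)
    return outputs


def toy_step(x: str, previous: Optional[str], state: Optional[str]):
    """Stand-in for the recurrent network: signal progress when input changes."""
    progress = 0.9 if x != state else 0.1
    scores = {x: 1.0, "<unk>": 0.0}
    return progress, scores, x        # the new state remembers the last input
```

Because the decision is made at every time step, outputs appear as soon as the model is confident, without waiting for the end of the source sequence.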

6.

Patent publication
GENERATING STRUCTURED TEXT CONTENT USING SPEECH RECOGNITION MODELS (Pending, published)

Publication No.: US20230386652A1

Publication date: 2023-11-30

Application No.: US18234350

Application date: 2023-08-15

Applicant: Google LLC

Inventor: Christopher S. Co, Navdeep Jaitly, Lily Hao Yi Peng, Katherine Irene Chou, Ananth Sankar

IPC: G16H40/20, G10L15/26, G10L15/18, G06F40/47, G06F40/58, G10L15/06, G10L15/14, G10L15/16, G10L15/183

CPC classification number: G16H40/20, G10L15/26, G10L15/1822, G06F40/47, G06F40/58, G10L15/063, G10L15/142, G10L15/16, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing one or more utterances; processing the input acoustic sequence using a speech recognition model to generate a transcription of the input acoustic sequence, wherein the speech recognition model comprises a domain-specific language model; and providing the generated transcription of the input acoustic sequence as input to a domain-specific predictive model to generate structured text content that is derived from the transcription of the input acoustic sequence.

7.

Patent application
END-TO-END TEXT-TO-SPEECH CONVERSION (Active)

Publication No.: US20210366463A1

Publication date: 2021-11-25

Application No.: US17391799

Application date: 2021-08-02

Applicant: Google LLC

Inventor: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

IPC: G10L13/08, G06N3/08, G10L25/18, G10L25/30, G10L13/04, G06N3/04, G10L15/16

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

8.

Granted patent
Speech recognition with attention-based recurrent neural networks (Active)

Publication No.: US11151985B2

Publication date: 2021-10-19

Application No.: US16713298

Application date: 2019-12-13

Applicant: Google LLC

Inventor: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer

IPC: G10L15/16, G06N3/04, G06F40/12, G06F40/197, G10L15/183, G10L15/26, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps, processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence, processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

9.

Granted patent
Generating structured text content using speech recognition models (Active)

Publication No.: US10860685B2

Publication date: 2020-12-08

Application No.: US15362643

Application date: 2016-11-28

Applicant: Google LLC

Inventor: Christopher S. Co, Navdeep Jaitly, Lily Hao Yi Peng, Katherine Irene Chou, Ananth Sankar

IPC: G10L15/26, G06F19/00, G10L15/18, G06F40/47, G06F40/58, G10L15/06, G10L15/14, G10L15/16, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing one or more utterances; processing the input acoustic sequence using a speech recognition model to generate a transcription of the input acoustic sequence, wherein the speech recognition model comprises a domain-specific language model; and providing the generated transcription of the input acoustic sequence as input to a domain-specific predictive model to generate structured text content that is derived from the transcription of the input acoustic sequence.

10.

Patent application
END-TO-END TEXT-TO-SPEECH CONVERSION (Pending, published)

Publication No.: US20200098350A1

Publication date: 2020-03-26

Application No.: US16696101

Application date: 2019-11-26

Applicant: Google LLC

Inventor: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

IPC: G10L13/08, G10L15/16, G06N3/08, G06N3/04, G10L13/04, G10L25/30, G10L25/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.
