Patent search ap:("Google LLC") AND inv:"Navdeep Jaitly" Page 2

11.

发明申请
SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS 审中-公开

公开(公告)号：US20200027444A1

公开(公告)日：2020-01-23

申请号：US16516390

申请日：2019-07-19

Applicant: Google LLC

Inventor： Rohit Prakash Prabhavalkar , Zhifeng Chen , Bo Li , Chung-Cheng Chiu , Kanury Kanishka Rao , Yonghui Wu , Ron J. Weiss , Navdeep Jaitly , Michiel A.U. Bacchiani , Tara N. Sainath , Jan Kazimierz Chorowski , Anjuli Patricia Kannan , Ekaterina Gonina , Patrick An Phu Nguyen

IPC: G10L15/16 , G10L15/22 , G10L15/06 , G10L15/02 , G06N3/08

Abstract: Methods, systems, and apparatus, including computer-readable media, for performing speech recognition using sequence-to-sequence models. An automated speech recognition (ASR) system receives audio data for an utterance and provides features indicative of acoustic characteristics of the utterance as input to an encoder. The system processes an output of the encoder using an attender to generate a context vector and generates speech recognition scores using the context vector and a decoder trained using a training process that selects at least one input to the decoder with a predetermined probability. An input to the decoder during training is selected between input data based on a known value for an element in a training example, and input data based on an output of the decoder for the element in the training example. A transcription is generated for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the ASR system.

12.

发明申请
END-TO-END TEXT-TO-SPEECH CONVERSION 审中-公开

公开(公告)号：US20190311708A1

公开(公告)日：2019-10-10

申请号：US16447862

申请日：2019-06-20

Applicant: Google LLC

Inventor： Samy Bengio , Yuxuan Wang , Zongheng Yang , Zhifeng Chen , Yonghui Wu , Ioannis Agiomyrgiannakis , Ron J. Weiss , Navdeep Jaitly , Ryan M. Rifkin , Robert Andrew James Clark , Quoc V. Le , Russell J. Ryan , Ying Xiao

IPC: G10L13/08 , G10L25/18 , G10L25/30 , G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

13.

发明授权
Recurrent neural networks for online sequence generation 有权

公开(公告)号：US10281885B1

公开(公告)日：2019-05-07

申请号：US15600699

申请日：2017-05-19

Applicant: Google LLC

Inventor： Chung-Cheng Chiu , Navdeep Jaitly , Ilya Sutskever , Yuping Luo

IPC: G05B13/02 , G06N3/04 , G06F17/28 , G10L15/16

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive am input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.

14.

发明申请
SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS 有权

公开(公告)号：US20240420686A1

公开(公告)日：2024-12-19

申请号：US18815200

申请日：2024-08-26

Applicant: Google LLC

Inventor： Rohit Prakash Prabhavalkar , Zhifeng Chen , Bo Li , Chung-Cheng Chiu , Kanury Kanishka Rao , Yonghui Wu , Ron J. Weiss , Navdeep Jaitly , Michiel A. U. Bacchiani , Tara N. Sainath , Jan Kazimierz Chorowski , Anjuli Patricia Kannan , Ekaterina Gonina , Patrick An Phu Nguyen

IPC: G10L15/16 , G06N3/08 , G10L15/02 , G10L15/06 , G10L15/22 , G10L15/26 , G10L25/30

Abstract: A method for performing speech recognition using sequence-to-sequence models includes receiving audio data for an utterance and providing features indicative of acoustic characteristics of the utterance as input to an encoder. The method also includes processing an output of the encoder using an attender to generate a context vector, generating speech recognition scores using the context vector and a decoder trained using a training process, and generating a transcription for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the ASR system.

15.

发明授权
Training recurrent neural networks to generate sequences 有权

公开(公告)号：US11954594B1

公开(公告)日：2024-04-09

申请号：US17315695

申请日：2021-05-10

Applicant: Google LLC

Inventor： Samy Bengio , Oriol Vinyals , Navdeep Jaitly , Noam M. Shazeer

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: This document generally describes a neural network training system, including one or more computers, that trains a recurrent neural network (RNN) to receive an input, e.g., an input sequence, and to generate a sequence of outputs from the input sequence. In some implementations, training can include, for each position after an initial position in a training target sequence, selecting a preceding output of the RNN to provide as input to the RNN at the position, including determining whether to select as the preceding output (i) a true output in a preceding position in the output order or (ii) a value derived from an output of the RNN for the preceding position in an output order generated in accordance with current values of the parameters of the recurrent neural network.

16.

发明授权
End-to-end text-to-speech conversion 有权

公开(公告)号：US11862142B2

公开(公告)日：2024-01-02

申请号：US17391799

申请日：2021-08-02

Applicant: Google LLC

Inventor： Samuel Bengio , Yuxuan Wang , Zongheng Yang , Zhifeng Chen , Yonghui Wu , Ioannis Agiomyrgiannakis , Ron J. Weiss , Navdeep Jaitly , Ryan M. Rifkin , Robert Andrew James Clark , Quoc V. Le , Russell J. Ryan , Ying Xiao

IPC: G10L13/06 , G10L13/08 , G06N3/08 , G10L25/18 , G10L25/30 , G10L13/04 , G06N3/084 , G10L15/16 , G06N3/045

CPC classification number: G10L13/08 , G06N3/045 , G06N3/08 , G06N3/084 , G10L13/04 , G10L15/16 , G10L25/18 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

17.

发明授权
Generating structured text content using speech recognition models 有权

公开(公告)号：US11763936B2

公开(公告)日：2023-09-19

申请号：US17112279

申请日：2020-12-04

Applicant: Google LLC

Inventor： Christopher S. Co , Navdeep Jaitly , Lily Hao Yi Peng , Katherine Irene Chou , Ananth Sankar

IPC: G10L15/26 , G16H40/20 , G10L15/18 , G06F40/47 , G06F40/58 , G10L15/06 , G10L15/14 , G10L15/16 , G10L15/183

CPC classification number: G16H40/20 , G06F40/47 , G06F40/58 , G10L15/063 , G10L15/142 , G10L15/16 , G10L15/183 , G10L15/1822 , G10L15/26

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing one or more utterances; processing the input acoustic sequence using a speech recognition model to generate a transcription of the input acoustic sequence, wherein the speech recognition model comprises a domain-specific language model; and providing the generated transcription of the input acoustic sequence as input to a domain-specific predictive model to generate structured text content that is derived from the transcription of the input acoustic sequence.

18.

发明授权
Generating target sequences from input sequences using partial conditioning 有权

公开(公告)号：US11195521B2

公开(公告)日：2021-12-07

申请号：US16781273

申请日：2020-02-04

Applicant: Google LLC

Inventor： Navdeep Jaitly , Quoc V. Le , Oriol Vinyals , Samuel Bengio , Ilya Sutskever

IPC: G10L15/00 , G10L15/16 , G06N3/04 , G10L15/26 , G06F40/58 , G06F40/274 , G06F40/55 , G10L15/02 , G05B13/02

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

19.

发明申请
Generating Target Sequences From Input Sequences Using Partial Conditioning 审中-公开

公开(公告)号：US20200251099A1

公开(公告)日：2020-08-06

申请号：US16781273

申请日：2020-02-04

Applicant: Google LLC

Inventor： Navdeep Jaitly , Quoc V. Le , Oriol Vinyals , Samuel Bengio , Ilya Sutskever

IPC: G10L15/16 , G06F40/274 , G06F40/58 , G06F40/55 , G05B13/02 , G10L15/02 , G10L15/26 , G06N3/04

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

20.

发明申请
RECURRENT NEURAL NETWORKS FOR ONLINE SEQUENCE GENERATION 审中-公开

公开(公告)号：US20200151544A1

公开(公告)日：2020-05-14

申请号：US16610466

申请日：2018-05-03

Applicant: GOOGLE LLC

Inventor： Chung-Cheng Chiu , Navdeep Jaitly , John Dieterich Lawson , George Jay Tucker

IPC: G06N3/04 , G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive an input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification