Patent search ap:("Google LLC") AND inv:"Samuel Bengio" Page 2

11.

Patent application
COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING - Under examination (published)

Publication number: US20190115013A1

Publication date: 2019-04-18

Application number: US16171629

Application date: 2018-10-26

Applicant: Google LLC

Inventor: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A.U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

IPC: G10L15/16, G10L19/02, G10L15/02, G10H1/00

CPC classification number: G10L15/16, G10H1/00, G10H2210/036, G10H2210/046, G10H2250/235, G10H2250/311, G10L15/02, G10L17/18, G10L19/0212

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
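The pipeline in this abstract (audio frame, frequency domain data, complex linear projection, features for an acoustic model) can be illustrated with a minimal sketch. It assumes the projection is a complex weight matrix multiplied by the complex spectrum and that magnitudes are taken afterwards; the abstract does not state either detail. The names `dft`, `complex_linear_projection`, and `project_frame` are hypothetical, and the acoustic model itself is omitted.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real-valued audio frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def complex_linear_projection(spectrum, weights):
    """Multiply a complex weight matrix (list of rows) by a complex spectrum."""
    return [sum(w * x for w, x in zip(row, spectrum)) for row in weights]


def project_frame(frame, weights):
    """Frame -> frequency domain -> complex projection -> real magnitudes."""
    spectrum = dft(frame)
    projected = complex_linear_projection(spectrum, weights)
    # Assumption: magnitudes are taken so a real-valued network can consume them.
    return [abs(z) for z in projected]


# Toy frame of four samples and a 2x4 weight matrix that selects bins 0 and 1.
frame = [1.0, 0.0, -1.0, 0.0]
weights = [
    [1 + 0j, 0j, 0j, 0j],
    [0j, 1 + 0j, 0j, 0j],
]
features = project_frame(frame, weights)
```

In a trained system the weight matrix would be learned jointly with the acoustic model rather than fixed by hand as it is here.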

12.

Patent application
Neural Networks For Speaker Verification - Under examination (published)

Publication number: US20180315430A1

Publication date: 2018-11-01

Application number: US15966667

Application date: 2018-04-30

Applicant: Google LLC

Inventor: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno

IPC: G10L17/18, G10L17/02, G10L17/04

Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
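The enrollment and verification steps named in this abstract can be sketched as follows. The sketch assumes that speaker representations are fixed-length vectors, that enrollment averages several of them, and that verification thresholds a cosine similarity; none of these choices is stated in the abstract. The function names and the threshold value are hypothetical, and the neural network that produces the representations is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(enrollment_reprs):
    """Average per-utterance representations into one speaker model."""
    n = len(enrollment_reprs)
    return [sum(column) / n for column in zip(*enrollment_reprs)]


def verify(test_repr, speaker_model, threshold=0.8):
    """Accept the speaker when the similarity reaches the (assumed) threshold."""
    return cosine_similarity(test_repr, speaker_model) >= threshold


# Toy two-dimensional representations standing in for network outputs.
speaker_model = enroll([[1.0, 0.1], [1.0, -0.1]])
same_speaker = verify([1.0, 0.0], speaker_model)
other_speaker = verify([0.0, 1.0], speaker_model)
```

The matching and non-matching labels described in the abstract would be used at training time to shape the representations so that a comparison like this one separates speakers.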

13.

Granted patent
Neural networks for speaker verification - In force

Publication number: US11961525B2

Publication date: 2024-04-16

Application number: US17444384

Application date: 2021-08-03

Applicant: Google LLC

Inventor: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno

IPC: G10L17/18, G10L17/02, G10L17/04

CPC classification number: G10L17/18, G10L17/02, G10L17/04

Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.

14.

Published application
DEVICE PLACEMENT OPTIMIZATION WITH REINFORCEMENT LEARNING - Under examination (published)

Publication number: US20240062062A1

Publication date: 2024-02-22

Application number: US18376362

Application date: 2023-10-03

Applicant: Google LLC

Inventor: Samuel Bengio, Mohammad Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen

IPC: G06N3/08, G06N5/04, G06N3/10, G06N3/044, G06N3/045

CPC classification number: G06N3/08, G06N5/04, G06N3/105, G06N3/044, G06N3/045

Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.

15.

Granted patent
End-to-end text-to-speech conversion - In force

Publication number: US11862142B2

Publication date: 2024-01-02

Application number: US17391799

Application date: 2021-08-02

Applicant: Google LLC

Inventor: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao

IPC: G10L13/06, G10L13/08, G06N3/08, G10L25/18, G10L25/30, G10L13/04, G06N3/084, G10L15/16, G06N3/045

CPC classification number: G10L13/08, G06N3/045, G06N3/08, G06N3/084, G10L13/04, G10L15/16, G10L25/18, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.

16.

Patent application
PROCESSING AND GENERATING SETS USING RECURRENT NEURAL NETWORKS - In force

Publication number: US20220180151A1

Publication date: 2022-06-09

Application number: US17679625

Application date: 2022-02-24

Applicant: Google LLC

Inventor: Oriol Vinyals, Samuel Bengio

IPC: G06N3/04

Abstract: In one aspect, this specification describes a recurrent neural network system implemented by one or more computers that is configured to process input sets to generate neural network outputs for each input set. The input set can be a collection of multiple inputs for which the recurrent neural network should generate the same neural network output regardless of the order in which the inputs are arranged in the collection. The recurrent neural network system can include a read neural network, a process neural network, and a write neural network. In another aspect, this specification describes a system implemented as computer programs on one or more computers in one or more locations that is configured to train a recurrent neural network that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input.
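The order-invariance property described in this abstract can be illustrated with a small sketch. It uses content-based attention over the set elements as one way to obtain an output that does not depend on input order; this mechanism is an assumption for illustration and is not taken from the abstract. The name `attention_readout` and the toy vectors are hypothetical.

```python
import math


def attention_readout(memories, query):
    """Softmax-weighted sum of memory vectors, scored against a query vector."""
    scores = [sum(m_i * q_i for m_i, q_i in zip(m, query)) for m in memories]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    return [sum(w * m[d] for w, m in zip(weights, memories)) for d in range(dim)]


# The same three-element set presented in two different orders.
input_set = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
shuffled_set = [input_set[2], input_set[0], input_set[1]]
query = [0.5, -0.5]

readout_a = attention_readout(input_set, query)
readout_b = attention_readout(shuffled_set, query)
```

Because each element's weight depends only on its own content and the query, reordering the inputs changes the order of the summed terms but not their values, so the two readouts agree up to floating-point rounding.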

17.

Granted patent
Generating target sequences from input sequences using partial conditioning - In force

Publication number: US11195521B2

Publication date: 2021-12-07

Application number: US16781273

Application date: 2020-02-04

Applicant: Google LLC

Inventor: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

IPC: G10L15/00, G10L15/16, G06N3/04, G10L15/26, G06F40/58, G06F40/274, G06F40/55, G10L15/02, G05B13/02

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

18.

Patent application
Generating Target Sequences From Input Sequences Using Partial Conditioning - Under examination (published)

Publication number: US20200251099A1

Publication date: 2020-08-06

Application number: US16781273

Application date: 2020-02-04

Applicant: Google LLC

Inventor: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

IPC: G10L15/16, G06F40/274, G06F40/58, G06F40/55, G05B13/02, G10L15/02, G10L15/26, G06N3/04

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

19.

Granted patent
Generating target sequences from input sequences using partial conditioning - In force

Publication number: US10559300B2

Publication date: 2020-02-11

Application number: US16055414

Application date: 2018-08-06

Applicant: Google LLC

Inventor: Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Samuel Bengio, Ilya Sutskever

IPC: G10L15/00, G10L15/16, G06N3/04, G06F17/27, G10L15/26, G06F17/28, G10L15/02, G05B13/02

Abstract: A system can be configured to perform tasks such as converting recorded speech to a sequence of phonemes that represent the speech, converting an input sequence of graphemes into a target sequence of phonemes, translating an input sequence of words in one language into a corresponding sequence of words in another language, or predicting a target sequence of words that follow an input sequence of words in a language (e.g., a language model). In a speech recognizer, the RNN system may be used to convert speech to a target sequence of phonemes in real-time so that a transcription of the speech can be generated and presented to a user, even before the user has completed uttering the entire speech input.

20.

Granted patent
Complex linear projection for acoustic modeling - In force

Publication number: US10140980B2

Publication date: 2018-11-27

Application number: US15386979

Application date: 2016-12-21

Applicant: Google LLC

Inventor: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A. U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

IPC: G10L15/16, G10L19/02, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
