Patent search ap:("GOOGLE LLC") AND inv:"Gabor Simko" Page 3

21.

Granted invention patent
Utterance classifier (In force)

Publication number: US11848018B2

Publication date: 2023-12-19

Application number: US17804657

Application date: 2022-05-31

Applicant: Google LLC

Inventor: Nathan David Howard , Gabor Simko , Maria Carolina Parada San Martin , Ramkarthik Kalyanasundaram , Guru Prakash Arumugam , Srinivas Vasudevan

IPC: G10L15/08 , G10L15/22 , G06F3/16 , G10L15/16 , G10L15/18 , G10L15/30 , G10L17/00

CPC classification number: G10L15/22 , G06F3/167 , G10L15/16 , G10L15/18 , G10L15/30 , G10L17/00 , G10L2015/223 , G10L2015/227

Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.

22.

Published invention application
DETECTING CONVERSATIONS WITH COMPUTING DEVICES (Under examination, published)

Publication number: US20230274733A1

Publication date: 2023-08-31

Application number: US18144694

Application date: 2023-05-08

Applicant: GOOGLE LLC

Inventor: Marcin Nowak-Przygodzki , Nathan David Howard , Gabor Simko , Andrei Giurgiu , Behshad Behzadi

IPC: G10L15/18 , G10L15/07 , G10L25/51 , G06F16/9032 , G10L15/08 , G10L15/22

CPC classification number: G10L15/1815 , G10L15/07 , G10L25/51 , G06F16/90332 , G10L15/08 , G10L15/22 , G10L2015/227 , G10L2015/223 , G10L2015/088

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring the data routing component to provide the second transcription of the second utterance to the query processing system as a second query or bypass routing the second transcription.
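The routing step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the `Turn` record, the `route_second_utterance` function, and the `toy_overlap_analyzer` heuristic are hypothetical names introduced here, and the word-overlap check merely stands in for whatever analysis the patent actually performs on the two transcriptions and the first response.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Turn:
    """The first utterance: its transcription and, if available, the system's response."""
    transcription: str
    response: Optional[str] = None


def toy_overlap_analyzer(second_transcription: str, first: Turn) -> bool:
    """Hypothetical stand-in analysis: treat the second utterance as a directed
    query if it shares at least one word with the first transcription or response."""
    context = set(first.transcription.lower().split())
    if first.response:
        context |= set(first.response.lower().split())
    return bool(context & set(second_transcription.lower().split()))


def route_second_utterance(
    first: Turn,
    second_transcription: str,
    is_directed: Callable[[str, Turn], bool],
    send_query: Callable[[str], None],
) -> bool:
    """Forward the second transcription as a second query, or bypass routing.

    Returns True if the transcription was forwarded, False if it was bypassed.
    """
    if is_directed(second_transcription, first):
        send_query(second_transcription)
        return True
    return False


# Usage: collect forwarded queries in a list instead of calling a real system.
sent: List[str] = []
first = Turn("how tall is barack obama", "barack obama is 6 feet 1 inch tall")
route_second_utterance(first, "how old is barack obama", toy_overlap_analyzer, sent.append)
route_second_utterance(first, "are you hungry", toy_overlap_analyzer, sent.append)
print(sent)  # ['how old is barack obama']
```

Passing the analysis in as a callable keeps the routing decision separate from the classifier, so the toy heuristic could be swapped for a learned model without touching the routing code.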

23.

Granted invention patent
Unified endpointer using multitask and multidomain learning (In force)

Publication number: US11676625B2

Publication date: 2023-06-13

Application number: US17152918

Application date: 2021-01-20

Applicant: Google LLC

Inventor: Shuo-Yiin Chang , Bo Li , Gabor Simko , Maria Carolina Parada San Martin , Sean Matthew Shannon

IPC: G10L15/16 , G10L25/78 , G06N3/08 , G06N20/20 , G06N5/046 , G06F18/214 , G06N3/045

CPC classification number: G10L25/78 , G06F18/214 , G06N3/045 , G06N3/08 , G06N5/046 , G06N20/20 , G10L15/16

Abstract: A method for training an endpointer model includes obtaining short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
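The two-loss training objective in this abstract can be illustrated with a short sketch. It assumes per-frame logits from the VAD and EOQ classifier heads are already available, and it combines the two cross-entropy losses as a weighted sum; the function names and the `eoq_weight` parameter are hypothetical, and the abstract does not say how the two losses are actually combined.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_cross_entropy(
    frame_logits: Sequence[Sequence[float]],
    reference_labels: Sequence[int],
) -> float:
    """Mean per-frame cross-entropy between predicted and reference labels."""
    if len(frame_logits) != len(reference_labels):
        raise ValueError("logits and labels must cover the same frames")
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(frame_logits, reference_labels)
    ]
    return sum(losses) / len(losses)


def endpointer_loss(
    vad_logits: Sequence[Sequence[float]],
    vad_labels: Sequence[int],
    eoq_logits: Sequence[Sequence[float]],
    eoq_labels: Sequence[int],
    eoq_weight: float = 1.0,  # assumption: weighted sum of the two task losses
) -> float:
    """Combine the VAD loss and the EOQ loss into one training objective."""
    vad_loss = sequence_cross_entropy(vad_logits, vad_labels)
    eoq_loss = sequence_cross_entropy(eoq_logits, eoq_labels)
    return vad_loss + eoq_weight * eoq_loss


# Usage: two frames, two classes per task (e.g. speech / non-speech).
vad_logits = [[2.0, -1.0], [-0.5, 1.5]]
eoq_logits = [[1.0, 0.0], [0.0, 3.0]]
print(endpointer_loss(vad_logits, [0, 1], eoq_logits, [0, 1]))
```

In a real system both heads would sit on top of the shared network, so gradients of this combined loss would update the shared hidden representations for both tasks.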

24.

Granted invention patent
Detecting conversations with computing devices (In force)

Publication number: US11676582B2

Publication date: 2023-06-13

Application number: US17117621

Application date: 2020-12-10

Applicant: Google LLC

Inventor: Marcin Nowak-Przygodzki , Nathan David Howard , Gabor Simko , Andrei Giurgiu , Behshad Behzadi

IPC: G10L15/18 , G10L15/07 , G10L25/51 , G06F16/9032 , G10L15/08 , G10L15/22

CPC classification number: G10L15/1815 , G06F16/90332 , G10L15/07 , G10L15/08 , G10L15/22 , G10L25/51 , G10L2015/088 , G10L2015/223 , G10L2015/227

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring the data routing component to provide the second transcription of the second utterance to the query processing system as a second query or bypass routing the second transcription.

25.

Granted invention patent
Utterance classifier (In force)

Publication number: US11361768B2

Publication date: 2022-06-14

Application number: US16935112

Application date: 2020-07-21

Applicant: Google LLC

Inventor: Nathan David Howard , Gabor Simko , Maria Carolina Parada San Martin , Ramkarthik Kalyanasundaram , Guru Prakash Arumugam , Srinivas Vasudevan

IPC: G10L15/08 , G10L15/22 , G06F3/16 , G10L15/16 , G10L15/18 , G10L15/30 , G10L17/00

Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.

26.

Invention application
Unified Endpointer Using Multitask and Multidomain Learning (In force)

Publication number: US20210142174A1

Publication date: 2021-05-13

Application number: US17152918

Application date: 2021-01-20

Applicant: Google LLC

Inventor: Shuo-Yiin Chang , Bo Li , Gabor Simko , Maria Carolina Parada San Martin , Sean Matthew Shannon

IPC: G06N3/08 , G06N3/04 , G10L15/16 , G06N20/20 , G06K9/62 , G06N5/04

Abstract: A method for training an endpointer model includes obtaining short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.

27.

Invention application
DETECTING CONVERSATIONS WITH COMPUTING DEVICES (In force)

Publication number: US20210097982A1

Publication date: 2021-04-01

Application number: US17117621

Application date: 2020-12-10

Applicant: Google LLC

Inventor: Marcin Nowak-Przygodzki , Nathan David Howard , Gabor Simko , Andrei Giurgiu , Behshad Behzadi

IPC: G10L15/18 , G10L15/07 , G06F16/9032 , G10L25/51

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring the data routing component to provide the second transcription of the second utterance to the query processing system as a second query or bypass routing the second transcription.

28.

Invention application
UTTERANCE CLASSIFIER (Under examination, published)

Publication number: US20200349946A1

Publication date: 2020-11-05

Application number: US16935112

Application date: 2020-07-21

Applicant: Google LLC

Inventor: Nathan David Howard , Gabor Simko , Maria Carolina Parada San Martin , Ramkarthik Kalyanasundaram , Guru Prakash Arumugam , Srinivas Vasudevan

IPC: G10L15/22 , G06F3/16 , G10L15/16 , G10L15/18 , G10L15/30

Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.
