Patent search ap:("Google LLC") AND inv:"Shuo-yiin Chang" Page 3

21.

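Published invention application
Intended Query Detection using E2E Modeling for continued Conversation (Under examination, published)

Publication (announcement) number: US20230335117A1

Publication (announcement) date: 2023-10-19

Application number: US18186872

Application date: 2023-03-20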

Applicant: Google LLC

Inventor: Shuo-yiin Chang, Guru Prakash Arumugam, Zelin Wu, Tara N. Sainath, Bo LI, Qiao Liang, Adam Stambler, Shyam Upadhyay, Manaal Faruqui, Trevor Strohman

IPC: G10L15/16, G10L15/22, G10L15/06

CPC classification number: G10L15/16, G10L15/22, G10L15/063, G10L2015/223

Abstract: A method includes receiving, as input to a speech recognition model, audio data corresponding to a spoken utterance. The method also includes performing, using the speech recognition model, speech recognition on the audio data by, at each of a plurality of time steps, encoding, using an audio encoder, the audio data corresponding to the spoken utterance into a corresponding audio encoding, and decoding, using a speech recognition joint network, the corresponding audio encoding into a probability distribution over possible output labels. At each of the plurality of time steps, the method also includes determining, using an intended query (IQ) joint network configured to receive a label history representation associated with a sequence of non-blank symbols output by a final softmax layer, an intended query decision indicating whether or not the spoken utterance includes a query intended for a digital assistant.
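The per-time-step flow in this abstract can be sketched roughly as follows. This is a toy illustration only, not the patented model: every function body, the `BLANK` index, the greedy argmax decoding, and the bag-of-labels history representation are hypothetical stand-ins chosen so the control flow is runnable.

```python
import math

BLANK = 0  # assumed index of the blank label (hypothetical)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def audio_encoder(frame):
    # Toy stand-in for the audio encoder: scale each feature.
    return [0.5 * x for x in frame]

def asr_joint_network(encoding):
    # Toy stand-in: treat the encoding as logits over the output labels.
    return softmax(encoding)

def label_history_representation(history, dim):
    # Toy stand-in: count how often each non-blank label has been emitted.
    rep = [0.0] * dim
    for label in history:
        rep[label % dim] += 1.0
    return rep

def iq_joint_network(history_rep):
    # Toy stand-in: sigmoid score that crosses 0.5 once two labels were emitted.
    score = sum(history_rep) - 1.5
    return 1.0 / (1.0 + math.exp(-score)) > 0.5

def recognize(frames):
    history = []    # non-blank symbols emitted so far
    decisions = []  # one intended-query decision per time step
    for frame in frames:
        encoding = audio_encoder(frame)
        probs = asr_joint_network(encoding)
        label = max(range(len(probs)), key=probs.__getitem__)  # greedy pick (assumption)
        if label != BLANK:
            history.append(label)
        rep = label_history_representation(history, len(probs))
        decisions.append(iq_joint_network(rep))
    return history, decisions
```

Calling `recognize` on a list of feature frames returns the emitted non-blank labels and one boolean intended-query decision per frame.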

22.

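Granted invention patent
Unified endpointer using multitask and multidomain learning (In force)

Publication (announcement) number: US10929754B2

Publication (announcement) date: 2021-02-23

Application number: US16711172

Application date: 2019-12-11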

Applicant: Google LLC

Inventor: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon

IPC: G10L15/16, G06N3/08, G06N3/04, G06N20/20, G06K9/62, G06N5/04

Abstract: A method for training an endpointer model includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
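The two-headed loss described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the shared network and classifier heads are hypothetical toy layers, each task is assumed to have two classes, and combining the losses as a weighted sum (`eoq_weight`) is an assumption, since the abstract only says training is "based on" both losses.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(frame_logits, labels):
    # Mean per-frame cross-entropy between predicted logits and reference labels.
    losses = [-math.log(softmax(l)[y]) for l, y in zip(frame_logits, labels)]
    return sum(losses) / len(losses)

def shared_network(frames, weight=1.0):
    # Hypothetical stand-in for the shared neural network: elementwise tanh.
    return [[math.tanh(weight * x) for x in frame] for frame in frames]

def classifier(hidden, weights):
    # Hypothetical linear head: one logit per class for every frame.
    return [[sum(w * h for w, h in zip(row, frame)) for row in weights]
            for frame in hidden]

def endpointer_loss(frames, vad_labels, eoq_labels, vad_w, eoq_w, eoq_weight=1.0):
    hidden = shared_network(frames)  # shared hidden representations
    vad_loss = cross_entropy(classifier(hidden, vad_w), vad_labels)
    eoq_loss = cross_entropy(classifier(hidden, eoq_w), eoq_labels)
    # Weighted sum of the two task losses (assumption, see lead-in).
    return vad_loss + eoq_weight * eoq_loss, vad_loss, eoq_loss
```

With all-zero head weights both classifiers predict a uniform distribution, so each loss equals ln 2, which gives a quick sanity check before any training step is added.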
