-
公开(公告)号:US20230335117A1
公开(公告)日:2023-10-19
申请号:US18186872
申请日:2023-03-20
Applicant: Google LLC
Inventor: Shuo-yiin Chang , Guru Prakash Arumugam , Zelin Wu , Tara N. Sainath , Bo LI , Qiao Liang , Adam Stambler , Shyam Upadhyay , Manaal Faruqui , Trevor Strohman
CPC classification number: G10L15/16 , G10L15/22 , G10L15/063 , G10L2015/223
Abstract: A method includes receiving, as input to a speech recognition model, audio data corresponding to a spoken utterance. The method also includes performing, using the speech recognition model, speech recognition on the audio data by, at each of a plurality of time steps, encoding, using an audio encoder, the audio data corresponding to the spoken utterance into a corresponding audio encoding, and decoding, using a speech recognition joint network, the corresponding audio encoding into a probability distribution over possible output labels. At each of the plurality of time steps, the method also includes determining, using an intended query (IQ) joint network configured to receive a label history representation associated with a sequence of non-blank symbols output by a final softmax layer, an intended query decision indicating whether or not the spoken utterance includes a query intended for a digital assistant.