-
Publication number: US12020703B2
Publication date: 2024-06-25
Application number: US17532819
Filing date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann, Trevor Strohman, Jonathan Bloom, Johan Schalkwyk, Joseph Smarr
CPC classification number: G10L15/22, G06N20/00, G08B5/36, G10L15/18, G10L2015/088, G10L2015/223
Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
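Illustrative sketch (not part of the patent record): the pause-versus-completion decision described in the abstract could be approximated as below. The abstract does not say which audio-based characteristics are used or how they are combined with the NLU output, so every feature name, threshold, and the "Mmhmm" response here is a hypothetical assumption chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AudioCharacteristics:
    # Hypothetical features; the abstract only says "audio-based characteristics".
    trailing_silence_s: float  # seconds of silence since the last detected speech
    ended_on_filler: bool      # utterance portion ended on a filler such as "uhm"


@dataclass
class NluOutput:
    # Hypothetical shape of the NLU output.
    intent: str
    missing_slots: Tuple[str, ...]  # required parameters not yet provided


def classify_turn(audio: AudioCharacteristics, nlu: NluOutput,
                  pause_threshold_s: float = 0.6,
                  timeout_s: float = 3.0) -> str:
    """Return 'speaking', 'paused', or 'completed' (thresholds are assumptions)."""
    if audio.trailing_silence_s < pause_threshold_s:
        return "speaking"
    # The utterance looks unfinished if it ended on a filler or the intent
    # still lacks required parameters.
    incomplete = audio.ended_on_filler or bool(nlu.missing_slots)
    if incomplete and audio.trailing_silence_s < timeout_s:
        return "paused"
    return "completed"


def respond(audio: AudioCharacteristics, nlu: NluOutput) -> Optional[str]:
    """Pick an output: a back-channel cue on pause, fulfillment on completion."""
    state = classify_turn(audio, nlu)
    if state == "paused":
        return "Mmhmm"  # stands in for "natural conversation output"
    if state == "completed":
        return f"fulfill:{nlu.intent}"
    return None  # user is still speaking; stay silent
```

For example, under these assumptions `respond(AudioCharacteristics(1.0, False), NluOutput("call", ("contact",)))` yields the back-channel cue, because the intent still lacks a required parameter after one second of silence.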
-
Publication number: US20240312460A1
Publication date: 2024-09-19
Application number: US18674479
Filing date: 2024-05-24
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann, Trevor Strohman, Jonathan Bloom, Johan Schalkwyk, Joseph Smarr
CPC classification number: G10L15/22, G06N20/00, G08B5/36, G10L15/18, G10L2015/088, G10L2015/223
Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
-
Publication number: US20230053341A1
Publication date: 2023-02-23
Application number: US17532819
Filing date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann, Trevor Strohman, Jonathan Bloom, Johan Schalkwyk, Joseph Smarr
Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
-