Publication No.: US20250118293A1
Publication Date: 2025-04-10
Application No.: US18891615
Filing Date: 2024-09-20
Applicant: Google LLC
Inventors: Mingqing Chen, Rajiv Mathews, Andrew Hard, Swaroop Ramaswamy, Kilol Gupta
Abstract: A method includes receiving a conversational training dataset that includes a plurality of conversational training samples. Each training sample is associated with a corresponding conversation and includes: corresponding audio data characterizing a corresponding current utterance spoken by a user during a current turn in the corresponding conversation; a corresponding context for the corresponding current utterance, including a transcript of a previous turn in the corresponding conversation that precedes the current turn; a corresponding ground-truth transcription of the corresponding current utterance; and a chain-of-thought (CoT) annotation representing a corresponding logical relationship between the corresponding current utterance and the previous turn. The method also includes, for each corresponding conversational training sample in the conversational training dataset, training a speech model on the corresponding conversational training sample to teach the speech model to predict the corresponding logical relationship from the corresponding audio data and the corresponding context.
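
The four per-sample fields and the per-sample training loop described in the abstract can be sketched as a small data structure. This is only an illustrative sketch and not the applicant's implementation: the class name `ConversationalTrainingSample`, the function `train_on_dataset`, the field names, and the `train_step` callback are all hypothetical, and the abstract does not specify the model architecture, the loss, or how the annotation is encoded.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class ConversationalTrainingSample:
    """One training sample, with the four fields named in the abstract.

    All field names are hypothetical; the abstract only names the concepts.
    """
    audio: Sequence[float]            # audio data for the current utterance
    previous_turn_transcript: str     # context: transcript of the preceding turn
    ground_truth_transcription: str   # reference transcription of the current utterance
    cot_annotation: str               # logical relationship to the previous turn


# Model inputs are the audio plus the context (hypothetical pairing).
ModelInputs = Tuple[Sequence[float], str]


def train_on_dataset(
    dataset: Iterable[ConversationalTrainingSample],
    train_step: Callable[[ModelInputs, str], float],
) -> List[float]:
    """Run one training step per sample and collect the returned losses.

    `train_step` is a hypothetical stand-in for the real model update: it
    receives (audio, context) as inputs and the CoT annotation as the
    prediction target, and returns a loss value.
    """
    losses = []
    for sample in dataset:
        inputs = (sample.audio, sample.previous_turn_transcript)
        losses.append(train_step(inputs, sample.cot_annotation))
    return losses
```

In this sketch the ground-truth transcription is stored on the sample but not passed to `train_step`, because the abstract only states that the model learns to predict the logical relationship from the audio data and the context; a real system would presumably also use the transcription as a training target.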