Publication number: US20240203404A1
Publication date: 2024-06-20
Application number: US18081569
Filing date: 2022-12-14
Applicant: GOOGLE LLC
Inventor: Nir Shabat, Volodymyr Polosukhin, Shlomo Fruchter, Golan Pundak, Roy Atsmon
IPC: G10L15/18, G10L13/027, G10L15/26
CPC classification number: G10L15/1815, G10L13/027, G10L15/26
Abstract: In various implementations, a method implemented by one or more processors of a computing device can comprise receiving audio data that captures a spoken utterance of a user; processing the audio data using an automatic speech recognition (ASR) model to generate textual data corresponding to the spoken utterance; generating a semantic representation corresponding to the spoken utterance of the user based on applying both the audio data and the textual data as input across a large language model (LLM); and causing the semantic representation corresponding to the spoken utterance of the user to be utilized in fulfilling the spoken utterance.
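The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the claimed implementation: every name here (`SemanticRepresentation`, `handle_utterance`, and the `toy_*` stand-ins) is hypothetical, the intent/slot format is an assumption because the abstract does not specify what the semantic representation looks like, and the toy functions only imitate real ASR and LLM components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SemanticRepresentation:
    """Hypothetical structured output; the abstract does not define a format."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


# Assumed interfaces for the three components named in the abstract.
AsrModel = Callable[[bytes], str]
Llm = Callable[[bytes, str], SemanticRepresentation]
Fulfiller = Callable[[SemanticRepresentation], str]


def handle_utterance(audio: bytes, asr: AsrModel, llm: Llm, fulfill: Fulfiller) -> str:
    # Step 1: the ASR model turns the audio data into textual data.
    text = asr(audio)
    # Step 2: the LLM receives BOTH the audio data and the textual data.
    semantic = llm(audio, text)
    # Step 3: the semantic representation is used to fulfill the utterance.
    return fulfill(semantic)


# Toy stand-ins so the sketch runs; real models would replace these.
def toy_asr(audio: bytes) -> str:
    return "turn on the kitchen lights"


def toy_llm(audio: bytes, text: str) -> SemanticRepresentation:
    # A real model would condition on the audio as well; this toy only
    # checks that audio was supplied and parses the transcript.
    if not audio:
        raise ValueError("audio data is required alongside the text")
    device = text.rsplit("the ", 1)[-1]
    return SemanticRepresentation(intent="turn_on", slots={"device": device})


def toy_fulfill(semantic: SemanticRepresentation) -> str:
    return f"{semantic.intent}({semantic.slots['device']})"


result = handle_utterance(b"\x00\x01", toy_asr, toy_llm, toy_fulfill)
print(result)  # turn_on(kitchen lights)
```

The point of the sketch is the second step: the LLM call takes the raw audio in addition to the transcript, rather than the transcript alone.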