-
Publication number: US20250095637A1
Publication date: 2025-03-20
Application number: US18886581
Application date: 2024-09-16
Applicant: Google LLC
Inventor: Ke Hu, Tara N. Sainath, Bo Li, Yu Zhang, Yong Cheng, Tao Wang, Yujing Zhang, Frederick Liu
Abstract: A method includes receiving a textual prompt in a first language and obtaining a fine-tuned prompt embedding configured to guide a large language model (LLM) to generate text in a target language from textual prompts in the first language. The method also includes processing, using the LLM, the textual prompt conditioned on the fine-tuned prompt embedding to generate output text in the target language and concatenating the textual prompt and the generated output text to provide an unspoken textual utterance. The method also includes training a multilingual automatic speech recognition (ASR) model to learn how to recognize speech in the target language by injecting the unspoken textual utterance into a text encoder associated with the multilingual ASR model.
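The pipeline described in this abstract (a frozen LLM steered by a fine-tuned soft-prompt embedding, whose output is concatenated with the prompt to form an unspoken textual utterance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the GPT-2 checkpoint, the eight virtual prompt tokens, and the randomly initialized prompt_embedding (which in the described method would be the fine-tuned embedding) are all stand-ins, and the injection into the ASR model's text encoder is only indicated in a comment.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in frozen LLM; the abstract does not name a specific model.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
for p in lm.parameters():
    p.requires_grad = False

# Fine-tuned prompt embedding: a small matrix of "virtual token" vectors.
# Randomly initialized here; in the described method it would come from
# prompt tuning toward the target language.
num_virtual_tokens = 8
prompt_embedding = torch.nn.Parameter(
    0.02 * torch.randn(num_virtual_tokens, lm.config.n_embd)
)

def generate_output_text(textual_prompt: str, max_new_tokens: int = 32) -> str:
    """Generate target-language text from a first-language prompt,
    conditioning the frozen LLM on the soft prompt embedding."""
    ids = tok(textual_prompt, return_tensors="pt").input_ids
    token_embeds = lm.get_input_embeddings()(ids)             # (1, T, d)
    inputs_embeds = torch.cat(
        [prompt_embedding.unsqueeze(0), token_embeds], dim=1  # prepend soft prompt
    )
    out = lm.generate(
        inputs_embeds=inputs_embeds,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        pad_token_id=tok.eos_token_id,
    )
    return tok.decode(out[0], skip_special_tokens=True)

textual_prompt = "Translate to Spanish: the weather is nice today."  # hypothetical example
output_text = generate_output_text(textual_prompt)

# Concatenate the prompt and the generated text to form the unspoken textual utterance.
unspoken_textual_utterance = textual_prompt + " " + output_text
# Training the multilingual ASR model would then inject this utterance into
# its text encoder; that step is outside this sketch.
```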
-
Publication number: US20230112862A1
Publication date: 2023-04-13
Application number: US17960380
Application date: 2022-10-05
Applicant: Google LLC
Inventor: Venkata S. Bhojanapalli, Andreas Veit, Ayan Chakrabarti, Frederick Liu, Himanshu Jain, Michal Lukasik, Sanjiv Kumar, Yin-Wen Chang
IPC: G06N3/04
Abstract: Provided are systems and methods that improve the computational efficiency of Transformers or other attention-based neural networks or machine learning models by re-using a number of attention scores between layers and/or heads of the model. To reduce the computational cost of self-attention-based models while achieving comparable or even superior results, example aspects of the present disclosure propose a novel architecture that reuses attention scores computed in one layer in one or multiple subsequent layers.
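A minimal sketch of the attention-score reuse idea follows, written in plain PyTorch. It is an assumption-laden illustration rather than the disclosed architecture: the module name ReuseAttention, the dimensions, and the choice to skip the Q/K projections and the softmax whenever reused weights are supplied are all illustrative.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReuseAttention(nn.Module):
    """Self-attention layer that can reuse attention weights from an earlier layer."""

    def __init__(self, d_model: int = 256, num_heads: int = 4):
        super().__init__()
        self.num_heads, self.d_head = num_heads, d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def _split_heads(self, t, B, T):
        return t.view(B, T, self.num_heads, self.d_head).transpose(1, 2)

    def forward(self, x, reused_attn=None):
        B, T, _ = x.shape
        v = self._split_heads(self.v_proj(x), B, T)
        if reused_attn is None:
            # Compute attention probabilities the standard way.
            q = self._split_heads(self.q_proj(x), B, T)
            k = self._split_heads(self.k_proj(x), B, T)
            attn = F.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_head), dim=-1)
        else:
            # Reuse the attention probabilities computed in an earlier layer,
            # skipping this layer's Q/K projections and softmax entirely.
            attn = reused_attn
        y = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(y), attn

# Layer 0 computes attention weights; layer 1 reuses them.
x = torch.randn(2, 10, 256)
layer0, layer1 = ReuseAttention(), ReuseAttention()
h, attn = layer0(x)
h, _ = layer1(h, reused_attn=attn)
```

Returning the attention tensor alongside the hidden states lets a Transformer stack decide, layer by layer, whether to recompute scores or pass earlier ones forward, which is where the claimed compute savings would come from.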
-