Patent search ap:("Google LLC") AND inv:"Brian Farris" Page 1

1.

Invention publication
Modular Training for Flexible Attention Based End-to-End ASR [Under examination, published]

Publication No.: US20240185839A1

Publication date: 2024-06-06

Application No.: US18526148

Filing date: 2023-12-01

Applicant: Google LLC

Inventor: Kartik Audhkhasi, Bhuvana Ramabhadran, Brian Farris

IPC: G10L15/06

CPC classification number: G10L15/063, G10L2015/0635

Abstract: A method for training a modular neural network model includes training only a backbone model to provide a first model configuration of the modular neural network model. The first model configuration includes only the trained backbone model. The method also includes adding an intrinsic sub-model to the trained backbone model. During a fine-tuning training stage, the method includes freezing parameters of the trained backbone model and fine-tuning parameters of the intrinsic sub-model added to the trained backbone model while the parameters of the trained backbone model are frozen to provide a second model configuration that includes the backbone model initially trained during the initial training stage and the intrinsic sub-model having the parameters fine-tuned during the fine-tuning stage.
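The two-stage procedure in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the patent: the backbone is a toy linear model, the "intrinsic sub-model" is a single quadratic residual term, and the names `predict`, `grads`, `train`, and `mse` are hypothetical. The point is only the order of operations: train the backbone alone, add the sub-model, then update the sub-model while the backbone parameters stay frozen.

```python
# Toy stand-ins (assumptions, not the patent's models): the backbone is a
# linear model and the added sub-model is one quadratic residual term.

def predict(params, x):
    y = params["backbone.w"] * x + params["backbone.b"]
    if "sub.v" in params:  # second configuration: backbone plus sub-model
        y += params["sub.v"] * x * x
    return y

def grads(params, data):
    """Mean-squared-error gradients for every parameter present."""
    g = {name: 0.0 for name in params}
    n = len(data)
    for x, target in data:
        err = 2.0 * (predict(params, x) - target) / n
        g["backbone.w"] += err * x
        g["backbone.b"] += err
        if "sub.v" in params:
            g["sub.v"] += err * x * x
    return g

def train(params, data, trainable, steps=2000, lr=0.05):
    """Gradient descent that updates only the names listed in `trainable`."""
    for _ in range(steps):
        g = grads(params, data)
        for name in trainable:  # parameters not listed here stay frozen
            params[name] -= lr * g[name]
    return params

def mse(params, data):
    return sum((predict(params, x) - t) ** 2 for x, t in data) / len(data)

# Synthetic targets with a quadratic component the linear backbone cannot fit.
xs = [k / 4.0 for k in range(-4, 5)]
data = [(x, 1.0 + 2.0 * x + 0.5 * x * x) for x in xs]

# Initial training stage: only the backbone exists and is trained.
params = {"backbone.w": 0.0, "backbone.b": 0.0}
train(params, data, trainable=["backbone.w", "backbone.b"])
first_config = dict(params)
loss_first = mse(params, data)

# Fine-tuning stage: add the sub-model, freeze the backbone, tune the sub-model.
params["sub.v"] = 0.0
train(params, data, trainable=["sub.v"])
second_config = dict(params)
loss_second = mse(params, data)
```

Because the backbone values are identical in `first_config` and `second_config`, either configuration can be served from the same trained backbone, which is the modularity the abstract describes.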

2.

Invention application
Self-Adaptive Distillation [Granted]

Publication No.: US20220309340A1

Publication date: 2022-09-29

Application No.: US17544570

Filing date: 2021-12-07

Applicant: Google LLC

Inventor: Isabel Leal, Neeraj Gaur, Parisa Haghani, Brian Farris, Bhuvana Ramabhadran, Manasa Prasad, Pedro J. Moreno Mengibar, Yun Zhu

IPC: G06N3/08, G06N3/04, G10L15/06

Abstract: A method for distilling one or more trained teacher automatic speech recognition (ASR) models into a multilingual student model includes receiving a plurality of teacher training examples and a plurality of student training examples. The method also includes training one or more teacher ASR models using the plurality of teacher training examples. Each teacher ASR model is configured to output a respective textual representation of a respective audio input. The method further includes generating a multilingual student ASR model by training the multilingual student ASR model using the plurality of student training examples and distilling the trained one or more teacher ASR models into the multilingual student ASR model using a tunable distillation loss weight. Each student ASR model is configured to receive an audio input and output a corresponding textual representation of the received audio input.
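A distillation loss weight of the kind this abstract mentions can be sketched as follows. The abstract does not give a formula, so the form used here is an assumption: a supervised cross-entropy term plus the weight times the mean KL divergence from each teacher's output distribution to the student's. The function names and the toy logits are hypothetical, and the weight is a fixed scalar here; how the patent tunes it is not stated in the abstract.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def student_loss(student_logits, teacher_logits_list, label, distill_weight):
    """Assumed form: supervised loss + distill_weight * mean teacher-student KL."""
    student_probs = softmax(student_logits)
    supervised = cross_entropy(student_probs, label)
    distill = sum(
        kl_divergence(softmax(t), student_probs) for t in teacher_logits_list
    ) / len(teacher_logits_list)
    return supervised + distill_weight * distill

# Toy logits over three output classes (illustrative values only).
student = [1.0, 0.5, -0.5]
teachers = [[2.0, 0.0, -1.0], [1.5, 0.2, -0.7]]

loss_plain = student_loss(student, teachers, label=0, distill_weight=0.0)
loss_distilled = student_loss(student, teachers, label=0, distill_weight=0.5)
```

Setting `distill_weight` to zero recovers plain supervised training, and larger values pull the student's output distribution toward the teachers' distributions.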
