Patent search ap:("Google LLC") AND inv:"Tongzhou Chen" Page 2

11.

Granted invention patent
Mixture model attention for flexible streaming and non-streaming automatic speech recognition (in force)

Publication number: US12014729B2

Publication date: 2024-06-18

Application number: US17644344

Filing date: 2021-12-15

Applicant: Google LLC

Inventor: Kartik Audhkhasi, Bhuvana Ramabhadran, Tongzhou Chen, Pedro J. Moreno Mengibar

IPC: G10L15/16, G06F1/03, G06N3/04, G06N3/0455, G10L19/16

CPC classification number: G10L15/16, G06F1/03, G06N3/04, G06N3/0455, G10L19/167

Abstract: A method for an automatic speech recognition (ASR) model for unifying streaming and non-streaming speech recognition includes receiving a sequence of acoustic frames. The method includes generating, using an audio encoder of the ASR model, a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The method further includes generating, using a joint encoder of the ASR model, a probability distribution over possible speech recognition hypotheses at the corresponding time step based on the higher order feature representation generated by the audio encoder at the corresponding time step. The audio encoder comprises a neural network that applies mixture model (MiMo) attention to compute an attention probability distribution function (PDF) using a set of mixture components of softmaxes over a context window.
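
Illustrative sketch (not part of the patent record): the abstract describes the attention PDF as a mixture of softmaxes over a context window. The NumPy code below shows one way such a mixture could be computed, under assumptions that are not taken from the patent text: each mixture component is a softmax restricted by a boolean mask over the context, the mixture weights are non-negative and sum to 1, and a single attention head is used. The names `masked_softmax`, `mimo_attention`, `masks`, and `weights` are hypothetical.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Row-wise softmax restricted to positions where mask is True.

    Assumes every row of `mask` has at least one True entry.
    """
    masked = np.where(mask, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(masked)                                   # exp(-inf) == 0
    return exp / exp.sum(axis=-1, keepdims=True)

def mimo_attention(q, k, v, masks, weights):
    """Mixture of masked softmaxes (assumed form of MiMo attention).

    q, k, v : (T, d) arrays for a single head.
    masks   : list of (T, T) boolean arrays, one per mixture component.
    weights : mixture weights, non-negative and summing to 1.
    """
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0) and np.isclose(weights.sum(), 1.0)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # The attention PDF is the weighted sum of the component softmaxes.
    pdf = sum(w * masked_softmax(scores, m) for w, m in zip(weights, masks))
    return pdf @ v, pdf

# Hypothetical setup: one causal (streaming-style) component and one
# full-context (non-streaming-style) component.
T, d = 5, 4
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
causal = np.tril(np.ones((T, T), dtype=bool))
full = np.ones((T, T), dtype=bool)

out, pdf = mimo_attention(q, k, v, [causal, full], [0.5, 0.5])
stream_out, stream_pdf = mimo_attention(q, k, v, [causal, full], [1.0, 0.0])
```

In this sketch, putting all mixture weight on the causal component removes any dependence on future frames, which is one plausible reading of how a single model could serve both streaming and non-streaming use.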
