
11.

Invention application
Transformer Transducer: One Model Unifying Streaming And Non-Streaming Speech Recognition (legal status: in force)

Publication No.: US20220108689A1

Publication date: 2022-04-07

Application No.: US17210465

Filing date: 2021-03-23

Applicant: Google LLC

Inventor: Anshuman Tripathi, Hasim Sak, Han Lu, Qian Zhang, Jaeyoung Kim

IPC: G10L15/16, G10L15/06, G10L15/22, G10L15/30, G10L15/197, G06N3/08, G06N3/04

Abstract: A transformer-transducer model for unifying streaming and non-streaming speech recognition includes an audio encoder, a label encoder, and a joint network. The audio encoder receives a sequence of acoustic frames and generates, at each of a plurality of time steps, a higher-order feature representation for a corresponding acoustic frame. The label encoder receives a sequence of non-blank symbols output by a final softmax layer and generates, at each of the plurality of time steps, a dense representation. The joint network receives the higher-order feature representation and the dense representation at each of the plurality of time steps and generates a probability distribution over possible speech recognition hypotheses. The audio encoder of the model further includes a neural network having an initial stack of transformer layers trained with zero look-ahead audio context, and a final stack of transformer layers trained with a variable look-ahead audio context.
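
Illustrative sketch (not taken from the patent text): the abstract distinguishes an initial stack of transformer layers with zero look-ahead from a final stack with variable look-ahead. One common way to restrict audio context in a transformer is a self-attention mask, and the Python below assumes that approach. The function name `attention_mask`, the parameters `left_context` and `right_context`, and the frame counts are hypothetical, chosen only for illustration.

```python
def attention_mask(num_frames, left_context, right_context):
    """Return mask[t][s] == True when frame t may attend to frame s.

    Assumption: context is limited by masking self-attention.
    left_context / right_context are hypothetical names for the number
    of past and future frames visible to each frame.
    """
    return [
        [t - left_context <= s <= t + right_context for s in range(num_frames)]
        for t in range(num_frames)
    ]


# Initial stack: zero look-ahead, so no future frames are visible.
streaming_mask = attention_mask(num_frames=5, left_context=2, right_context=0)

# Final stack: some look-ahead is allowed; 2 frames is an arbitrary example
# of a "variable" value that could differ between training examples.
lookahead_mask = attention_mask(num_frames=5, left_context=2, right_context=2)

for name, mask in (("zero look-ahead", streaming_mask),
                   ("look-ahead = 2", lookahead_mask)):
    print(name)
    for row in mask:
        print("".join("x" if visible else "." for visible in row))
```

Under this assumption, setting `right_context` to 0 in the final stack would give a fully streaming encoder, while a large value would approximate non-streaming recognition with the same weights.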
