Patent search ap:("Google LLC") AND inv:"Qian Chen" Page 1

1.

Invention application
Automatic Speech Recognition Accuracy With Multimodal Embeddings Search (Status: Active)

Publication No.: US20250006217A1

Publication date: 2025-01-02

Application No.: US18344007

Filing date: 2023-06-29

Applicant: Google LLC

Inventor: Christopher Li, Kyle Scott Kastner, Yuan Wang, Zhehuai Chen, Andrew Maxwell Rosenberg, Heng Su, Qian Chen, Leonid Aleksandrovich Velikovich, Patrick Maxim Rondon, Diamantino Antonio Caseiro, Zelin Wu

IPC: G10L25/30, G10L15/26

Abstract: A method includes receiving training data that includes a set of transcribed speech utterances where each respective transcribed speech utterance is paired with a corresponding transcription. For each respective transcribed speech utterance, the method includes generating an encoded audio representation and an encoded textual representation, generating a higher order audio feature representation for a corresponding encoded audio representation, generating a higher order textual feature representation for a corresponding encoded textual representation, and determining a loss for the respective transcribed speech utterance based on the higher order audio feature representation and the higher order textual feature representation. The method also includes training a speech encoder and a text encoder of a correction model based on the loss determined for each transcribed speech utterance of the set of transcribed speech utterances.
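Illustrative sketch: the abstract describes a per-utterance loss computed from a higher order audio feature representation and a higher order textual feature representation, but it does not name the loss function. The Python below is a minimal sketch that assumes cosine distance between paired feature vectors, averaged over the training set. The function names `cosine_distance` and `batch_loss` and the choice of cosine distance are hypothetical and are not taken from the patent; the encoders themselves are left out.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors.

    Assumption: cosine distance stands in for the unspecified loss.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def batch_loss(audio_features, text_features):
    """Average the per-utterance loss over paired feature vectors.

    audio_features[i] and text_features[i] are assumed to come from the
    same transcribed utterance (hypothetical encoder outputs).
    """
    if len(audio_features) != len(text_features):
        raise ValueError("each utterance needs one audio and one text vector")
    if not audio_features:
        raise ValueError("at least one utterance is required")
    losses = [cosine_distance(a, t) for a, t in zip(audio_features, text_features)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two toy utterances: the first pair is aligned, the second is orthogonal.
    audio = [[1.0, 0.0], [1.0, 0.0]]
    text = [[2.0, 0.0], [0.0, 3.0]]
    print(batch_loss(audio, text))
```

In a training loop, a value like this would be minimized with respect to the parameters of both encoders, so that audio and text from the same utterance map to nearby points in a shared embedding space.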
