Patent search ap:("Google LLC") AND inv:"Rami Magdi Fahmi Botros" Page 1

1.

Invention publication
EXPORTING MODULAR ENCODER FEATURES FOR STREAMING AND DELIBERATION ASR (Status: under examination, published)

Publication (announcement) number: US20240144917A1

Publication (announcement) date: 2024-05-02

Application number: US18494763

Application date: 2023-10-25

Applicant: Google LLC

Inventor: Rami Magdi Fahmi Botros, Rohit Prakash Prabhavalkar, Johan Schalkwyk, Tara N. Sainath, Ciprian Ioan Chelba, Francoise Beaufays

IPC: G10L15/16

CPC classification number: G10L15/16

Abstract: A method includes obtaining a base encoder from a pre-trained model, and receiving training data comprising a sequence of acoustic frames characterizing an utterance paired with a ground-truth transcription of the utterance. At each of a plurality of output steps, the method includes: generating, by the base encoder, a first encoded representation for a corresponding acoustic frame; generating, by an exporter network configured to receive a continuous sequence of first encoded representations generated by the base encoder, a second encoded representation for a corresponding acoustic frame; generating, by an exporter decoder, a probability distribution over possible logits; and determining an exporter decoder loss based on the probability distribution over possible logits generated by the exporter decoder at the corresponding output step and the ground-truth transcription. The method also includes training the exporter network based on the exporter decoder losses while parameters of the base encoder are frozen.
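Illustrative sketch (not part of the patent record): the training flow described in the abstract can be approximated in a few lines of plain Python. Everything here is an assumption made for illustration: the function name `train_exporter`, the tiny linear layers, the random weights standing in for a pre-trained base encoder, and the per-frame cross-entropy loss against frame-aligned labels. The abstract does not specify the loss, the layer types, or how the exporter network consumes the sequence of first encoded representations; this sketch processes each frame independently for simplicity. What it does show is the key constraint: only the exporter network and exporter decoder are updated, while the base encoder parameters stay frozen.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def train_exporter(frames, labels, vocab, d_base=3, d_exp=3,
                   steps=300, lr=0.1, seed=0):
    """Hypothetical helper: train exporter + decoder, base encoder frozen.

    frames: list of acoustic feature vectors (one per output step).
    labels: one token id per frame (assumed alignment, not from the patent).
    Returns (per-step mean losses, True if the base encoder is unchanged).
    """
    rng = random.Random(seed)
    base_W = rand_matrix(d_base, len(frames[0]), rng)  # stand-in for pre-trained base encoder
    exp_W = rand_matrix(d_exp, d_base, rng)            # trainable exporter network
    dec_W = rand_matrix(vocab, d_exp, rng)             # trainable exporter decoder
    base_snapshot = [row[:] for row in base_W]
    losses = []
    for _ in range(steps):
        g_exp = [[0.0] * d_base for _ in range(d_exp)]
        g_dec = [[0.0] * d_exp for _ in range(vocab)]
        total = 0.0
        for x, y in zip(frames, labels):
            h1 = [math.tanh(v) for v in matvec(base_W, x)]  # first encoded representation
            h2 = matvec(exp_W, h1)                          # second encoded representation
            p = softmax(matvec(dec_W, h2))                  # decoder output distribution
            total -= math.log(p[y])                         # exporter decoder loss (cross-entropy)
            # Backpropagate through the decoder and exporter only.
            dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            dh2 = [sum(dec_W[k][j] * dz[k] for k in range(vocab))
                   for j in range(d_exp)]
            for k in range(vocab):
                for j in range(d_exp):
                    g_dec[k][j] += dz[k] * h2[j]
            for j in range(d_exp):
                for i in range(d_base):
                    g_exp[j][i] += dh2[j] * h1[i]
        n = len(frames)
        # Gradient step on exporter and decoder; base_W is never touched.
        for k in range(vocab):
            for j in range(d_exp):
                dec_W[k][j] -= lr * g_dec[k][j] / n
        for j in range(d_exp):
            for i in range(d_base):
                exp_W[j][i] -= lr * g_exp[j][i] / n
        losses.append(total / n)
    return losses, base_W == base_snapshot


if __name__ == "__main__":
    # Toy data: four one-hot "acoustic frames" with made-up token labels.
    frames = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    labels = [0, 1, 2, 1]
    losses, frozen = train_exporter(frames, labels, vocab=3)
    print("first loss:", losses[0], "last loss:", losses[-1], "base frozen:", frozen)
```

In a real ASR system the decoder loss would more likely be a sequence-level objective and the layers would be far larger; the sketch only mirrors the division between frozen and trainable components.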
