Systems and Methods for Training Dual-Mode Machine-Learned Speech Recognition Models

    Publication number: US20230237993A1

    Publication date: 2023-07-27

    Application number: US18011571

    Application date: 2021-10-01

    Applicant: Google LLC

    CPC classification number: G10L15/16 G10L15/32 G10L15/22

    Abstract: Systems and methods of the present disclosure are directed to a computing system that includes one or more processors and a machine-learned multi-mode speech recognition model configured to operate in a streaming recognition mode or a contextual recognition mode. The computing system can perform operations including obtaining speech data and a ground truth label, and processing the speech data using the contextual recognition mode to obtain contextual prediction data. The operations can include evaluating a difference between the contextual prediction data and the ground truth label, and processing the speech data using the streaming recognition mode to obtain streaming prediction data. The operations can include evaluating a difference between the streaming prediction data and the ground truth label, as well as a difference between the contextual prediction data and the streaming prediction data. The operations can include adjusting parameters of the speech recognition model.
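    The training operations summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy per-frame classifier, the one-frame look-ahead that distinguishes the contextual mode, the use of cross-entropy for the label differences, the use of KL divergence for the difference between contextual and streaming predictions, and the finite-difference gradient update. The names `predict`, `dual_mode_loss`, and `train_step` are hypothetical.

```python
import math

NUM_CLASSES = 2  # assumption: a toy two-class, per-frame labelling task


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(params, frames, mode):
    """One set of shared parameters; only the visible context differs by mode."""
    weights, biases = params[:NUM_CLASSES], params[NUM_CLASSES:]
    probs = []
    for t in range(len(frames)):
        # Streaming mode sees frames up to t; contextual mode also sees frame t + 1.
        right = t + 1 if mode == "streaming" else t + 2
        window = frames[max(0, t - 1):right]
        feat = sum(window) / len(window)
        probs.append(softmax([w * feat + b for w, b in zip(weights, biases)]))
    return probs


def cross_entropy(probs, labels):
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def kl_divergence(teacher, student):
    total = 0.0
    for t_dist, s_dist in zip(teacher, student):
        total += sum(pt * math.log(pt / ps) for pt, ps in zip(t_dist, s_dist))
    return total / len(teacher)


def dual_mode_loss(params, frames, labels, teacher):
    contextual = predict(params, frames, "contextual")
    streaming = predict(params, frames, "streaming")
    # Difference between contextual predictions and the ground truth label.
    contextual_loss = cross_entropy(contextual, labels)
    # Difference between streaming predictions and the label, plus the
    # difference between the (frozen) contextual and streaming predictions.
    streaming_loss = cross_entropy(streaming, labels) + kl_divergence(teacher, streaming)
    return contextual_loss + streaming_loss


def train_step(params, frames, labels, lr=0.1, eps=1e-5):
    # Contextual predictions at the current parameters act as a fixed target.
    teacher = predict(params, frames, "contextual")
    base = dual_mode_loss(params, frames, labels, teacher)
    grads = []
    for i in range(len(params)):
        bumped = list(params)
        bumped[i] += eps
        # Forward finite difference, used here only to avoid an autodiff dependency.
        grads.append((dual_mode_loss(bumped, frames, labels, teacher) - base) / eps)
    return [p - lr * g for p, g in zip(params, grads)], base


# Hypothetical toy data: scalar "speech" frames and per-frame class labels.
frames = [0.1, 0.2, 0.9, 1.0, 0.8, 0.1]
labels = [0, 0, 1, 1, 1, 0]
params = [0.0] * (2 * NUM_CLASSES)  # two weights followed by two biases

losses = []
for _ in range(200):
    params, loss = train_step(params, frames, labels)
    losses.append(loss)
```

    Both modes share one parameter list, so a single update adjusts the model for streaming and contextual recognition together. A practical system would replace the toy classifier with a neural speech model and the finite-difference step with automatic differentiation.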
