Multilingual Re-Scoring Models for Automatic Speech Recognition

    公开(公告)号:US20220310081A1

    公开(公告)日:2022-09-29

    申请号:US17701635

    申请日:2022-03-22

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of acoustic frames extracted from audio data corresponding to an utterance. During a first pass, the method includes processing the sequence of acoustic frames to generate N candidate hypotheses for the utterance. During a second pass, and for each candidate hypothesis, the method includes generating a respective un-normalized likelihood score; generating a respective external language model score; generating a standalone score that models prior statistics of the corresponding candidate hypothesis, and generating a respective overall score for the candidate hypothesis based on the un-normalized likelihood score, the external language model score, and the standalone score. The method also includes selecting the candidate hypothesis having the highest respective overall score from among the N candidate hypotheses as a final transcription of the utterance.

    Multilingual Re-Scoring Models for Automatic Speech Recognition

    公开(公告)号:US20240420692A1

    公开(公告)日:2024-12-19

    申请号:US18818010

    申请日:2024-08-28

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of acoustic frames extracted from audio data corresponding to an utterance. During a first pass, the method includes processing the sequence of acoustic frames to generate N candidate hypotheses for the utterance. During a second pass, and for each candidate hypothesis, the method includes: generating a respective un-normalized likelihood score; generating a respective external language model score; generating a standalone score that models prior statistics of the corresponding candidate hypothesis; and generating a respective overall score for the candidate hypothesis based on the un-normalized likelihood score, the external language model score, and the standalone score. The method also includes selecting the candidate hypothesis having the highest respective overall score from among the N candidate hypotheses as a final transcription of the utterance.

    Multilingual re-scoring models for automatic speech recognition

    公开(公告)号:US12254875B2

    公开(公告)日:2025-03-18

    申请号:US18589220

    申请日:2024-02-27

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of acoustic frames extracted from audio data corresponding to an utterance. During a first pass, the method includes processing the sequence of acoustic frames to generate N candidate hypotheses for the utterance. During a second pass, and for each candidate hypothesis, the method includes: generating a respective un-normalized likelihood score; generating a respective external language model score; generating a standalone score that models prior statistics of the corresponding candidate hypothesis; and generating a respective overall score for the candidate hypothesis based on the un-normalized likelihood score, the external language model score, and the standalone score. The method also includes selecting the candidate hypothesis having the highest respective overall score from among the N candidate hypotheses as a final transcription of the utterance.

Patent Agency Ranking