USING TEXT-INJECTION TO RECOGNIZE SPEECH WITHOUT TRANSCRIPTION

    Publication number: US20240304178A1

    Publication date: 2024-09-12

    Application number: US18439630

    Filing date: 2024-02-12

    Applicant: Google LLC

    CPC classification number: G10L15/063 G10L15/22 G10L15/26

    Abstract: A method includes receiving training data including transcribed speech utterances spoken in a general domain, modified speech utterances in a target domain, and unspoken textual utterances corresponding to the transcriptions of the modified speech utterances in the target domain. The modified speech utterances include utterances spoken in the target domain that have been modified to obfuscate one or more classes of sensitive information recited in the utterances. The method also includes generating, using an alignment model, a corresponding alignment output for each unspoken textual utterance of the received training data. The method also includes training a speech recognition model on the alignment outputs generated for the unspoken textual utterances, the un-transcribed speech utterances, and the transcribed speech utterances to teach the speech recognition model to recognize speech in the target domain and phrases within the one or more classes of sensitive information.
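
    Illustrative sketch (not part of the patent record): the abstract names three training sources and an alignment step but gives no implementation details. The Python below is a minimal toy, assuming that the alignment model can be stood in for by repeating each text token a fixed number of times to reach a frame-like length, and that the modified target-domain speech is used without transcripts. The names TrainingExample, align_text, build_training_set, and frames_per_token are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class TrainingExample:
    source: str                   # "transcribed", "modified", or "text_only"
    inputs: List[int]             # toy frame-level inputs (integers stand in for features)
    targets: Optional[List[int]]  # token ids to predict, or None when no transcript is used


def align_text(token_ids: Sequence[int], frames_per_token: int = 3) -> List[int]:
    """Toy stand-in for an alignment model (assumption, not the patented method):
    repeat every token id a fixed number of times so that text reaches a
    frame-like length."""
    if frames_per_token < 1:
        raise ValueError("frames_per_token must be >= 1")
    return [tok for tok in token_ids for _ in range(frames_per_token)]


def build_training_set(
    transcribed: Sequence[Tuple[Sequence[int], Sequence[int]]],
    modified_speech: Sequence[Sequence[int]],
    unspoken_text: Sequence[Sequence[int]],
    frames_per_token: int = 3,
) -> List[TrainingExample]:
    """Combine the three sources named in the abstract into one example list."""
    examples: List[TrainingExample] = []
    # General-domain speech paired with its transcript.
    for frames, tokens in transcribed:
        examples.append(TrainingExample("transcribed", list(frames), list(tokens)))
    # Target-domain speech with sensitive spans already obfuscated upstream;
    # treated here as unlabeled audio (assumption).
    for frames in modified_speech:
        examples.append(TrainingExample("modified", list(frames), None))
    # Text-only utterances: the alignment output replaces the missing audio.
    for tokens in unspoken_text:
        aligned = align_text(tokens, frames_per_token)
        examples.append(TrainingExample("text_only", aligned, list(tokens)))
    return examples


batch = build_training_set(
    transcribed=[([0, 1, 2, 3], [10, 11])],
    modified_speech=[[4, 5, 6]],
    unspoken_text=[[12, 13]],
    frames_per_token=2,
)
print([ex.source for ex in batch])
print(batch[-1].inputs)
```

    A real system would presumably replace the fixed repetition with a learned alignment or duration model and feed all three example types into one shared recognizer; the sketch only shows how the data sources might be brought into a common shape.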
