Joint Speech and Text Streaming Model for ASR

    Publication number: US20240028829A1

    Publication date: 2024-01-25

    Application number: US18346232

    Application date: 2023-07-01

    Applicant: Google LLC

    CPC classification number: G06F40/284 G06F40/40

    Abstract: A method includes receiving training data that includes a set of unspoken textual utterances. For each respective unspoken textual utterance, the method includes tokenizing the respective unspoken textual utterance into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit tokenized from the respective unspoken textual utterance, receiving the first higher order textual feature representation generated by a text encoder, and generating a first probability distribution over possible text units. The method also includes training an encoder based on the first probability distribution over possible text units generated by a first-pass decoder for each respective unspoken textual utterance in the set of unspoken textual utterances.
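
The steps named in the abstract can be sketched in plain Python. This is a toy illustration under stated assumptions, not the patented implementation: the vocabulary, the greedy longest-match tokenizer, the embedding-table stand-in for the text encoder, the linear-plus-softmax stand-in for the first-pass decoder, and the cross-entropy loss against the input sub-word ids are all hypothetical choices made for brevity.

```python
import math
import random

# Hypothetical toy vocabulary of sub-word units; id 0 is reserved for unknowns.
VOCAB = ["<unk>", "he", "llo", "wor", "ld", "_"]
DIM = 4  # assumed feature dimension


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split of text into sub-word unit ids."""
    text = text.replace(" ", "_")
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                ids.append(vocab.index(text[i:j]))
                i = j
                break
        else:
            ids.append(0)  # no sub-word unit matched at position i
            i += 1
    return ids


def make_matrix(rows, cols, rng):
    """Small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def text_encoder(ids, embedding):
    """Stand-in text encoder: one feature vector per sub-word unit."""
    return [embedding[i] for i in ids]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_pass_decoder(features, projection):
    """Stand-in decoder: a probability distribution over VOCAB per feature."""
    dists = []
    for f in features:
        logits = [sum(w * x for w, x in zip(row, f)) for row in projection]
        dists.append(softmax(logits))
    return dists


def training_loss(dists, target_ids):
    """Assumed objective: mean negative log-likelihood of the target units."""
    return -sum(math.log(d[t]) for d, t in zip(dists, target_ids)) / len(target_ids)


rng = random.Random(0)
embedding = make_matrix(len(VOCAB), DIM, rng)
projection = make_matrix(len(VOCAB), DIM, rng)

ids = tokenize("hello world")             # sequence of sub-word unit ids
features = text_encoder(ids, embedding)   # textual feature representations
dists = first_pass_decoder(features, projection)
loss = training_loss(dists, ids)          # signal that would train the encoder
```

In a real system the loss would drive gradient updates to the encoder over every unspoken textual utterance in the training set; that update step is omitted here.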
