BACKPLANE FOR AN ARRAY OF EMISSIVE ELEMENTS

    Publication No.: US20240221627A1

    Publication Date: 2024-07-04

    Application No.: US18544051

    Filing Date: 2023-12-18

    Applicant: GOOGLE LLC

    CPC classification number: G09G3/32 G11C11/412 G09G2300/0842 G09G2310/0297

    Abstract: A backplane operative to drive an array of emissive pixel elements is disclosed. A plurality of pixel drive circuits form part of an array of emissive elements. The plurality of pixel drive circuits are disposed to form a plurality of rows and a plurality of columns. The plurality of pixel drive circuits are organized into sets of pixel drive circuits, and each set comprises at least one pixel drive circuit.
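
    As a rough illustration of the organization the abstract describes (pixel drive circuits laid out in rows and columns and grouped into sets of at least one circuit), the Python sketch below builds such a grouping; the class and function names are hypothetical and are not taken from the patent.

    from dataclasses import dataclass

    @dataclass
    class PixelDriveCircuit:
        # Position of one pixel drive circuit within the array.
        row: int
        col: int

    def group_into_sets(num_rows, num_cols, set_size):
        """Lay circuits out in rows/columns and group them into sets (>= 1 circuit each)."""
        circuits = [PixelDriveCircuit(r, c)
                    for r in range(num_rows) for c in range(num_cols)]
        return [circuits[i:i + set_size] for i in range(0, len(circuits), set_size)]

    sets = group_into_sets(num_rows=4, num_cols=4, set_size=4)
    print(len(sets), len(sets[0]))  # 4 sets, 4 circuits per set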

    TWO-PASS END TO END SPEECH RECOGNITION

    Publication No.: US20220238101A1

    Publication Date: 2022-07-28

    Application No.: US17616135

    Filing Date: 2020-12-03

    Applicant: GOOGLE LLC

    Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a Listen, Attend and Spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
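
    A minimal sketch of the two-pass dataflow the abstract describes: a shared encoder feeds a streaming RNN-T first pass, and an LAS second pass revises the streaming candidates into the final text. The functions below are toy stand-ins rather than the patented model; all names are hypothetical.

    def shared_encoder(frame):
        # Stand-in for the shared acoustic encoder (one encoding per frame).
        return [("enc", frame)]

    def rnn_t_first_pass(encodings, partial_hyp):
        # Stand-in for the streaming RNN-T decoder: extends the partial
        # hypothesis as each new encoding arrives.
        return partial_hyp + ["<tok>"]

    def las_second_pass(encodings, candidates):
        # Stand-in for the LAS decoder: attends over the full encoding
        # sequence and rescores/revises the first-pass candidates.
        return " ".join(max(candidates, key=len))

    encodings, hyp, candidates = [], [], []
    for frame in ["f0", "f1", "f2"]:                    # streaming audio frames
        encodings.extend(shared_encoder(frame))
        hyp = rnn_t_first_pass(encodings, hyp)          # streaming candidate recognition
        candidates.append(list(hyp))
    final_text = las_second_pass(encodings, candidates)  # non-streaming revision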

    Fast Emit Low-latency Streaming ASR with Sequence-level Emission Regularization

    Publication No.: US20220122586A1

    Publication Date: 2022-04-21

    Application No.: US17447285

    Filing Date: 2021-09-09

    Applicant: Google LLC

    Abstract: A computer-implemented method of training a streaming speech recognition model includes receiving, as input to the streaming speech recognition model, a sequence of acoustic frames. The streaming speech recognition model is configured to learn an alignment probability between the sequence of acoustic frames and an output sequence of vocabulary tokens. The vocabulary tokens include a plurality of label tokens and a blank token. At each output step, the method includes determining a first probability of emitting one of the label tokens and determining a second probability of emitting the blank token. The method also includes generating the alignment probability at a sequence level based on the first probability and the second probability. The method also includes applying a tuning parameter to the alignment probability at the sequence level to maximize the first probability of emitting one of the label tokens.
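
    A toy numerical sketch of the sequence-level objective the abstract outlines, not the exact training procedure: per-step label and blank emission probabilities combine into a sequence-level alignment probability, and a tuning parameter up-weights the label-emission term so that emitting label tokens is favored. All values below are made up.

    import math

    # Each output step emits either a label token or the blank token with
    # some probability (toy values).
    steps = [("label", 0.6), ("blank", 0.7), ("label", 0.5), ("blank", 0.8)]
    lam = 0.01  # tuning parameter

    log_align = sum(math.log(p) for _, p in steps)        # sequence-level alignment log-probability
    log_label = sum(math.log(p) for kind, p in steps if kind == "label")
    objective = log_align + lam * log_label               # maximized during training
    print(objective)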

    Multi-dialect and multilingual speech recognition

    Publication No.: US11238845B2

    Publication Date: 2022-02-01

    Application No.: US16684483

    Filing Date: 2019-11-14

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
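
    A minimal sketch of the described flow, under stated assumptions rather than as the actual system: features derived from the audio are scored by a single model over a set of linguistic units, and the best-scoring units form the transcription. The toy "model", the unit inventory, and all shapes are hypothetical stand-ins.

    import numpy as np

    rng = np.random.default_rng(1)
    units = ["a", "b", "k", "sh", "<blank>"]          # toy linguistic-unit inventory

    def speech_recognition_model(features):
        # Stand-in for the trained multi-dialect model: one score per unit
        # for every input frame.
        return rng.normal(size=(len(features), len(units)))

    audio_features = rng.normal(size=(4, 40))         # 4 frames of 40-dim input features
    scores = speech_recognition_model(audio_features)
    transcript = "".join(units[i] for i in scores.argmax(axis=1) if units[i] != "<blank>")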

    MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION

    Publication No.: US20200160836A1

    Publication Date: 2020-05-21

    Application No.: US16684483

    Filing Date: 2019-11-14

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
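
    Complementing the sketch above, the snippet below illustrates the cluster adaptive training idea the abstract mentions, again only as a hedged assumption of how such interpolation can look: a layer's weights are formed as a per-dialect mixture of cluster basis matrices. Shapes and values are illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)
    cluster_bases = rng.normal(size=(3, 8, 8))     # 3 clusters of 8x8 basis weights
    dialect_weights = np.array([0.7, 0.2, 0.1])    # interpolation weights for one dialect

    layer_weight = np.tensordot(dialect_weights, cluster_bases, axes=1)  # 8x8 adapted weights
    features = rng.normal(size=8)                  # acoustic input features
    scores = layer_weight @ features               # scores over linguistic units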
