ENHANCED ATTENTION MECHANISMS
    22.
    Patent application

    Publication No.: US20200026760A1

    Publication date: 2020-01-23

    Application No.: US16518518

    Filing date: 2019-07-22

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for enhanced attention mechanisms. In some implementations, data indicating an input sequence is received. The data is processed using an encoder neural network to generate a sequence of encodings. A series of attention outputs is determined using one or more attender modules. Determining each attention output can include (i) selecting an encoding from the sequence of encodings and (ii) determining attention over a proper subset of the sequence of encodings, where the proper subset of encodings is determined based on a position of the selected encoding in the sequence of encodings. The selections of encodings are also monotonic through the sequence of encodings. An output sequence is generated by processing the attention outputs using a decoder neural network. An output is provided that indicates a language sequence determined from the output sequence.

    Recurrent neural networks for online sequence generation

    Publication No.: US10281885B1

    Publication date: 2019-05-07

    Application No.: US15600699

    Filing date: 2017-05-19

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive an input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.
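The emit-or-wait loop described in this abstract can be illustrated with a minimal sketch. The names `decode_online`, `step_fn`, and `threshold` are hypothetical, and two details are assumptions not stated in the abstract: the progress score is compared against a fixed threshold, and the emitted output is the arg-max of the output scores. `step_fn` stands in for the recurrent neural network.

```python
def decode_online(inputs, step_fn, threshold=0.5):
    """Consume inputs one time step at a time and emit outputs online.

    step_fn(x, prev_output, state) -> (progress_score, output_scores, new_state)
    is a hypothetical stand-in for the recurrent neural network.
    """
    outputs = []
    state = None
    prev_output = None
    for x in inputs:
        # The subsystem builds the network input from the current source
        # item and the most recently emitted output.
        progress, scores, state = step_fn(x, prev_output, state)
        # Assumption: emit only when the progress score exceeds a threshold.
        if progress > threshold:
            # Assumption: pick the highest-scoring output (arg-max).
            best = max(range(len(scores)), key=scores.__getitem__)
            outputs.append(best)
            prev_output = best
    return outputs


def toy_step(x, prev_output, state):
    # Toy network: the input doubles as the progress score.
    return x, [0.7, x], state
```

With the toy step function, `decode_online([0.1, 0.9, 0.6, 0.3], toy_step)` emits at the second and third time steps only, since those are the steps whose progress score exceeds 0.5.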

    Enhanced attention mechanisms
    24.
    Granted patent

    Publication No.: US12175202B2

    Publication date: 2024-12-24

    Application No.: US17456958

    Filing date: 2021-11-30

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of audio features characterizing an utterance and processing, using an encoder neural network, the sequence of audio features to generate a sequence of encodings. At each of a plurality of output steps, the method also includes determining a corresponding hard monotonic attention output to select an encoding from the sequence of encodings, identifying a proper subset of the sequence of encodings based on a position of the selected encoding in the sequence of encodings, and performing soft attention over the proper subset of the sequence of encodings to generate a context vector at the corresponding output step. The method also includes processing, using a decoder neural network, the context vector generated at the corresponding output step to predict a probability distribution over possible output labels at the corresponding output step.
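A single output step of the method in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function name `attend_step` and its parameters are hypothetical, hard selection is modeled as "first position at or after the previous selection whose score is positive", and the proper subset is assumed to be a fixed-size window ending at the selected position. The abstract says only that the subset depends on the position of the selected encoding.

```python
import math


def attend_step(encodings, select_scores, energies, prev_index, chunk_size):
    """Return (context_vector, selected_index) for one output step.

    encodings:     list of equal-length float vectors from the encoder
    select_scores: per-position scores for the hard monotonic selection
    energies:      per-position energies for the soft attention
    prev_index:    position selected at the previous output step
    chunk_size:    assumed window length (smaller than len(encodings),
                   so the window is a proper subset)
    """
    # Hard monotonic attention: scan forward from the previous selection,
    # so the selected position never moves backwards.
    selected = None
    for j in range(prev_index, len(encodings)):
        if select_scores[j] > 0.0:
            selected = j
            break
    if selected is None:
        return None, prev_index  # no encoding selected at this step

    # Proper subset: a window of up to chunk_size encodings ending at
    # the selected position (assumption).
    start = max(0, selected - chunk_size + 1)
    window = range(start, selected + 1)

    # Soft attention: numerically stable softmax over the window only.
    peak = max(energies[j] for j in window)
    weights = [math.exp(energies[j] - peak) for j in window]
    total = sum(weights)

    # Context vector: attention-weighted sum of the windowed encodings.
    dim = len(encodings[0])
    context = [
        sum(w / total * encodings[j][d] for w, j in zip(weights, window))
        for d in range(dim)
    ]
    return context, selected
```

The returned `selected_index` would be passed as `prev_index` at the next output step, and the context vector would be fed to the decoder to predict the label distribution.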

    Cascaded Encoders for Simplified Streaming and Non-Streaming ASR

    Publication No.: US20220122622A1

    Publication date: 2022-04-21

    Application No.: US17237021

    Filing date: 2021-04-21

    Applicant: Google LLC

    Abstract: An automated speech recognition (ASR) model includes a first encoder, a second encoder, and a decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The second encoder receives, as input, the first higher order feature representation generated by the first encoder at each of the plurality of output steps, and generates, at each of the plurality of output steps, a second higher order feature representation for a corresponding first higher order feature frame. The decoder receives, as input, the second higher order feature representation generated by the second encoder at each of the plurality of output steps, and generates, at each of the plurality of time steps, a first probability distribution over possible speech recognition hypotheses.

    ENHANCED ATTENTION MECHANISMS
    30.
    Patent application

    Publication No.: US20220083743A1

    Publication date: 2022-03-17

    Application No.: US17456958

    Filing date: 2021-11-30

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of audio features characterizing an utterance and processing, using an encoder neural network, the sequence of audio features to generate a sequence of encodings. At each of a plurality of output steps, the method also includes determining a corresponding hard monotonic attention output to select an encoding from the sequence of encodings, identifying a proper subset of the sequence of encodings based on a position of the selected encoding in the sequence of encodings, and performing soft attention over the proper subset of the sequence of encodings to generate a context vector at the corresponding output step. The method also includes processing, using a decoder neural network, the context vector generated at the corresponding output step to predict a probability distribution over possible output labels at the corresponding output step.
