NEURAL NETWORKS WITH SWITCH LAYERS
    Invention Application

    Publication Number: US20250053815A1

    Publication Date: 2025-02-13

    Application Number: US18806647

    Filing Date: 2024-08-15

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more switch layers.
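    The abstract names "switch layers"; in the published literature this denotes top-1 expert routing, where each input is dispatched to a single expert network chosen by a learned router. A minimal sketch under that assumption (all function names, weight shapes, and the gate-scaling detail are illustrative, not taken from the patent):

```python
import numpy as np

def switch_layer(x, router_w, expert_ws):
    """Top-1 routing sketch: each input row goes to exactly one expert,
    and that expert's output is scaled by the router probability.
    Shapes: x (batch, d), router_w (d, num_experts), expert_ws (num_experts, d, d)."""
    logits = x @ router_w                               # (batch, num_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)          # softmax over experts
    chosen = probs.argmax(axis=-1)                      # top-1 expert per row
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        out[i] = probs[i, e] * (x[i] @ expert_ws[e])    # gate-scaled expert output
    return out
```

    Because only one expert runs per input, compute cost stays roughly constant as the number of experts (and hence parameters) grows.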

    USING LARGE LANGUAGE MODEL(S) IN GENERATING AUTOMATED ASSISTANT RESPONSE(S)

    Publication Number: US20250037711A1

    Publication Date: 2025-01-30

    Application Number: US18912175

    Filing Date: 2024-10-10

    Applicant: GOOGLE LLC

    Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
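    The pipeline in the abstract (candidate outputs → LLM-based modification with dialog context → selection of one modified output) can be rendered as a toy function; `modify_fn` and `rank_fn` are assumed interfaces standing in for the LLM-output processing and the selection step, not anything specified by the patent:

```python
def respond(assistant_outputs, context, modify_fn, rank_fn):
    """Toy sketch of the described pipeline: modify each candidate
    assistant output using LLM output(s) plus dialog context, then
    choose one modified output for presentation."""
    modified = [modify_fn(out, context) for out in assistant_outputs]
    return max(modified, key=rank_fn)  # pick the highest-ranked modified output
```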

    Using large language model(s) in generating automated assistant response(s)

    Publication Number: US12148421B2

    Publication Date: 2024-11-19

    Application Number: US17532794

    Filing Date: 2021-11-22

    Applicant: GOOGLE LLC

    Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.

    Mixture of experts neural networks
    Invention Grant

    Publication Number: US12067476B2

    Publication Date: 2024-08-20

    Application Number: US18244171

    Filing Date: 2023-09-08

    Applicant: Google LLC

    CPC classification number: G06N3/045 G06N3/08

    Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determine a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
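    The gating logic the abstract describes — select one or more experts from the first layer's output, weight each selected expert, and combine their outputs — can be sketched as follows. Variable names, the softmax weighting, and the top-k selection rule are assumptions for illustration:

```python
import numpy as np

def moe_output(layer1_out, gate_w, expert_fns, k=2):
    """Sketch of the MoE subnetwork's gating: score all experts from the
    first layer output, select the top k, normalize their weights, and
    return the weighted sum of the selected experts' outputs."""
    scores = layer1_out @ gate_w                  # one gate score per expert
    top = np.argsort(scores)[-k:]                 # indices of the k selected experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                      # normalized weights for selected experts
    return sum(w * expert_fns[e](layer1_out) for w, e in zip(weights, top))
```

    The result then feeds the second neural network layer, exactly where a dense feed-forward block would otherwise sit.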

    ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

    Publication Number: US20240256859A1

    Publication Date: 2024-08-01

    Application Number: US18403966

    Filing Date: 2024-01-04

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
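    The per-time-step loop in the abstract (build the combined sequence, score it with the decoder, pick the next token) reduces to a few lines when the decoder network is abstracted behind a scoring function. `score_fn` is a stand-in for the self-attention decoder, and greedy selection is one possible choice for the "selecting, using the time step output" step:

```python
def generate(input_tokens, score_fn, vocab, max_steps):
    """Greedy sketch of the generation loop: at each step, score the
    combined sequence (input followed by outputs generated so far) and
    append the highest-scoring token from the vocabulary."""
    output = []
    for _ in range(max_steps):
        combined = input_tokens + output          # input sequence + generated tokens
        scores = score_fn(combined)               # score distribution over vocab
        output.append(max(vocab, key=lambda t: scores[t]))
    return output
```

    Sampling from the score distribution instead of taking the argmax is an equally valid reading of the selection step.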

    ATTENTION-BASED SEQUENCE TRANSDUCTION NEURAL NETWORKS

    Publication Number: US20240144006A1

    Publication Date: 2024-05-02

    Application Number: US18407299

    Filing Date: 2024-01-08

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06N3/04 G06N3/045 G06N20/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. In one aspect, one of the systems includes an encoder neural network configured to receive the input sequence and generate encoded representations of the network inputs, the encoder neural network comprising a sequence of one or more encoder subnetworks, each encoder subnetwork configured to receive a respective encoder subnetwork input for each of the input positions and to generate a respective subnetwork output for each of the input positions, and each encoder subnetwork comprising: an encoder self-attention sub-layer that is configured to receive the subnetwork input for each of the input positions and, for each particular input position in the input order: apply an attention mechanism over the encoder subnetwork inputs using one or more queries derived from the encoder subnetwork input at the particular input position.
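    The encoder self-attention sub-layer the abstract describes — a query derived from each input position attending over all encoder subnetwork inputs — is scaled dot-product attention. A minimal single-head sketch (the projection matrices and scaling are conventional assumptions, not quoted from the claims):

```python
import numpy as np

def self_attention(inputs, wq, wk, wv):
    """Single-head self-attention sketch: for each input position, a query
    derived from that position attends over keys/values derived from all
    positions. inputs: (seq_len, d); wq/wk/wv: (d, d)."""
    q, k, v = inputs @ wq, inputs @ wk, inputs @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled dot-product scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ v                               # attention-weighted values
```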

    Attention-based decoder-only sequence transduction neural networks

    Publication Number: US11886998B2

    Publication Date: 2024-01-30

    Application Number: US18096946

    Filing Date: 2023-01-13

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.

    EVALUATING OUTPUT SEQUENCES USING AN AUTO-REGRESSIVE LANGUAGE MODEL NEURAL NETWORK

    Publication Number: US20230029590A1

    Publication Date: 2023-02-02

    Application Number: US17876451

    Filing Date: 2022-07-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating candidate output sequences using language model neural networks. In particular, an auto-regressive language model neural network is used to generate a candidate output sequence. The same auto-regressive language model neural network is used to evaluate the candidate output sequence to determine rating scores for each of one or more criteria. The rating score(s) are then used to determine whether to provide the candidate output sequence.
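    The control flow of the abstract — reuse the same model to rate its own candidate output against one or more criteria, then gate on those ratings — can be sketched as below. `rating_fn` is an assumed interface to the auto-regressive model's rating pass, and the all-criteria threshold gate is one illustrative reading of "used to determine whether to provide" the candidate:

```python
def evaluate_candidate(candidate, criteria, rating_fn, threshold=0.5):
    """Sketch of the self-evaluation gate: score the candidate output
    against each criterion with the same model that generated it, and
    provide it only if every rating clears the threshold."""
    ratings = {c: rating_fn(candidate, c) for c in criteria}
    provide = all(r >= threshold for r in ratings.values())
    return provide, ratings
```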
