-
Publication Number: US12148421B2
Publication Date: 2024-11-19
Application Number: US17532794
Application Date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Vikram Sridar, Daniel De Freitas Adiwardana, Noam M. Shazeer, Quoc Le
IPC: G10L15/22, G06F16/9032, G10L15/183
Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query; determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query; process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs; and cause a given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
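The flow the abstract describes can be sketched minimally as follows. This is an illustrative Python sketch, not the patented implementation: `modify_outputs`, `select_output`, and the stub rewrite function are all hypothetical names, and the stub stands in for real LLM output.

```python
# Hypothetical sketch: candidate assistant outputs are rewritten using
# LLM output conditioned on dialog context, then one modified output is
# selected for presentation. All names here are illustrative.

def modify_outputs(candidates, context, llm_rewrite):
    """Apply an LLM-driven rewrite to each candidate, given dialog context."""
    return [llm_rewrite(c, context) for c in candidates]

def select_output(modified, score):
    """Pick the highest-scoring modified output to present to the user."""
    return max(modified, key=score)

# Stub LLM rewrite: prepend a contextual acknowledgement (stands in for
# real LLM output generated offline or online per the abstract).
def stub_rewrite(candidate, context):
    return f"{context['ack']} {candidate}"

candidates = ["It is 72 degrees.", "Currently 72F and sunny."]
context = {"ack": "Sure,"}
modified = modify_outputs(candidates, context, stub_rewrite)
best = select_output(modified, score=len)  # toy criterion: longest response
```

A real system would replace both stubs with LLM calls; the point is only the shape of the pipeline: generate candidates, modify with LLM output plus context, then select one.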
-
Publication Number: US20240362093A1
Publication Date: 2024-10-31
Application Number: US18231606
Application Date: 2023-08-08
Applicant: GOOGLE LLC
Inventor: Hao Zhou, Jamie Hall, Xinying Song, Sahitya Potluri, Yu Du, Heng-Tze Cheng, Quoc Le, Ed H. Chi
IPC: G06F9/54, G06F16/242
CPC classification number: G06F9/547, G06F16/243
Abstract: Implementations utilize a custom corpus of documents to condition a large language model (LLM) when generating a response to a user query. In some implementations, a user query associated with a client device is received. An API query for an external application is generated by an LLM based on the user query. The external application has access to a custom corpus of documents comprising a plurality of documents. The external application is queried using the API query. Data representative of one or more documents in the custom corpus of documents is received from the external application in response to the API query. The LLM generates a response to the user query that is conditioned on the data representing one or more of the documents in the custom corpus of documents received from the external application. The response to the user query is caused to be rendered on the client device.
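The described flow can be sketched end to end. This is a hedged Python sketch under illustrative assumptions: the corpus, the keyword-matching "LLM" steps, and all function names are stand-ins for the real LLM and external application.

```python
# Illustrative sketch of the abstract's pipeline (all names hypothetical):
# the LLM turns a user query into an API query, the external application
# returns matching documents from its custom corpus, and the LLM
# generates a response conditioned on those documents.

CUSTOM_CORPUS = {
    "returns": "Items may be returned within 30 days.",
    "shipping": "Orders ship within 2 business days.",
}

def llm_generate_api_query(user_query):
    """Stub LLM step: derive an API query (here, a keyword) from the user query."""
    for key in CUSTOM_CORPUS:
        if key in user_query.lower():
            return key
    return None

def query_external_application(api_query):
    """Stub external application with access to the custom corpus."""
    doc = CUSTOM_CORPUS.get(api_query)
    return [doc] if doc else []

def llm_generate_response(user_query, documents):
    """Stub LLM response conditioned on the retrieved documents."""
    if documents:
        return f"Based on our docs: {documents[0]}"
    return "I could not find anything relevant."

def answer(user_query):
    api_query = llm_generate_api_query(user_query)
    documents = query_external_application(api_query)
    return llm_generate_response(user_query, documents)
```

The design point is the indirection: the LLM never reads the corpus directly; it only sees documents the external application returns for the API query it generated.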
-
Publication Number: US20240311577A1
Publication Date: 2024-09-19
Application Number: US18364355
Application Date: 2023-08-02
Applicant: GOOGLE LLC
Inventor: Anoop K. Sinha, Quoc Le, Jason S. Spielman
IPC: G06F40/35
CPC classification number: G06F40/35
Abstract: Techniques are described herein for personalized multi-response dialog generated using one or more large language models. A method includes: receiving first natural language (NL) based input associated with a client device; generating, based on the first NL based input and using at least one large language model (LLM), one or more instances of first LLM output; determining, based on the one or more instances of first LLM output, at least three responses to the first NL based input; determining, based on at least one scoring criterion, respective scores of the at least three responses to the first NL based input; selecting, based on the respective scores of the at least three responses to the first NL based input, from the at least three responses to the first NL based input, a first subset, the first subset comprising at least two responses to the first NL based input; and causing each of the at least two responses in the first subset to be rendered at the client device.
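The selection step in this abstract reduces to score-and-take-top-k. A minimal sketch, assuming a toy length-based scoring criterion in place of the real one (the patent leaves the criterion open; `select_subset` is a hypothetical name):

```python
# Hedged sketch: generate at least three candidate responses, score each
# with a scoring criterion, and select a subset of at least two to render.
# The length heuristic is a toy stand-in for a real scoring criterion.

def select_subset(responses, score, k=2):
    """Return the top-k responses by score (k >= 2 per the described method)."""
    ranked = sorted(responses, key=score, reverse=True)
    return ranked[:k]

# At least three candidate responses (as determined from LLM output).
responses = [
    "Try the pasta place downtown.",
    "Sushi?",
    "The new Thai restaurant on 5th has great reviews.",
]
subset = select_subset(responses, score=len, k=2)  # at least two survive
```

Rendering both members of the subset at the client device is what makes the dialog "multi-response" rather than single-answer.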
-
Publication Number: US11928574B2
Publication Date: 2024-03-12
Application Number: US18154321
Application Date: 2023-01-13
Applicant: Google LLC
Inventor: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures can run faster while using fewer computing resources (e.g., less processing power, less memory usage, less power consumption), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
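The core idea of a factorized hierarchical search space can be illustrated with a toy random search. This sketch is not the patented search space or controller: the per-block choices, the random-search stand-in, and the latency-proxy reward are all illustrative assumptions.

```python
# Toy sketch of a factorized hierarchical search space: the network is
# partitioned into blocks, and each block samples its own layer settings
# independently, permitting layer diversity across the network.

import random

# Illustrative per-block choices (not the patented search space).
BLOCK_CHOICES = {
    "conv_op": ["conv3x3", "conv5x5", "dwconv3x3"],
    "num_layers": [1, 2, 3],
    "filters": [16, 32, 64],
}

def sample_block(rng):
    """Sample one block's configuration independently (the factorization)."""
    return {k: rng.choice(v) for k, v in BLOCK_CHOICES.items()}

def sample_architecture(num_blocks, rng):
    """A candidate architecture is a sequence of per-block choices."""
    return [sample_block(rng) for _ in range(num_blocks)]

def search(num_candidates, num_blocks, reward, seed=0):
    """Random-search stand-in for a learned controller: keep the best candidate."""
    rng = random.Random(seed)
    candidates = [sample_architecture(num_blocks, rng)
                  for _ in range(num_candidates)]
    return max(candidates, key=reward)

def toy_reward(arch):
    """Toy reward: prefer fewer total layers, a crude proxy for latency."""
    return -sum(block["num_layers"] for block in arch)

best = search(num_candidates=8, num_blocks=5, reward=toy_reward)
```

Because each block is sampled independently, the space grows multiplicatively with block count while still letting early and late blocks choose different operations, which is the layer diversity the abstract highlights.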