-
Publication number: US20220343152A1
Publication date: 2022-10-27
Application number: US17239320
Filing date: 2021-04-23
Applicant: Google LLC
Inventor: Bo Dai , Mengjiao Yang , Hanjun Dai , Dale Eric Schuurmans
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generative modelling of exchangeable sets. Methods can include obtaining a dataset of training observations. Each training observation is an exchangeable set that includes a plurality of data points. Each training observation is processed using a first neural network to generate parameters of a first probability distribution based on which a latent variable is sampled. The latent variable is processed using a second neural network to generate a new observation that includes a plurality of data points. The training observation and the new observation are processed using an energy neural network to generate an estimate of an energy of the training observation and the new observation. The energy neural network is then trained to optimize an objective function that measures the difference between the estimate of the energy of the training observation and the new observation.
-
Publication number: US11693637B1
Publication date: 2023-07-04
Application number: US17319739
Filing date: 2021-05-13
Applicant: Google LLC
Inventor: Rishabh Singh , Hanjun Dai , Manzil Zaheer , Artem Goncharuk , Karen Davis , David Andre
CPC classification number: G06F8/436 , G06F40/279 , G06F40/40 , G06N3/08 , G06N7/01
Abstract: Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
-
Publication number: US11960867B1
Publication date: 2024-04-16
Application number: US18198674
Filing date: 2023-05-17
Applicant: GOOGLE LLC
Inventor: Rishabh Singh , Hanjun Dai , Manzil Zaheer , Artem Goncharuk , Karen Davis , David Andre
CPC classification number: G06F8/436 , G06F40/279 , G06F40/40 , G06N3/08 , G06N7/01
Abstract: Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
-
Publication number: US20230022151A1
Publication date: 2023-01-26
Application number: US17860691
Filing date: 2022-07-08
Applicant: Google LLC
Inventor: Hanjun Dai , Bo Dai , Hongyu Ren , Dale Eric Schuurmans , Zihang Dai , Mengjiao Yang
Abstract: The present disclosure is directed to machine learning model architectures which provide full attention capability in each attention head while maintaining low computation and memory complexity. Specifically, according to one aspect of the present disclosure, example attention models provided herein can treat the self-attention mechanism as a conditional expectation over embeddings at each location and approximate the conditional distribution with a structured factorization. Each location can attend to all other locations, either via direct attention, or through indirect attention to group representations, which are again conditional expectations of embeddings from corresponding local regions.
-
Publication number: US20240394545A1
Publication date: 2024-11-28
Application number: US18377368
Filing date: 2023-10-06
Applicant: Google LLC
Inventor: Julian Martin Eisenschlos , Xingchen Wan , Hootan Nakhost , Sercan Omer Arik , Ruoxi Sun , Hanjun Dai
IPC: G06N3/088 , G06N3/0455
Abstract: Aspects of the disclosure are directed to methods, systems, and computer readable media for universal self-adaptive prompting (USP), which includes an automatic prompt design approach specifically tailored for zero-shot learning, though still compatible with few-shot learning. To achieve universal prompting, USP categorizes a natural language processing (NLP) task into one of a plurality of possible task types and then uses a corresponding selector to select the most suitable queries and zero-shot model-generated responses as pseudo-demonstrations, thereby generalizing in-context learning to the zero-shot setup in a fully automated manner.
-
Publication number: US20240362212A1
Publication date: 2024-10-31
Application number: US18225277
Filing date: 2023-07-24
Applicant: Google LLC
Inventor: Ruoxi Sun , Sercan Omer Arik , Rajarishi Sinha , Hootan Nakhost , Hanjun Dai , Pengcheng Yin
IPC: G06F16/2452 , G06F16/242
CPC classification number: G06F16/24522 , G06F16/2433
Abstract: Aspects of the disclosure are directed to methods, systems, and non-transitory computer readable media for automatically generating queries on a database from natural language text using in-context learning to leverage zero-shot and few-shot adaptation capabilities of large language models (LLMs). The methods, systems, and non-transitory computer readable media can consider database information, employ execution-based consistency decoding, and employ a mixture of prompts and/or LLMs.
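Illustrative note: the abstract does not say how execution-based consistency decoding is carried out. One plausible reading, sketched below under that assumption, is to execute every candidate query, discard those that fail, and keep a candidate whose result is shared by the most candidates. The function name `pick_by_execution_consistency`, the `orders` table, and the candidate queries are hypothetical and are not taken from the patent.

```python
import sqlite3
from collections import Counter


def pick_by_execution_consistency(conn, candidate_queries):
    """Return a candidate whose execution result is shared by the most candidates.

    Hypothetical helper: candidates that fail to execute are dropped, and the
    remaining ones vote with their (order-insensitive) result sets.
    """
    executed = []  # (sql, result) pairs for candidates that ran successfully
    for sql in candidate_queries:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # invalid SQL, unknown table or column, and so on
        # Sort rows so that row order alone does not split the vote.
        executed.append((sql, tuple(sorted(rows, key=repr))))
    if not executed:
        return None
    votes = Counter(result for _, result in executed)
    winning_result = votes.most_common(1)[0][0]
    # First candidate, in input order, that produced the winning result.
    return next(sql for sql, result in executed if result == winning_result)


# Toy database and candidate queries (all assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.0), (3, 40.0)]
)
candidates = [
    "SELECT COUNT(*) FROM orders WHERE amount > 20",
    "SELECT COUNT(id) FROM orders WHERE amount > 20.0",
    "SELECT COUNT(*) FROM orders",
    "SELECT COUNT(*) FROM order_table",  # fails: no such table
]
best = pick_by_execution_consistency(conn, candidates)
```

Voting on execution results rather than on query text lets syntactically different but equivalent queries support each other, which is the intuition this sketch is meant to convey.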
-
Publication number: US20240289619A1
Publication date: 2024-08-29
Application number: US18424595
Filing date: 2024-01-26
Applicant: Google LLC
Inventor: Azade Nova , Hanjun Dai , Dale Eric Schuurmans
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the methods includes: obtaining data specifying an initial neural network configured to perform a machine learning task; determining a representativeness measure for each of a plurality of filters; determining a central tendency measure for the plurality of filters based on processing a batch of network inputs using the initial neural network; determining a cumulative importance score for each of the plurality of filters; selecting a proper subset of the plurality of filters; and generating a pruned neural network configured to perform the machine learning task.
-
Publication number: US20220414067A1
Publication date: 2022-12-29
Application number: US17351086
Filing date: 2021-06-17
Applicant: Google LLC
Inventor: Hanjun Dai , Azade Nazi , Yujia Li , Bo Dai , Dale Eric Schuurmans
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating data defining a graph. In one aspect, a method comprises: sequentially generating a respective edge set for each node in the graph, wherein for each of a plurality of nodes after a first node, generating the edge set for the node comprises: receiving a context embedding for the node that summarizes a respective edge set for each node that precedes the node; generating, based on the context embedding for the node: (i) a respective edge set for the node, and (ii) a respective embedding of the edge set for the node; generating a context embedding for a next node in the ordering of the nodes using the embedding of the edge set for the node; and adding the set of edges defined by the edge set for the node to the graph.
-
Publication number: US20250045577A1
Publication date: 2025-02-06
Application number: US18697304
Filing date: 2021-10-05
Applicant: Google LLC
Inventor: Bo Dai , Hanjun Dai , Yuan Xue , Zia Syed , Dale Eric Schuurmans
IPC: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing stochastic optimization using machine learning. One of the methods includes obtaining data defining a multi-stage stochastic optimization (MSSO) problem instance, the data characterizing an observation distribution, an action space, and a cost function; generating a neural network input characterizing the MSSO problem instance from the data; providing the neural network input as input to a neural network that generates, from the network input, a neural network output characterizing parameters of a value function corresponding to the MSSO problem instance; processing the neural network input using the neural network to generate the neural network output; obtaining a new observation determined according to the observation distribution for the MSSO problem instance; determining, using the value function characterized by the network output, an optimal action to take in response to the new observation; and executing the optimal action.
-
Publication number: US20240249080A1
Publication date: 2024-07-25
Application number: US18128450
Filing date: 2023-03-30
Applicant: Google LLC
Inventor: Ruoxi Sun , Xingchen Wan , Hanjun Dai , Sercan Omer Arik , Tomas Pfister
CPC classification number: G06F40/40 , G06F16/3344
Abstract: Aspects of the disclosure are directed to automatically selecting examples in a prompt for an LLM to demonstrate how to perform tasks. Aspects of the disclosure can select and build a set of examples from LLM zero-shot outputs via predetermined criteria that can combine consistency, diversity, and repetition. In the zero-shot setting for three different LLMs, using only LLM predictions, aspects of the disclosure can improve performance by up to 15% compared to zero-shot baselines and can match or exceed few-shot baselines for a range of reasoning tasks.
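Illustrative note: the abstract names the selection criteria but not how they are computed. The sketch below is one possible reading covering only two of them: consistency is taken to be the share of sampled zero-shot answers that agree with the majority answer, and diversity is enforced with a simple word-overlap check between questions. The repetition criterion is left out. All function names, thresholds, and sample data are assumptions made for illustration.

```python
from collections import Counter


def consistency(answers):
    """Return the majority answer and the fraction of samples that agree with it."""
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / len(answers)


def jaccard(a, b):
    """Word-overlap similarity between two questions (a stand-in for a real measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def select_demos(samples, k=2, max_similarity=0.5):
    """Greedily pick up to k (question, majority answer) pairs, most consistent
    first, skipping questions too similar to ones already chosen."""
    scored = []
    for question, answers in samples.items():
        answer, score = consistency(answers)
        scored.append((score, question, answer))
    scored.sort(key=lambda item: -item[0])  # stable sort: ties keep input order
    chosen = []
    for _, question, answer in scored:
        if all(jaccard(question, q) <= max_similarity for q, _ in chosen):
            chosen.append((question, answer))
        if len(chosen) == k:
            break
    return chosen


# Hypothetical zero-shot samples: several model answers per unlabeled question.
samples = {
    "What is 2 + 3?": ["5", "5", "5", "5"],
    "What is 2 + 4?": ["6", "6", "6", "7"],
    "Name the capital of France.": ["Paris", "Paris", "Lyon", "Paris"],
    "Is 17 a prime number?": ["yes", "no", "yes", "no"],
}
demos = select_demos(samples)
```

Here the second arithmetic question is skipped because it overlaps heavily with the first, so the selected pseudo-demonstrations cover two different kinds of question.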