-
Publication number: US20240273336A1
Publication date: 2024-08-15
Application number: US18430483
Filing date: 2024-02-01
Applicant: Google LLC
Inventor: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
-
Publication number: US20230274532A1
Publication date: 2023-08-31
Application number: US18313772
Filing date: 2023-05-08
Applicant: Google LLC
Inventor: Jon Shlens, Ekin Dogus Cubuk, Quoc Le, Tsung-Yi Lin, Barret Zoph, Golnaz Ghiasi
CPC classification number: G06V10/772, G06F18/217, G06F18/24, G06T3/20, G06T3/60, G06T11/001
Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
-
Publication number: US20220215682A1
Publication date: 2022-07-07
Application number: US17702438
Filing date: 2022-03-23
Applicant: Google LLC
Inventor: Jon Shlens, Ekin Dogus Cubuk, Quoc Le, Tsung-Yi Lin, Barret Zoph, Golnaz Ghiasi
IPC: G06V30/194, G06K9/62, G06T3/60, G06T3/20, G06T11/00
Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
-
Publication number: US20190354808A1
Publication date: 2019-11-21
Application number: US16416888
Filing date: 2019-05-20
Applicant: Google LLC
Inventor: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu
Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
-
Publication number: US12293276B2
Publication date: 2025-05-06
Application number: US18430483
Filing date: 2024-02-01
Applicant: Google LLC
Inventor: Mingxing Tan, Quoc Le, Bo Chen, Vijay Vasudevan, Ruoming Pang
Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
-
Publication number: US20240378394A1
Publication date: 2024-11-14
Application number: US18231650
Filing date: 2023-08-08
Applicant: GOOGLE LLC
Inventor: Ragha Kotikalapudi, Chen Zhu, Steven Zheng, Sahitya Potluri, Yu Du, Heng-Tze Cheng, Quoc Le, Ed H. Chi
Abstract: Implementations described herein relate to using self-evaluation when utilizing a large language model (LLM) to generate a response to a natural language (NL) based input. The LLM can be used to process the NL based input to generate a plurality of responses, and to generate a critique of those responses by comparing the responses to a set of response evaluation criteria. One of the responses can then be selected based on the comparison with the set of response evaluation criteria, which can vary from one NL based input to another. If the NL based input was obtained from a user of a client device during an inference stage, then the selected response can be rendered for presentation to the user. If the NL based input was obtained during a training stage, then the selected response can be stored as a training instance, optionally in association with additional data.
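The generate-critique-select loop in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `select_response`, `generate`, and `critique` are hypothetical, the critique is assumed to reduce to a numeric score, and the "LLM" calls are replaced by toy stand-ins so the example runs on its own.

```python
from typing import Callable, Sequence


def select_response(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str, Sequence], float],
    criteria: Sequence,
    num_candidates: int = 3,
) -> str:
    """Generate several candidates, score each against the criteria, return the best.

    `generate` and `critique` are hypothetical stand-ins for LLM calls.
    """
    candidates = [generate(prompt) for _ in range(num_candidates)]
    scores = [critique(prompt, c, criteria) for c in candidates]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index]


# Toy stand-ins (assumptions for illustration only).
_canned = iter(["Paris.", "The capital of France is Paris.", "I am not sure."])


def toy_generate(prompt: str) -> str:
    # Returns the next canned answer instead of sampling from a model.
    return next(_canned)


def toy_critique(prompt: str, response: str, criteria: Sequence) -> float:
    # Counts how many criteria (simple predicates here) the response satisfies.
    return sum(1 for check in criteria if check(response))


# Criteria may differ per input; these two are made up for this prompt.
criteria = [
    lambda r: "Paris" in r,          # mentions the expected entity
    lambda r: len(r.split()) >= 4,   # is a full sentence rather than a fragment
]

best = select_response(
    "What is the capital of France?", toy_generate, toy_critique, criteria
)
print(best)
```

In a training-stage setting, the selected response would be stored as a training instance rather than shown to a user; that storage step is omitted here.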
-
Publication number: US20230359898A1
Publication date: 2023-11-09
Application number: US18350464
Filing date: 2023-07-11
Applicant: Google LLC
Inventor: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu
CPC classification number: G06N3/084, G06N20/00, G10L15/16, G10L15/063, G10L15/12, G06V10/7747, G10L15/28, G06V10/82, G06F18/2148
Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
-
Publication number: US20230074406A1
Publication date: 2023-03-09
Application number: US17532794
Filing date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Vikram Sridar, Daniel De Freitas Adiwardana, Noam M. Shazeer, Quoc Le
IPC: G10L15/183, G10L15/22
Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query; determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query; process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs; and cause a given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
-
Publication number: US11301733B2
Publication date: 2022-04-12
Application number: US16416848
Filing date: 2019-05-20
Applicant: Google LLC
Inventor: Jon Shlens, Ekin Dogus Cubuk, Quoc Le, Tsung-Yi Lin, Barret Zoph, Golnaz Ghiasi
Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
-
Publication number: US20250086405A1
Publication date: 2025-03-13
Application number: US18481803
Filing date: 2023-10-05
Applicant: GOOGLE LLC
Inventor: Swaroop Mishra, Ragha Kotikalapudi, Obaid Sarvana, Sahitya Potluri, YaGuang Li, Taylor Bos, Steven Zheng, Hanzhao Lin, Chenkai Kuang, Heng-Tze Cheng, Ed H. Chi, Quoc Le
Abstract: Some implementations relate to generating a training and/or evaluation dataset with LLM prompts (e.g., derived from user queries) based on a prompt complexity. An input prompt, for example derived from a user query, is received. The input prompt is decomposed into a prompt tree comprising a plurality of nodes. The plurality of nodes comprise: a plurality of leaf nodes corresponding to simple sub-prompts of the input query; a plurality of branch nodes of sub-prompts each corresponding to multiple simple sub-prompts; and a root node corresponding to the input prompt. A prompt complexity is determined based on a path length of the prompt tree. The prompt complexity is compared to a threshold complexity. If the prompt complexity is above the threshold complexity, the input prompt is included in a set of training prompts and/or a set of evaluation prompts.
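The complexity filter described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the abstract does not say which path length is used, so the sketch assumes the longest root-to-leaf path measured in edges; `PromptNode`, `path_length`, and `select_complex_prompts` are hypothetical names; and the decomposition step itself is skipped, with the trees built by hand.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptNode:
    """A node in a prompt tree: root = full prompt, leaves = simple sub-prompts."""
    text: str
    children: List["PromptNode"] = field(default_factory=list)


def path_length(node: PromptNode) -> int:
    """Longest root-to-leaf path in edges (assumed definition of complexity)."""
    if not node.children:
        return 0
    return 1 + max(path_length(child) for child in node.children)


def select_complex_prompts(trees: List[PromptNode], threshold: int) -> List[str]:
    """Keep only prompts whose complexity is above the threshold."""
    return [tree.text for tree in trees if path_length(tree) > threshold]


# Hand-built trees standing in for decomposed prompts.
simple = PromptNode("What is the capital of France?")
compound = PromptNode(
    "Plan a two-city trip and compare hotel prices",
    [
        PromptNode(
            "Plan a two-city trip",
            [PromptNode("Pick the cities"), PromptNode("Order the cities")],
        ),
        PromptNode("Compare hotel prices"),
    ],
)

selected = select_complex_prompts([simple, compound], threshold=1)
print(selected)
```

Here the single-node prompt has complexity 0 and is dropped, while the two-level prompt has complexity 2 and is kept for the training or evaluation set.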
-