Learning Data Augmentation Strategies for Object Detection

    Publication number: US20230274532A1

    Publication date: 2023-08-31

    Application number: US18313772

    Filing date: 2023-05-08

    Applicant: Google LLC

    Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.
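
    As an illustrative sketch (not the patent's implementation), one step of such a controller, sampling a sub-policy from a search space that mixes box-preserving and box-moving operations, might look like the following. The operation names, probability grid, and magnitude range are assumptions for illustration:

```python
import random

# Hypothetical search space: "color" operations leave the bounding shape
# unchanged; "geometric" operations move the object and its box together.
COLOR_OPS = ["equalize", "solarize", "brightness"]
GEOMETRIC_OPS = ["translate_x", "shear_y", "flip"]

def sample_subpolicy(rng, num_ops=2):
    """Controller step: pick a sequence of (operation, probability, magnitude)."""
    policy = []
    for _ in range(num_ops):
        policy.append({
            "op": rng.choice(COLOR_OPS + GEOMETRIC_OPS),
            "prob": rng.choice([0.2, 0.4, 0.6, 0.8]),  # discretized grid
            "magnitude": rng.randrange(10),            # discretized strength
        })
    return policy

rng = random.Random(0)
policy = sample_subpolicy(rng)
print(policy)
```

    In the iterative scheme the abstract describes, the detection model would be trained with each sampled policy and its validation performance fed back as a reward to update the controller.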

    Learning Data Augmentation Strategies for Object Detection

    Publication number: US20220215682A1

    Publication date: 2022-07-07

    Application number: US17702438

    Filing date: 2022-03-23

    Applicant: Google LLC

    Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.

    Neural architecture search with factorized hierarchical search space

    Publication number: US12293276B2

    Publication date: 2025-05-06

    Application number: US18430483

    Filing date: 2024-02-01

    Applicant: Google LLC

    Abstract: The present disclosure is directed to an automated neural architecture search approach for designing new neural network architectures such as, for example, resource-constrained mobile CNN models. In particular, the present disclosure provides systems and methods to perform neural architecture search using a novel factorized hierarchical search space that permits layer diversity throughout the network, thereby striking the right balance between flexibility and search space size. The resulting neural architectures are able to be run relatively faster and using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art mobile-optimized models.
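
    A minimal sketch of sampling from a factorized, per-block search space, where each block chooses its own layer configuration rather than repeating one cell network-wide. The choice lists below are illustrative assumptions, not the patent's actual space:

```python
import random

# Illustrative per-block options (assumed for this sketch).
KERNELS = [3, 5]          # convolution kernel sizes
EXPANSIONS = [1, 3, 6]    # inverted-bottleneck expansion ratios
LAYER_COUNTS = [1, 2, 3, 4]

def sample_architecture(rng, num_blocks=5):
    """Each block samples its configuration independently, so layer
    diversity across the network is possible while the search space
    stays far smaller than a fully per-layer search."""
    return [
        {
            "kernel": rng.choice(KERNELS),
            "expansion": rng.choice(EXPANSIONS),
            "layers": rng.choice(LAYER_COUNTS),
        }
        for _ in range(num_blocks)
    ]

rng = random.Random(1)
arch = sample_architecture(rng)
print(arch)
```

    A search procedure would then train or estimate the quality of each sampled architecture under a latency or resource constraint and steer sampling toward the best trade-offs.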

    REDUCING COMPUTATIONAL RESOURCE USAGE VIA TRAINING AND/OR UTILIZING LARGE LANGUAGE MODELS

    Publication number: US20240378394A1

    Publication date: 2024-11-14

    Application number: US18231650

    Filing date: 2023-08-08

    Applicant: Google LLC

    Abstract: Implementations described herein relate to using self-evaluation when utilizing a large language model (LLM) to generate a response to a natural language (NL) based input. The LLM can be used to process the NL based input to generate a plurality of responses, and to generate a critique of those responses by comparing the responses to a set of response evaluation criteria. One of the responses can then be selected based on the comparison with the set of response evaluation criteria, which can vary from one NL based input to another. If the NL based input was obtained from a user of a client device during an inference stage, then the selected response can be rendered for presentation to the user. If the NL based input was obtained during a training stage, then the selected response can be stored as a training instance, optionally in association with additional data.
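
    The generate-critique-select loop can be sketched as follows; llm_generate and llm_score stand in for real LLM calls and are assumptions for illustration (here the toy "criteria" just favor a target response length):

```python
def select_response(llm_generate, llm_score, nl_input, criteria, n=3):
    """Sample several candidate responses, critique each against the
    per-input evaluation criteria, and keep the best-scoring one."""
    candidates = [llm_generate(nl_input) for _ in range(n)]
    scored = [(llm_score(c, criteria), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]

# Toy stand-ins for the LLM calls:
responses = iter(["too long answer ...", "concise answer", "ok"])
pick = select_response(
    llm_generate=lambda _inp: next(responses),
    llm_score=lambda resp, crit: -abs(len(resp) - crit["target_len"]),
    nl_input="What is 2+2?",
    criteria={"target_len": 14},
)
print(pick)  # the candidate closest to the target length
```

    Generating and critiquing in one pass, then reusing the selected response as a training instance, is how the abstract ties the same mechanism to both inference and training.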

    USING LARGE LANGUAGE MODEL(S) IN GENERATING AUTOMATED ASSISTANT RESPONSE(S)

    Publication number: US20230074406A1

    Publication date: 2023-03-09

    Application number: US17532794

    Filing date: 2021-11-22

    Applicant: Google LLC

    Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
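
    A toy sketch of the modify-then-render flow; llm_rewrite and persona_prefix are hypothetical stand-ins for the LLM-based rewriting (which, per the abstract, could run offline ahead of time or online per utterance):

```python
def modify_outputs(llm_rewrite, assistant_outputs, dialog_context):
    """Turn the stock assistant outputs into a set of modified outputs
    tailored to the dialog context; one of these would then be chosen
    and rendered to the user."""
    return [llm_rewrite(out, dialog_context) for out in assistant_outputs]

# Toy rewrite: a real system would apply LLM output(s) here.
outs = modify_outputs(
    llm_rewrite=lambda out, ctx: f"{ctx['persona_prefix']} {out}",
    assistant_outputs=["It is 72 degrees.", "Around 72F today."],
    dialog_context={"persona_prefix": "Sure thing!"},
)
print(outs[0])
```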

    Learning data augmentation strategies for object detection

    Publication number: US11301733B2

    Publication date: 2022-04-12

    Application number: US16416848

    Filing date: 2019-05-20

    Applicant: Google LLC

    Abstract: Example aspects of the present disclosure are directed to systems and methods for learning data augmentation strategies for improved object detection model performance. In particular, example aspects of the present disclosure are directed to iterative reinforcement learning approaches in which, at each of a plurality of iterations, a controller model selects a series of one or more augmentation operations to be applied to training images to generate augmented images. For example, the controller model can select the augmentation operations from a defined search space of available operations which can, for example, include operations that augment the training image without modification of the locations of a target object and corresponding bounding shape within the image and/or operations that do modify the locations of the target object and bounding shape within the training image.

    PROMPT COMPLEXITY FOR LARGE LANGUAGE MODELS

    Publication number: US20250086405A1

    Publication date: 2025-03-13

    Application number: US18481803

    Filing date: 2023-10-05

    Applicant: Google LLC

    Abstract: Some implementations relate to generating a training and/or evaluation dataset with LLM prompts (e.g., derived from user queries) based on a prompt complexity. An input prompt, for example derived from a user query, is received. The input prompt is decomposed into a prompt tree comprising a plurality of nodes. The plurality of nodes comprise: a plurality of leaf nodes corresponding to simple sub-prompts of the input query; a plurality of branch nodes of sub-prompts each corresponding to multiple simple sub-prompts; and a root node corresponding to the input prompt. A prompt complexity is determined based on a path length of the prompt tree. The prompt complexity is compared to a threshold complexity. If the prompt complexity is above the threshold complexity, the input prompt is included in a set of training prompts and/or a set of evaluation prompts.
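
    The decomposition and threshold check can be sketched as follows, using tree depth as a simple proxy for the path-length-based complexity. The Node class and the example decomposition are illustrative assumptions, not the patent's actual representation:

```python
class Node:
    """One node of the prompt tree: leaves are simple sub-prompts,
    branches combine them, and the root is the full input prompt."""
    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)

def path_length(node):
    """Longest root-to-leaf path in the prompt tree."""
    if not node.children:
        return 1
    return 1 + max(path_length(c) for c in node.children)

def keep_for_training(root, threshold):
    """Include the prompt only if its complexity exceeds the threshold."""
    return path_length(root) > threshold

# "Compare A and B, then summarize" decomposed into sub-prompts:
branch = Node("compare A and B", [Node("describe A"), Node("describe B")])
root = Node("compare A and B, then summarize", [branch, Node("summarize")])

print(path_length(root))                     # 3
print(keep_for_training(root, threshold=2))  # True
```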
