Modeling of Long-Range Interactions with Reduced Feature Materialization via Lambda Functions

    Publication Number: US20230229886A1

    Publication Date: 2023-07-20

    Application Number: US18011636

    Filing Date: 2021-07-07

    Applicant: Google LLC

    Inventor: Irwan Bello

    CPC classification number: G06N3/04 G06N3/08

    Abstract: The present disclosure provides systems, methods, and computer program products for performing modeling of long-range interactions with reduced feature materialization, for example, in machine learning models. A computer-implemented method may include receiving a layer input comprising input data and context data, generating one or more lambda functions based, at least in part, on a content function and a position function for each of a plurality of context elements in the context data, and applying one or more of the generated lambda functions to the input data in association with generating a layer output associated with a respective lambda layer. Experimental results for image classification with ResNet and for object detection with RetinaNet show that examples of the present disclosure significantly outperform convolutional and attentional counterparts in both accuracy and efficiency.
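
    The mechanism the abstract describes builds a content term shared by all queries and a positional term per query, so the full query-context attention map is never materialized. Below is a minimal, single-headed PyTorch sketch of that idea; it is an illustration under stated assumptions, not the patent's own code, and names such as LambdaLayer, dim_k, and pos_emb (here a full learned position table rather than relative embeddings) are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LambdaLayer(nn.Module):
    # Builds a "lambda" from the context once, then applies it to every
    # query, so the n x m attention map is never stored.
    def __init__(self, dim, dim_k, n, m):
        super().__init__()
        self.to_q = nn.Linear(dim, dim_k, bias=False)
        self.to_k = nn.Linear(dim, dim_k, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        # Learned position embeddings, content-independent (assumption:
        # a dense (n, m, dim_k) table for clarity).
        self.pos_emb = nn.Parameter(torch.randn(n, m, dim_k))

    def forward(self, x, context):
        # x: (b, n, dim) input elements; context: (b, m, dim) context elements.
        q = self.to_q(x)                              # (b, n, k)
        k = F.softmax(self.to_k(context), dim=1)      # normalize over context
        v = self.to_v(context)                        # (b, m, d)
        # Content function: one (k, d) matrix shared by all queries.
        content_lam = torch.einsum('bmk,bmd->bkd', k, v)
        # Position function: one (k, d) matrix per query position.
        position_lam = torch.einsum('nmk,bmd->bnkd', self.pos_emb, v)
        return torch.einsum('bnk,bkd->bnd', q, content_lam) \
             + torch.einsum('bnk,bnkd->bnd', q, position_lam)
```

    Because the content lambda is a single (dim_k, dim) matrix computed once from the context, memory scales with dim_k times dim rather than with the n-by-m attention map.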

    Modeling Dependencies with Global Self-Attention Neural Networks

    Publication Number: US20230359865A1

    Publication Date: 2023-11-09

    Application Number: US18044842

    Filing Date: 2020-09-16

    Applicant: Google LLC

    CPC classification number: G06N3/045 G06N3/084

    Abstract: The present disclosure provides systems, methods, and computer program products for modeling dependencies throughout a network using a global self-attention model with a content attention layer and a positional attention layer that operate in parallel. The model receives input data comprising content values and context positions. The content attention layer generates one or more output features for each context position based on a global attention operation applied to the content values independent of the context positions. The positional attention layer generates an attention map for each of the context positions based on one or more content values of the respective context position and associated neighboring positions. Output is determined based on the output features generated by the content attention layer and the attention map generated for each context position by the positional attention layer. The model improves efficiency and can be used throughout a deep network.
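
    Per the abstract, the two branches run in parallel: the content branch attends globally over values while ignoring positions, and the positional branch builds a per-position attention map over a local neighborhood. The following PyTorch sketch is one plausible reading under simplifying assumptions (single head, a flattened 1-D feature map, a fixed window; all names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalSelfAttention(nn.Module):
    # Parallel content and positional attention branches.
    def __init__(self, dim, dim_k, window=7):
        super().__init__()
        self.dim, self.dim_k, self.window = dim, dim_k, window
        self.to_qkv = nn.Linear(dim, 2 * dim_k + dim, bias=False)
        # Relative-position embeddings for the local neighborhood (assumption).
        self.rel_emb = nn.Parameter(torch.randn(window, dim_k))

    def forward(self, x):
        # x: (b, n, dim) -- a flattened feature map.
        q, k, v = self.to_qkv(x).split([self.dim_k, self.dim_k, self.dim], dim=-1)
        # Content branch: global attention over content values, positions
        # ignored; softmax over the context axis keeps the cost linear in n.
        content = q @ (F.softmax(k, dim=1).transpose(1, 2) @ v)          # (b, n, d)
        # Positional branch: an attention map per position over its window.
        pad = self.window // 2
        neighbors = F.pad(v, (0, 0, pad, pad)).unfold(1, self.window, 1)  # (b, n, d, w)
        attn = F.softmax(torch.einsum('bnk,wk->bnw', q, self.rel_emb), dim=-1)
        positional = torch.einsum('bnw,bndw->bnd', attn, neighbors)
        return content + positional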

    NEURAL NETWORK OPTIMIZER SEARCH
    Invention Application

    Publication Number: US20210271970A1

    Publication Date: 2021-09-02

    Application Number: US17145524

    Filing Date: 2021-01-11

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining update rules for training neural networks. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch defining a respective update rule; for each output sequence in the batch: training a respective instance of a child neural network using the update rule defined by the output sequence; evaluating a performance of the trained instance of the child neural network on a particular neural network task to determine a performance metric for the trained instance; and using the performance metrics for the trained instances of the child neural network to adjust the current values of the controller parameters of the controller neural network.
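
    The method in this abstract is an outer search loop: the controller proposes a batch of output sequences (update rules), each rule trains a child network, and the resulting performance metrics adjust the controller. A skeletal Python sketch of that loop, in which every callable (sample_update_rule, train_child, evaluate, controller.update) is a hypothetical placeholder rather than an API from the patent:

```python
def optimizer_search(controller, sample_update_rule, train_child, evaluate,
                     batch_size=8, search_steps=100):
    """Hedged sketch of the search loop described in the abstract."""
    for _ in range(search_steps):
        rules, metrics = [], []
        for _ in range(batch_size):
            rule = sample_update_rule(controller)  # output sequence -> update rule
            child = train_child(rule)              # train a child net with the rule
            metrics.append(evaluate(child))        # performance on the target task
            rules.append(rule)
        # Adjust controller parameters from the metrics, e.g. via a
        # policy-gradient step treating metrics as rewards (assumption).
        controller.update(rules, metrics)
    return controller
```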

    Neural network optimizer search
    Invention Grant

    Publication Number: US10922611B2

    Publication Date: 2021-02-16

    Application Number: US16662924

    Filing Date: 2019-10-24

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining update rules for training neural networks. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch defining a respective update rule; for each output sequence in the batch: training a respective instance of a child neural network using the update rule defined by the output sequence; evaluating a performance of the trained instance of the child neural network on a particular neural network task to determine a performance metric for the trained instance; and using the performance metrics for the trained instances of the child neural network to adjust the current values of the controller parameters of the controller neural network.

    NEURAL NETWORK OPTIMIZER SEARCH
    Invention Application

    Publication Number: US20200057941A1

    Publication Date: 2020-02-20

    Application Number: US16662924

    Filing Date: 2019-10-24

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining update rules for training neural networks. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch defining a respective update rule; for each output sequence in the batch: training a respective instance of a child neural network using the update rule defined by the output sequence; evaluating a performance of the trained instance of the child neural network on a particular neural network task to determine a performance metric for the trained instance; and using the performance metrics for the trained instances of the child neural network to adjust the current values of the controller parameters of the controller neural network.

    FULLY ATTENTIONAL COMPUTER VISION

    Publication Number: US20220215654A1

    Publication Date: 2022-07-07

    Application Number: US17606976

    Filing Date: 2020-05-22

    Applicant: Google LLC

    Abstract: A system implemented as computer programs on one or more computers in one or more locations that implements a computer vision model is described. The computer vision model includes a positional local self-attention layer that is configured to receive an input feature map and to generate an output feature map. For each input element in the input feature map, the positional local self-attention layer generates a respective output element for the output feature map by generating a memory block including neighboring input elements around the input element, generating a query vector using the input element and a query weight matrix, performing positional local self-attention operations for each neighboring element in the memory block to generate a temporary output element, and generating the respective output element by summing the temporary output elements of the neighboring elements in the memory block.
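
    The abstract walks through one output element at a time: gather a memory block of neighbors, project a query, attend over the block with positional information, and sum the temporary outputs. A vectorized single-head PyTorch sketch of that procedure, assuming a flattened 1-D feature map and learned relative-position embeddings (all names are illustrative, not from the patent):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionalLocalSelfAttention(nn.Module):
    # Each output element attends over a memory block of neighboring
    # input elements, with relative-position terms in the logits.
    def __init__(self, dim, window=5):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)  # query weight matrix
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.rel_emb = nn.Parameter(torch.randn(window, dim))
        self.window = window

    def forward(self, x):
        # x: (b, n, dim) input feature map.
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        pad = self.window // 2
        # Memory blocks: each position's window of neighboring elements.
        k_blk = F.pad(k, (0, 0, pad, pad)).unfold(1, self.window, 1)  # (b, n, d, w)
        v_blk = F.pad(v, (0, 0, pad, pad)).unfold(1, self.window, 1)  # (b, n, d, w)
        # Logits: query . neighbor key, plus query . relative position embedding.
        logits = torch.einsum('bnd,bndw->bnw', q, k_blk) \
               + torch.einsum('bnd,wd->bnw', q, self.rel_emb)
        attn = F.softmax(logits, dim=-1)
        # Temporary outputs weighted and summed over the memory block.
        return torch.einsum('bnw,bndw->bnd', attn, v_blk)
```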
