Neural Networks based Multimodal Transformer for Multi-Task User Interface Modeling

    Publication No.: US20230031702A1

    Publication Date: 2023-02-02

    Application No.: US17812208

    Filing Date: 2022-07-13

    Applicant: Google LLC

    Abstract: A method includes receiving, via a computing device, a screenshot of a display provided by a graphical user interface of the computing device. The method also includes generating, by an image-structure transformer of a neural network, a representation by fusing a first embedding based on the screenshot and a second embedding based on a layout of virtual objects in the screenshot. The method additionally includes predicting, by the neural network and based on the generated representation, a modeling task output associated with the graphical user interface. The method further includes providing, by the computing device, the predicted modeling task output.
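The abstract does not say how the two embeddings are fused. As a minimal sketch, assuming the fusion is done by concatenating the screenshot tokens and the layout tokens and running one self-attention step over the joint sequence, the idea might look like the following. The function names (`self_attention`, `fuse`), the identity projections, and the toy vectors are illustrative assumptions, not details from the patent.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (a simplification)."""
    dim = len(tokens[0])
    fused = []
    for query in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output token is a weighted average of all input tokens.
        fused.append(
            [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
        )
    return fused


def fuse(image_tokens, layout_tokens):
    """Assumed fusion: concatenate both sequences, then attend across them."""
    return self_attention(image_tokens + layout_tokens)


# Toy embeddings (hypothetical values): two screenshot patches, one layout object.
image_tokens = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1]]
layout_tokens = [[0.0, 1.0, 0.3]]
representation = fuse(image_tokens, layout_tokens)
```

Because every output token attends over both sequences, each row of `representation` mixes image and layout information; a task-specific head would then consume this representation to produce the modeling task output.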

    DECREASING NEURAL NETWORK INFERENCE TIMES USING SOFTMAX APPROXIMATION

    Publication No.: US20200104686A1

    Publication Date: 2020-04-02

    Application No.: US16586702

    Filing Date: 2019-09-27

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for decreasing neural network inference times using softmax approximation. One of the methods includes maintaining data specifying a respective softmax weight vector for each output in a vocabulary of possible neural network outputs; receiving a neural network input; processing the neural network input using one or more initial neural network layers to generate a context vector for the neural network input; and generating an approximate score distribution over the vocabulary of possible neural network outputs for the neural network input, comprising: processing the context vector using a screening model configured to predict a proper subset of the vocabulary for the context input; and generating a respective logit for each output that is in the proper subset, comprising applying the softmax weight vector for the output to the context vector.
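The steps in this abstract can be sketched roughly as follows. The screening model here is a hypothetical nearest-centroid lookup that maps a context vector to a precomputed candidate subset; the abstract does not specify its form. Giving a score of zero to outputs outside the subset is likewise an assumption made for illustration, as are all names and toy values.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def screen(context, centroids, cluster_candidates):
    """Hypothetical screening model: return the candidate subset of the
    centroid that best matches the context vector."""
    best = max(range(len(centroids)), key=lambda i: dot(centroids[i], context))
    return cluster_candidates[best]


def approximate_scores(context, softmax_weights, centroids, cluster_candidates):
    """Compute logits only for the screened subset, then normalize over it."""
    candidates = screen(context, centroids, cluster_candidates)
    # Apply each candidate's softmax weight vector to the context vector.
    logits = {out: dot(softmax_weights[out], context) for out in candidates}
    peak = max(logits.values())
    exps = {out: math.exp(value - peak) for out, value in logits.items()}
    total = sum(exps.values())
    # Assumption: outputs outside the subset receive a score of zero.
    scores = [0.0] * len(softmax_weights)
    for out, e in exps.items():
        scores[out] = e / total
    return scores


# Toy vocabulary of four outputs with 2-dimensional weight vectors.
softmax_weights = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
centroids = [[1.0, 0.0], [0.0, 1.0]]
cluster_candidates = [[0, 1], [2, 3]]  # proper subsets of the vocabulary

context = [0.9, 0.1]
scores = approximate_scores(context, softmax_weights, centroids, cluster_candidates)
```

In this toy run only two of the four dot products are computed, which is where the inference-time saving would come from when the vocabulary is large and the screened subset is small.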

    IMAGE PROCESSING NEURAL NETWORKS WITH DYNAMIC FILTER ACTIVATION

    Publication No.: US20220004849A1

    Publication Date: 2022-01-06

    Application No.: US17295561

    Filing Date: 2019-11-20

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using neural networks. One of the methods includes receiving a network input; processing the network input through a gater neural network to generate a gating vector that includes a respective value for each of a plurality of filters; determining, from the gating vector and for each of the plurality of filters, whether the filter is active or inactive; and processing the network input through the main convolutional neural network to generate an image processing output, comprising, for each convolutional layer in the first plurality of convolutional layers: receiving an input feature map for the convolutional layer; and generating an output feature map, the generating comprising: for each filter of the convolutional layer that is inactive: setting the output channel for the filter to have all zero elements.
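A minimal sketch of the gating step for one convolutional layer is shown below. It assumes the gating vector has already been produced by the gater network, that a filter is active when its gate value exceeds a fixed threshold, and that the input feature map has a single channel. The threshold rule, the function names, and the toy values are illustrative assumptions rather than details from the abstract.

```python
def conv2d_valid(feature_map, kernel):
    """Plain 2D cross-correlation with no padding on a single-channel input."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(feature_map) - kh + 1
    out_w = len(feature_map[0]) - kw + 1
    return [
        [
            sum(
                feature_map[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def gated_conv_layer(feature_map, filters, gating_vector, threshold=0.5):
    """Run active filters; emit an all-zero output channel for inactive ones.

    Assumes every filter has the same kernel size.
    """
    kh, kw = len(filters[0]), len(filters[0][0])
    out_h = len(feature_map) - kh + 1
    out_w = len(feature_map[0]) - kw + 1
    output = []
    for kernel, gate in zip(filters, gating_vector):
        if gate > threshold:  # assumed activation rule
            output.append(conv2d_valid(feature_map, kernel))
        else:
            # Inactive filter: skip the convolution entirely.
            output.append([[0.0] * out_w for _ in range(out_h)])
    return output


feature_map = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filters = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
gating_vector = [0.9, 0.1]  # hypothetical gater output: first active, second inactive

output = gated_conv_layer(feature_map, filters, gating_vector)
# output[0] == [[6, 8], [12, 14]]; output[1] is all zeros
```

The saving comes from never evaluating the convolution for inactive filters, while the output keeps its full channel count so downstream layers see the expected shape.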
