Training neural networks using synthetic gradients

    公开(公告)号:US11715009B2

    公开(公告)日:2023-08-01

    申请号:US16303595

    申请日:2017-05-19

    CPC classification number: G06N3/084 G06N3/044 G06N3/045

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network including a first subnetwork followed by a second subnetwork on training inputs by optimizing an objective function. In one aspect, a method includes processing a training input using the neural network to generate a training model output, including processing a subnetwork input for the training input using the first subnetwork to generate a subnetwork activation for the training input in accordance with current values of parameters of the first subnetwork, and providing the subnetwork activation as input to the second subnetwork; determining a synthetic gradient of the objective function for the first subnetwork by processing the subnetwork activation using a synthetic gradient model in accordance with current values of parameters of the synthetic gradient model; and updating the current values of the parameters of the first subnetwork using the synthetic gradient.

    Classifying input examples using a comparison set

    公开(公告)号:US11714993B2

    公开(公告)日:2023-08-01

    申请号:US17223523

    申请日:2021-04-06

    CPC classification number: G06N3/044 G06F18/217 G06F18/22 G06F18/2413 G06N3/08

    Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.

    GENERATING PREDICTION OUTPUTS USING DYNAMIC GRAPHS

    公开(公告)号:US20210383228A1

    公开(公告)日:2021-12-09

    申请号:US17338974

    申请日:2021-06-04

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating prediction outputs characterizing a set of entities. In one aspect, a method comprises: obtaining data defining a graph, comprising: (i) a set of nodes, wherein each node represents a respective entity from the set of entities, (ii) a current set of edges, wherein each edge connects a pair of nodes, and (iii) a respective current embedding of each node; at each of a plurality of time steps: updating the respective current embedding of each node, comprising processing data defining the graph using a graph neural network; and updating the current set of edges based at least in part on the updated embeddings of the nodes; and at one or more of the plurality of time steps: generating a prediction output characterizing the set of entities based on the current embeddings of the nodes.

    Classifying input examples using a comparison set

    公开(公告)号:US10997472B2

    公开(公告)日:2021-05-04

    申请号:US16303510

    申请日:2017-05-19

    Abstract: Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.

Patent Agency Ranking