    Machine Learning Ranking Distillation

    Publication number: US20250077934A1

    Publication date: 2025-03-06

    Application number: US17927105

    Application date: 2022-09-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training and using distilled machine learning models. In one aspect, a method includes obtaining a first input that includes training example sets, each of which includes, for each item in the set, one or more feature values and an outcome label that represents whether the item had a positive outcome. A first machine learning model is trained using the first input; it is configured to generate a set of scores, each representing whether an item will have a positive outcome when presented in the context of the training example set alongside each other item in that set. A distilled machine learning model is then trained using the set of scores for each example set. The distilled machine learning model is configured to generate a distilled score.
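
    To make the distillation setup concrete, below is a minimal sketch in a PyTorch style: a listwise "teacher" scores each item while attending over its whole example set, and a pointwise "student" is trained to match the teacher's per-set score distribution. All class and function names are hypothetical and the architecture is illustrative, not the patented design.

        # Hedged sketch of ranking distillation; names and architecture are
        # assumptions, not taken from the patent.
        import torch
        import torch.nn as nn

        class ListwiseTeacher(nn.Module):
            """Scores every item in the context of the full item set."""
            def __init__(self, feat_dim, hidden=32):
                super().__init__()
                self.attn = nn.MultiheadAttention(feat_dim, num_heads=1,
                                                  batch_first=True)
                self.head = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                          nn.Linear(hidden, 1))

            def forward(self, items):              # items: (batch, set_size, feat_dim)
                ctx, _ = self.attn(items, items, items)
                return self.head(ctx).squeeze(-1)  # (batch, set_size) scores

        class PointwiseStudent(nn.Module):
            """Scores each item without seeing the rest of the set."""
            def __init__(self, feat_dim, hidden=32):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                         nn.Linear(hidden, 1))

            def forward(self, items):
                return self.net(items).squeeze(-1)

        def distill_step(teacher, student, opt, items):
            with torch.no_grad():
                t_scores = teacher(items)          # soft targets from the teacher
            s_scores = student(items)
            # Match per-set score distributions with a listwise KL divergence.
            loss = nn.functional.kl_div(s_scores.log_softmax(dim=-1),
                                        t_scores.softmax(dim=-1),
                                        reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
            return loss.item()

    At serving time only the student runs, so each item can be scored without seeing the rest of its set.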

    Minimum Deep Learning with Gating Multiplier

    Publication number: US20240005166A1

    Publication date: 2024-01-04

    Application number: US18467207

    Application date: 2023-09-14

    Applicant: Google LLC

    Inventor: Gil Shamir

    CPC classification number: G06N3/084 G06N3/04

    Abstract: Systems and methods according to the present disclosure can employ a computer-implemented method for inference using a machine-learned model. The method can be implemented by a computing system having one or more computing devices. The method can include obtaining data descriptive of a neural network including one or more network units and one or more gating paths, wherein each of the gating path(s) includes one or more gating units. The method can include obtaining data descriptive of one or more input features. The method can include determining one or more network unit outputs from the network unit(s) based at least in part on the input feature(s). The method can include determining one or more gating values from the gating path(s). The method can include determining one or more gated network unit outputs based at least in part on a combination of the network unit output(s) and the gating value(s).
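
    As a rough illustration of the gated-unit idea, the sketch below pairs a main network unit with a gating path whose output multiplies the unit's output. The module name and layer sizes are assumptions; the patent's gating paths may be structured differently.

        # Hedged sketch of a gated network unit; hypothetical names.
        import torch
        import torch.nn as nn

        class GatedUnit(nn.Module):
            def __init__(self, in_dim, out_dim):
                super().__init__()
                self.unit = nn.Linear(in_dim, out_dim)    # main network unit
                self.gate = nn.Sequential(                # gating path
                    nn.Linear(in_dim, out_dim),
                    nn.Sigmoid())                         # gating values in (0, 1)

            def forward(self, x):
                # Gated network unit output: the unit's output scaled
                # elementwise by the gating values.
                return self.unit(x) * self.gate(x)

    For example, GatedUnit(8, 16)(torch.randn(4, 8)) yields a (4, 16) tensor whose entries are the unit outputs scaled by their gating values.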

    Asymmetric functionality activation for improved stability in neural networks

    Publication number: US11475309B2

    Publication date: 2022-10-18

    Application number: US16847846

    Application date: 2020-04-14

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Aspects of the present disclosure address model “blow up” by changing the functionality of the activation, thereby giving “dead” or “dying” neurons the ability to recover. As one example, for activation functions that have an input region in which the neuron is turned off by a zero or near-zero gradient, a training computing system can keep the neuron turned off when the gradient pushes the unit farther into that region (e.g., by applying an update with zero or reduced magnitude). However, if the gradient for the current training example (or batch) attempts to push the unit toward a region in which the neuron is active again, the system can allow a non-zero gradient (e.g., by applying an update with standard or increased magnitude).
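
    One way to read the recovery rule into code is a ReLU with an asymmetric backward pass, sketched below in PyTorch. This is a minimal sketch assuming the simplest form of the described behavior; the patent's exact update rule may differ.

        # Hedged sketch: in the off region (x < 0), suppress gradients that
        # would push the pre-activation deeper off, but pass through gradients
        # that pull it back toward the active region so the neuron can recover.
        import torch

        class AsymmetricReLU(torch.autograd.Function):
            @staticmethod
            def forward(ctx, x):
                ctx.save_for_backward(x)
                return x.clamp(min=0.0)

            @staticmethod
            def backward(ctx, grad_out):
                (x,) = ctx.saved_tensors
                grad_in = grad_out.clone()
                off = x < 0
                # Under gradient descent (x -= lr * dL/dx), a positive gradient
                # decreases x, pushing an off unit farther into the off region,
                # so it is zeroed. A negative gradient increases x toward the
                # active region, so it is allowed through.
                grad_in[off & (grad_out > 0)] = 0.0
                return grad_in

    Applied as y = AsymmetricReLU.apply(x), this matches a standard ReLU in the active region but lets “dead” units receive recovery gradients instead of staying stuck at zero.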

    Approximate Bayesian Logistic Regression For Sparse Online Learning

    Publication number: US20220108219A1

    Publication date: 2022-04-07

    Application number: US17492046

    Application date: 2021-10-01

    Applicant: Google LLC

    Abstract: Systems and methods leverage low-complexity (e.g., linear overall, fixed cost per example) analytical approximations to solve machine learning problems such as the sparse online logistic regression problem. Unlike variational inference and other methods, the proposed systems and methods lead to analytical closed forms, lowering the practical number of computations. Further, unlike techniques used for dense feature sets, such as Gaussian mixtures, the proposed systems and methods handle sparse problems with huge feature sets without increasing complexity. With the analytical closed forms, there is also no need to apply stochastic gradient methods to surrogate losses, nor to tune and balance the learning and regularization parameters of such methods.
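
    The fixed-per-example, closed-form flavor can be illustrated with a diagonal Gaussian posterior over weights and an online Laplace-style update, as sketched below. This particular approximation is an assumption chosen for illustration; the closed forms in the patent may differ.

        # Hedged sketch: per-feature Gaussian posterior (mean, variance) with a
        # closed-form online update; only the nonzero features of each example
        # are touched, so the per-example cost is fixed in the sparsity.
        import math
        from collections import defaultdict

        class SparseOnlineBayesLR:
            def __init__(self, prior_var=1.0):
                self.mean = defaultdict(float)              # posterior means
                self.var = defaultdict(lambda: prior_var)   # posterior variances

            def predict(self, x):                           # x: {feature: value}
                margin = sum(self.mean[f] * v for f, v in x.items())
                return 1.0 / (1.0 + math.exp(-margin))

            def update(self, x, y):                         # y in {0, 1}
                p = self.predict(x)
                for f, v in x.items():                      # O(nnz(x)) per example
                    g = (y - p) * v                         # log-likelihood gradient
                    h = p * (1.0 - p) * v * v               # curvature (Hessian diag.)
                    self.mean[f] += self.var[f] * g         # closed-form mean shift
                    self.var[f] /= 1.0 + self.var[f] * h    # variance shrinks with evidence

    No surrogate loss or stochastic-gradient tuning appears anywhere: each example triggers one analytical mean/variance update per nonzero feature.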

    Regularization of machine learning models

    Publication number: US10600000B2

    Publication date: 2020-03-24

    Application number: US15368447

    Application date: 2016-12-02

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for regularizing feature weights maintained by a machine learning model. The method includes actions of obtaining a set of training data that includes multiple training feature vectors, and training the machine learning model on each of the training feature vectors by, for each feature vector and for each of a plurality of the features of the feature vector: determining a first loss for the feature vector with the feature, determining a second loss for the feature vector without the feature, and updating a current benefit score for the feature using the first loss and the second loss, wherein the benefit score for the feature is indicative of the usefulness of the feature in generating accurate predicted outcomes for training feature vectors.
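
    A minimal sketch of the benefit-score bookkeeping is given below, under the assumption of a logistic model over sparse feature vectors; the loss, decay factor, and how benefits feed back into regularization are all illustrative choices, not the claimed method.

        # Hedged sketch: score each feature by how much the loss worsens when
        # the feature is dropped from the example.
        import math

        def log_loss(weights, x, y):                 # x: {feature: value}, y in {0, 1}
            margin = sum(weights.get(f, 0.0) * v for f, v in x.items())
            p = 1.0 / (1.0 + math.exp(-margin))
            eps = 1e-12
            return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

        def update_benefit_scores(weights, benefit, x, y, decay=0.99):
            first_loss = log_loss(weights, x, y)     # loss with the feature present
            for f in x:
                without_f = {g: v for g, v in x.items() if g != f}
                second_loss = log_loss(weights, without_f, y)  # loss without it
                # Positive when removing the feature hurts, i.e. it is useful.
                benefit[f] = decay * benefit.get(f, 0.0) + (second_loss - first_loss)
            return benefit

    Features whose accumulated benefit stays near or below zero are natural candidates for stronger regularization or pruning of their weights.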
