Asymmetric Functionality Activation for Improved Stability in Neural Networks

    Publication Number: US20210319320A1

    Publication Date: 2021-10-14

    Application Number: US16847846

    Filing Date: 2020-04-14

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Aspects of the present disclosure address model “blow up” by changing the functionality of the activation, thereby giving “dead” or “dying” neurons the ability to recover. As one example, for activation functions that have an input region in which the neuron is turned off by a zero or near-zero gradient, a training computing system can keep the neuron turned off when the gradient pushes the unit farther into that region (e.g., by applying an update with zero or reduced magnitude). However, if the gradient for the current training example (or batch) attempts to push the unit toward a region in which the neuron is active again, the system can allow a non-zero gradient (e.g., by applying an update with standard or increased magnitude).
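The asymmetric update rule described in this abstract can be sketched as follows. The scalar formulation, the sign convention for the loss gradient, and the function name are illustrative assumptions, not taken from the patent text.

```python
def asymmetric_relu_grad(z, dL_dz):
    """Hedged sketch of an asymmetric gradient rule for a ReLU-like unit.

    z is the pre-activation; dL_dz is the loss gradient with respect to z.
    A gradient-descent step moves z in proportion to -dL_dz, so a negative
    dL_dz pushes z upward, back toward the active region.
    """
    if z > 0:
        return dL_dz   # active region: pass the gradient through as usual
    if dL_dz < 0:
        return dL_dz   # off region, but the step would reactivate the unit
    return 0.0         # off region, and the step would push it deeper in
```

With this rule a dead unit (z <= 0) still receives updates whenever they would move it back toward activity, which is the recovery mechanism the abstract describes.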

    Smooth Continuous Piecewise Constructed Activation Functions

    Publication Number: US20210133565A1

    Publication Date: 2021-05-06

    Application Number: US16902547

    Filing Date: 2020-06-16

    Applicant: Google LLC

    Abstract: Aspects of the present disclosure are directed to novel activation functions which enable improved reproducibility and accuracy tradeoffs in neural networks. In particular, the present disclosure provides a family of activation functions that, on one hand, are smooth with continuous gradient and optionally monotonic but, on the other hand, also mimic the mathematical behavior of a Rectified Linear Unit (ReLU). As examples, the activation functions described herein include a smooth rectified linear unit function and also a leaky version of such function. In various implementations, the proposed functions can provide both a complete stop region and a constant positive gradient (e.g., that can be 1) pass region like a ReLU, thereby matching accuracy performance of a ReLU. Additional implementations include a leaky version and/or functions that feature different constant gradients in the pass region.
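A function with the properties this abstract describes (a complete stop region, a smooth transition with continuous gradient, and a unit-gradient pass region) can be sketched as below. The name `smelu`, the half-width parameter `beta`, and the specific quadratic transition are assumptions for illustration, not the patent's claimed form.

```python
def smelu(x, beta=1.0):
    """Sketch of a smooth ReLU-like activation.

    Zero "stop" region for x <= -beta, identity "pass" region with
    gradient 1 for x >= beta, and a quadratic piece in between whose
    derivative (x + beta) / (2 * beta) is 0 at x = -beta and 1 at
    x = +beta, so the gradient is continuous at both joints.
    """
    if x <= -beta:
        return 0.0
    if x >= beta:
        return x
    return (x + beta) ** 2 / (4.0 * beta)
```

A leaky variant, as the abstract mentions, could replace the flat stop region with a small constant slope while keeping the same smooth joint construction.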

    Minimum Deep Learning with Gating Multiplier

    Publication Number: US20250028966A1

    Publication Date: 2025-01-23

    Application Number: US18905619

    Filing Date: 2024-10-03

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Systems and methods according to the present disclosure can employ a computer-implemented method for inference using a machine-learned model. The method can be implemented by a computing system having one or more computing devices. The method can include obtaining data descriptive of a neural network including one or more network units and one or more gating paths, wherein each of the gating path(s) includes one or more gating units. The method can include obtaining data descriptive of one or more input features. The method can include determining one or more network unit outputs from the network unit(s) based at least in part on the input feature(s). The method can include determining one or more gating values from the gating path(s). The method can include determining one or more gated network unit outputs based at least in part on a combination of the network unit output(s) and the gating value(s).
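As a minimal sketch of the combination step this abstract describes, a network unit's output can be multiplied by a gating value computed on a separate gating path. The linear unit, the choice of a sigmoid gating unit, and all names are illustrative assumptions.

```python
import math

def gated_unit_output(features, unit_weights, gate_weights):
    """Hedged sketch of a gated network unit output.

    The network unit here is a linear combination of the input features;
    the gating path is a separate linear combination squashed through a
    sigmoid to produce a gating value in (0, 1). The gated output is
    the product of the two.
    """
    unit_out = sum(w * f for w, f in zip(unit_weights, features))
    gate_in = sum(w * f for w, f in zip(gate_weights, features))
    gate = 1.0 / (1.0 + math.exp(-gate_in))  # gating value in (0, 1)
    return gate * unit_out
```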

    Minimum deep learning with gating multiplier

    Publication Number: US12141703B2

    Publication Date: 2024-11-12

    Application Number: US18467207

    Filing Date: 2023-09-14

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Systems and methods according to the present disclosure can employ a computer-implemented method for inference using a machine-learned model. The method can be implemented by a computing system having one or more computing devices. The method can include obtaining data descriptive of a neural network including one or more network units and one or more gating paths, wherein each of the gating path(s) includes one or more gating units. The method can include obtaining data descriptive of one or more input features. The method can include determining one or more network unit outputs from the network unit(s) based at least in part on the input feature(s). The method can include determining one or more gating values from the gating path(s). The method can include determining one or more gated network unit outputs based at least in part on a combination of the network unit output(s) and the gating value(s).

    Cross-List Learning to Rank

    Publication Number: US20250061117A1

    Publication Date: 2025-02-20

    Application Number: US18449236

    Filing Date: 2023-08-14

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Provided are systems and methods that perform learning to rank using training data for two or more different training lists. Specifically, a training dataset can include a number of training examples. Each training example can include a query and a plurality of items that are potentially responsive to the query. The ranking model can be trained using pairs of items taken from two different training examples.
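The cross-list pair construction this abstract describes can be sketched as follows, assuming each item carries a relevance label. The `(item_id, relevance)` tuple format and the rule of keeping only label-discordant pairs (ordered higher-first) are illustrative assumptions.

```python
import itertools

def cross_list_pairs(list_a, list_b):
    """Sketch of building training pairs across two different lists.

    Each pair takes one item from list_a and one from list_b; pairs with
    equal labels carry no ordering signal and are skipped. Each kept pair
    is ordered (higher-relevance id, lower-relevance id).
    """
    pairs = []
    for (id_a, rel_a), (id_b, rel_b) in itertools.product(list_a, list_b):
        if rel_a > rel_b:
            pairs.append((id_a, id_b))
        elif rel_b > rel_a:
            pairs.append((id_b, id_a))
    return pairs
```

A pairwise ranking loss (e.g., pairwise logistic) could then be applied to these cross-list pairs exactly as it would be to within-list pairs.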

    MACHINE LEARNING RANK AND PREDICTION CALIBRATION

    Publication Number: US20240242106A1

    Publication Date: 2024-07-18

    Application Number: US17927398

    Filing Date: 2022-09-23

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training and using machine learning (ML) models. In one aspect, a method includes receiving a digital component request. A first ML model can output scores indicating a likelihood of a positive outcome for digital components. Input data can be provided to a second ML model and can include feature values for a subset of digital components that were selected based on the output scores. The second ML model can be trained to output engagement predictions and/or rankings of digital components based at least in part on feature values of digital components that will be provided together as recommendations, and can produce a second output that includes rankings and engagement predictions of the digital components in the subset. At least one digital component can be provided based on the second output.
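The two-stage flow in this abstract can be roughly sketched as below. The function names, the top-k shortlist rule, and the signature of the second scorer (which sees the whole shortlist, reflecting the joint-recommendation aspect) are assumptions for illustration.

```python
def two_stage_select(candidates, score1, score2, k):
    """Sketch of a two-stage scoring pipeline.

    score1 stands in for the first ML model: it scores each candidate
    independently, and the top-k form a shortlist. score2 stands in for
    the second ML model: it re-scores each shortlisted candidate given
    the full shortlist, and the best candidate is returned.
    """
    shortlist = sorted(candidates, key=score1, reverse=True)[:k]
    reranked = sorted(shortlist, key=lambda c: score2(c, shortlist),
                      reverse=True)
    return reranked[0]
```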

    Minimum deep learning with gating multiplier

    Publication Number: US11790236B2

    Publication Date: 2023-10-17

    Application Number: US16809096

    Filing Date: 2020-03-04

    Applicant: Google LLC

    Inventor: Gil Shamir

    CPC classification number: G06N3/084 G06N3/04

    Abstract: Systems and methods according to the present disclosure can employ a computer-implemented method for inference using a machine-learned model. The method can be implemented by a computing system having one or more computing devices. The method can include obtaining data descriptive of a neural network including one or more network units and one or more gating paths, wherein each of the gating path(s) includes one or more gating units. The method can include obtaining data descriptive of one or more input features. The method can include determining one or more network unit outputs from the network unit(s) based at least in part on the input feature(s). The method can include determining one or more gating values from the gating path(s). The method can include determining one or more gated network unit outputs based at least in part on a combination of the network unit output(s) and the gating value(s).

    TRAINING MACHINE LEARNING MODELS USING QUANTILE AND MEDIAN RANKING DISTILLATION

    Publication Number: US20230252281A1

    Publication Date: 2023-08-10

    Application Number: US17830561

    Filing Date: 2022-06-02

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06N3/0454 G06K9/6265 G06K9/6298

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that obtain a first machine learning model that is configured to output a score for a training example. The training examples can each include feature values that represent features of an item, and an outcome label for the item. From the training examples, training pairs of training examples are determined. For each training pair: (i) a score is generated for each training example in the pair using the first machine learning model; and (ii) the difference of the scores generated for the two training examples in the pair is determined. Using the training pairs and the score differences, a second machine learning model is trained to produce score differences that, for the same training examples, are within a threshold value of the score differences produced by the first machine learning model.
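The threshold-based distillation objective this abstract describes can be sketched as a simple loss on score differences. The squared-hinge form beyond the threshold and the dict-based score format are illustrative assumptions.

```python
def pairwise_diff_loss(teacher_scores, student_scores, pairs, threshold=0.1):
    """Sketch of a ranking-distillation loss on score differences.

    For each training pair (i, j), the student (second model) is
    penalized only when its score difference strays more than
    `threshold` from the teacher's (first model's) difference.
    Scores are dicts from example id to score.
    """
    loss = 0.0
    for i, j in pairs:
        t_diff = teacher_scores[i] - teacher_scores[j]
        s_diff = student_scores[i] - student_scores[j]
        gap = abs(s_diff - t_diff)
        if gap > threshold:
            loss += (gap - threshold) ** 2  # penalize only the excess
    return loss
```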

    Minimum Deep Learning with Gating Multiplier

    Publication Number: US20210279591A1

    Publication Date: 2021-09-09

    Application Number: US16809096

    Filing Date: 2020-03-04

    Applicant: Google LLC

    Inventor: Gil Shamir

    Abstract: Systems and methods according to the present disclosure can employ a computer-implemented method for inference using a machine-learned model. The method can be implemented by a computing system having one or more computing devices. The method can include obtaining data descriptive of a neural network including one or more network units and one or more gating paths, wherein each of the gating path(s) includes one or more gating units. The method can include obtaining data descriptive of one or more input features. The method can include determining one or more network unit outputs from the network unit(s) based at least in part on the input feature(s). The method can include determining one or more gating values from the gating path(s). The method can include determining one or more gated network unit outputs based at least in part on a combination of the network unit output(s) and the gating value(s).

    Distilling from Ensembles to Improve Reproducibility of Neural Networks

    Publication Number: US20210158156A1

    Publication Date: 2021-05-27

    Application Number: US17025418

    Filing Date: 2020-09-18

    Applicant: Google LLC

    Abstract: Systems and methods can improve the reproducibility of neural networks by distilling from ensembles. In particular, aspects of the present disclosure are directed to a training scheme that utilizes a combination of an ensemble of neural networks and a single, “wide” neural network that is more powerful (e.g., exhibits a greater accuracy) than the ensemble. Specifically, the output of the ensemble can be distilled into the single neural network during training of the single neural network. After training, the single neural network can be deployed to generate inferences. In such fashion, the single neural network can provide superior prediction accuracy while, during training, the ensemble serves to influence the single neural network to be more reproducible. In addition, a single wide tower can be added to generate another output that can be distilled into the single neural network to further improve its accuracy.
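The distillation scheme above can be sketched as a mixed training objective: the single network fits the true label while also being pulled toward the ensemble's averaged output, which acts as a reproducibility anchor. The squared-error terms and the mixing weight `alpha` are illustrative assumptions.

```python
def ensemble_distill_loss(single_pred, ensemble_preds, label, alpha=0.5):
    """Sketch of a distillation loss for a single wide network.

    One term fits the single network's prediction to the true label;
    the other pulls it toward the ensemble's average prediction.
    `alpha` trades off accuracy (label term) against reproducibility
    (distillation term).
    """
    ensemble_avg = sum(ensemble_preds) / len(ensemble_preds)
    label_term = (single_pred - label) ** 2
    distill_term = (single_pred - ensemble_avg) ** 2
    return (1 - alpha) * label_term + alpha * distill_term
```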
