Efficient Training of Embedding Models Using Negative Cache

    Publication Number: US20230153700A1

    Publication Date: 2023-05-18

    Application Number: US17983130

    Application Date: 2022-11-08

    Applicant: Google LLC

    CPC classification number: G06N20/20 G06F12/0875 G06F12/0891

    Abstract: Provided are systems and methods which more efficiently train embedding models through the use of a cache of item embeddings for candidate items over a number of training iterations. The cached item embeddings can be “stale” embeddings that were generated by a previous version of the model at a previous training iteration. Specifically, at each iteration, the (potentially stale) item embeddings included in the cache can be used when generating similarity scores that are the basis for sampling a number of items to use as negatives in the current training iteration. For example, a Gumbel-Max sampling approach can be used to sample negative items that will enable an approximation of a true gradient. New embeddings can be generated for the sampled negative items and can be used to train the model at the current iteration.
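The Gumbel-Max sampling step described in the abstract can be sketched as follows. This is an illustrative NumPy sketch, not the patented implementation: the cached similarity scores, cache size, and function name `gumbel_max_sample` are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_sample(cached_scores, num_negatives):
    """Sample negative item indices via the Gumbel-Max trick.

    Adding independent Gumbel(0, 1) noise to the (log-)scores and
    taking the top-k perturbed indices samples k distinct items with
    probability proportional to exp(score), without materializing
    the full softmax distribution.
    """
    gumbel_noise = rng.gumbel(size=cached_scores.shape)
    perturbed = cached_scores + gumbel_noise
    # Indices of the k largest perturbed scores (descending order).
    return np.argsort(perturbed)[::-1][:num_negatives]

# Hypothetical cached (possibly stale) similarity scores for 10 items.
cached_scores = np.array([0.1, 2.0, -0.5, 1.2, 0.0, 3.1, -1.0, 0.7, 1.8, 0.3])
negatives = gumbel_max_sample(cached_scores, num_negatives=3)
# The sampled indices identify items to re-embed with the current
# model version and use as negatives at this training iteration.
```

In the scheme the abstract describes, only the sampled negatives get fresh embeddings from the current model; the rest of the cache remains stale until refreshed.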

    Federated Learning with Adaptive Optimization

    Publication Number: US20210073639A1

    Publication Date: 2021-03-11

    Application Number: US17100253

    Application Date: 2020-11-20

    Applicant: Google LLC

    Abstract: A computing system and method can be used to implement a version of federated learning (FL) that incorporates adaptivity (e.g., leverages an adaptive learning rate). In particular, the present disclosure provides a general optimization framework in which (1) clients perform multiple epochs of training using a client optimizer to minimize loss on their local data and (2) a server system updates its global model by applying a gradient-based server optimizer to the average of the clients' model updates. This framework can seamlessly incorporate adaptivity by using adaptive optimizers as client and/or server optimizers. Building upon this general framework, the present disclosure also provides example specific adaptive optimization techniques for FL which use per-coordinate methods as server optimizers. By focusing on adaptive server optimization, the use of adaptive learning rates is enabled without increase in client storage or communication costs and compatibility with cross-device FL can be ensured.
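The two-level structure in the abstract (local client training plus a gradient-based, per-coordinate server optimizer) can be illustrated with a minimal sketch. This is an assumption-laden toy, not the patented method: the quadratic client losses, the Adam-style server step, and all hyperparameters are hypothetical.

```python
import numpy as np

def client_update(global_w, local_data, lr=0.1, epochs=5):
    """Hypothetical local training: each client runs several epochs of
    SGD on a toy quadratic loss 0.5 * ||w - local_data||^2."""
    w = global_w.copy()
    for _ in range(epochs):
        w -= lr * (w - local_data)  # gradient step on the local loss
    return w - global_w  # model delta communicated to the server

def server_adam_step(w, avg_delta, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """Adaptive server optimizer: treat the averaged client delta as a
    pseudo-gradient and apply a per-coordinate (Adam-style) update."""
    m = b1 * m + (1 - b1) * avg_delta
    v = b2 * v + (1 - b2) * avg_delta ** 2
    w = w + lr * m / (np.sqrt(v) + eps)
    return w, m, v

# Simulated rounds with three clients holding different local data.
clients = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([2.0, 4.0])]
w = np.zeros(2)
m, v = np.zeros(2), np.zeros(2)
for _ in range(50):
    avg_delta = np.mean([client_update(w, d) for d in clients], axis=0)
    w, m, v = server_adam_step(w, avg_delta, m, v)
# w moves toward the mean of the client optima, roughly [2.0, 2.0].
```

Note that all adaptivity lives on the server side: clients run plain SGD and send only model deltas, which matches the abstract's point that adaptive learning rates come without extra client storage or communication cost.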

    Controlled Adaptive Optimization
    Invention Application

    Publication Number: US20200175365A1

    Publication Date: 2020-06-04

    Application Number: US16657356

    Application Date: 2019-10-18

    Applicant: Google LLC

    Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as, for example, a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive effective learning rate while also ensuring that the effective learning rate is non-increasing.
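One well-known way to guarantee a non-increasing effective learning rate, in the spirit of the abstract, is to track the running maximum of the second-moment estimate (as in AMSGrad-style updates). The sketch below is an illustration of that general idea under assumed hyperparameters, not a reproduction of the claimed technique.

```python
import numpy as np

def controlled_adaptive_step(w, grad, m, v, v_hat,
                             lr=0.1, b1=0.9, b2=0.99, eps=1e-8):
    """One adaptive update with a controlled effective learning rate.

    Because v_hat is the running elementwise maximum of the
    second-moment estimate v, the per-coordinate effective rate
    lr / (sqrt(v_hat) + eps) can only stay flat or shrink over time.
    """
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    v_hat = np.maximum(v_hat, v)  # monotone non-decreasing second moment
    w = w - lr * m / (np.sqrt(v_hat) + eps)
    return w, m, v, v_hat

# Minimize the toy objective f(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([5.0, -3.0])
m = v = v_hat = np.zeros(2)
for _ in range(200):
    w, m, v, v_hat = controlled_adaptive_step(w, w.copy(), m, v, v_hat)
# w shrinks toward the minimum at the origin.
```

The `np.maximum` line is the key control: a plain Adam-style update lets the effective rate grow again whenever `v` decays, which is exactly the behavior this construction rules out.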
