Efficient Training of Embedding Models Using Negative Cache

    Publication number: US20230153700A1

    Publication date: 2023-05-18

    Application number: US17983130

    Filing date: 2022-11-08

    Applicant: Google LLC

    CPC classification number: G06N20/20 G06F12/0875 G06F12/0891

    Abstract: Provided are systems and methods that more efficiently train embedding models through the use of a cache of item embeddings for candidate items over a number of training iterations. The cached item embeddings can be “stale” embeddings that were generated by a previous version of the model at a previous training iteration. Specifically, at each iteration, the (potentially stale) item embeddings included in the cache can be used when generating similarity scores that are the basis for sampling a number of items to use as negatives in the current training iteration. For example, a Gumbel-Max sampling approach can be used to sample negative items that will enable an approximation of a true gradient. New embeddings can be generated for the sampled negative items and can be used to train the model at the current iteration.
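
    A minimal sketch of the cached negative-sampling step described in the abstract, assuming a NumPy setting; the cache layout, refresh policy, and the helper name `embed_items` are illustrative assumptions rather than the patented method's API.

```python
# Sketch: Gumbel-Max sampling of negatives from a stale embedding cache.
# `embed_items` (current-model encoder) and the cache refresh policy are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_negatives(query_emb, cached_item_embs, num_negatives):
    """Sample `num_negatives` item indices in proportion to exp(similarity),
    using cached (possibly stale) item embeddings."""
    scores = cached_item_embs @ query_emb                     # similarity logits
    gumbel = -np.log(-np.log(rng.uniform(size=scores.shape))) # Gumbel(0, 1) noise
    perturbed = scores + gumbel                               # Gumbel-Max trick
    return np.argpartition(-perturbed, num_negatives)[:num_negatives]

def training_step(query_emb, cache, embed_items, item_ids, num_negatives=8):
    # (1) sample negatives with the stale cache,
    # (2) re-embed only the sampled items with the current model,
    # (3) write the fresh embeddings back into the cache.
    neg_idx = gumbel_max_negatives(query_emb, cache, num_negatives)
    fresh_negs = embed_items(item_ids[neg_idx])   # current-model embeddings
    cache[neg_idx] = fresh_negs                   # refresh the stale entries
    return neg_idx, fresh_negs
```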

    Local orthogonal decomposition for maximum inner product search

    Publication number: US11354287B2

    Publication date: 2022-06-07

    Application number: US16715620

    Filing date: 2019-12-16

    Applicant: GOOGLE LLC

    Abstract: Techniques of indexing a database and processing a query involve decomposing the residual term according to a projection matrix that is based on a given direction v. For example, for each database element of a partition, the residual for that database element is split into a component parallel to a given direction and a component perpendicular to that direction. The parallel component lies in a one-dimensional subspace spanned by the direction and may be efficiently quantized with a scalar quantization. The perpendicular component is quantized using multiscale quantization techniques. The quantized residual components and the center elements of each partition define the indexed database. Upon receipt of a query q from a user, the inner products of q with the residuals may be computed efficiently using the quantized residual components. From these inner products, the database elements that are most similar to the query are selected and returned to the user.
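
    A sketch of the parallel/perpendicular residual split and the resulting inner-product approximation; the uniform scalar quantizer and the handling of the perpendicular part are simplified placeholders, not the multiscale quantizer of the patent.

```python
# Sketch: local orthogonal decomposition of a residual for MIPS.
import numpy as np

def split_residual(x, center, v):
    """Decompose the residual (x - center) into a component parallel to v
    and a component perpendicular to v."""
    v_hat = v / np.linalg.norm(v)
    r = x - center
    coeff = float(r @ v_hat)          # 1-D coordinate along v
    r_par = coeff * v_hat
    r_perp = r - r_par
    return coeff, r_perp

def scalar_quantize(coeff, step=0.05):
    # Uniform scalar quantization of the parallel coordinate (assumed scheme).
    return step * round(coeff / step)

def approx_inner_product(q, center, v, coeff_q, r_perp_q):
    """<q, x> ~= <q, center> + coeff * <q, v_hat> + <q, quantized r_perp>."""
    v_hat = v / np.linalg.norm(v)
    return q @ center + coeff_q * (q @ v_hat) + q @ r_perp_q
```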

    Federated Learning with Adaptive Optimization

    Publication number: US20210073639A1

    Publication date: 2021-03-11

    Application number: US17100253

    Filing date: 2020-11-20

    Applicant: Google LLC

    Abstract: A computing system and method can be used to implement a version of federated learning (FL) that incorporates adaptivity (e.g., leverages an adaptive learning rate). In particular, the present disclosure provides a general optimization framework in which (1) clients perform multiple epochs of training using a client optimizer to minimize loss on their local data and (2) a server system updates its global model by applying a gradient-based server optimizer to the average of the clients' model updates. This framework can seamlessly incorporate adaptivity by using adaptive optimizers as client and/or server optimizers. Building upon this general framework, the present disclosure also provides example specific adaptive optimization techniques for FL which use per-coordinate methods as server optimizers. By focusing on adaptive server optimization, the use of adaptive learning rates is enabled without an increase in client storage or communication costs, and compatibility with cross-device FL can be ensured.
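
    A schematic of the two-level framework described in the abstract: clients run a local optimizer, and the server applies an Adam-style per-coordinate update to the averaged client delta treated as a pseudo-gradient. The hyperparameters and the `client_update` callback are illustrative assumptions.

```python
# Sketch: one round of federated learning with an adaptive server optimizer.
import numpy as np

def server_round(global_w, client_datasets, client_update,
                 state, lr=0.01, b1=0.9, b2=0.99, eps=1e-3):
    # (1) each client minimizes its local loss starting from the global model
    deltas = [client_update(global_w, d) - global_w for d in client_datasets]
    pseudo_grad = -np.mean(deltas, axis=0)     # negative average model delta

    # (2) server-side adaptive (Adam-like) per-coordinate update
    state["m"] = b1 * state["m"] + (1 - b1) * pseudo_grad
    state["v"] = b2 * state["v"] + (1 - b2) * pseudo_grad ** 2
    new_w = global_w - lr * state["m"] / (np.sqrt(state["v"]) + eps)
    return new_w, state

# Before the first round:
#   state = {"m": np.zeros_like(global_w), "v": np.zeros_like(global_w)}
```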

    Controlled Adaptive Optimization

    Publication number: US20200175365A1

    Publication date: 2020-06-04

    Application number: US16657356

    Filing date: 2019-10-18

    Applicant: Google LLC

    Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as, for example, a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive effective learning rate while also ensuring that the effective learning rate is non-increasing.
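
    A sketch of one way to keep the effective per-coordinate learning rate non-increasing, by tracking the running maximum of the second-moment estimate (an AMSGrad-style control); the details of the patented technique may differ.

```python
# Sketch: adaptive step with a guaranteed non-increasing effective learning rate.
import numpy as np

def controlled_adaptive_step(w, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    state["v_max"] = np.maximum(state["v_max"], state["v"])  # never decreases
    # Effective learning rate lr / (sqrt(v_max) + eps) is therefore non-increasing.
    return w - lr * state["m"] / (np.sqrt(state["v_max"]) + eps), state

# Before the first step:
#   state = {"m": np.zeros_like(w), "v": np.zeros_like(w), "v_max": np.zeros_like(w)}
```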

    Extracting card data from multiple cards

    Publication number: US10586100B2

    Publication date: 2020-03-10

    Application number: US16298160

    Filing date: 2019-03-11

    Applicant: GOOGLE LLC

    Abstract: Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of each of the one or more quadrilaterals, and compares the determined aspect ratios to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.
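
    A sketch of the aspect-ratio filter applied to candidate quadrilaterals. The ID-1 card ratio (85.60 mm x 53.98 mm) and the tolerance are assumptions used for illustration; edge-zone selection, line detection, and rectification are left to a separate image-processing step.

```python
# Sketch: filtering candidate quadrilaterals by expected card aspect ratio.
import numpy as np

EXPECTED_RATIO = 85.60 / 53.98   # ~1.586 for a standard ID-1 payment card (assumed)

def quad_aspect_ratio(corners):
    """Approximate aspect ratio of a quadrilateral given 4 ordered corners
    (top-left, top-right, bottom-right, bottom-left)."""
    tl, tr, br, bl = [np.asarray(c, dtype=float) for c in corners]
    width = (np.linalg.norm(tr - tl) + np.linalg.norm(br - bl)) / 2
    height = (np.linalg.norm(bl - tl) + np.linalg.norm(br - tr)) / 2
    return width / height

def matches_card(corners, tolerance=0.1):
    ratio = quad_aspect_ratio(corners)
    return abs(ratio - EXPECTED_RATIO) / EXPECTED_RATIO < tolerance
```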

    Systems and methods for evaluating a loss function or a gradient of a loss function via dual decomposition

    Publication number: US10510021B1

    Publication date: 2019-12-17

    Application number: US16434627

    Filing date: 2019-06-07

    Applicant: Google LLC

    Abstract: Systems and methods for evaluating a loss function or a gradient of the loss function. In one example embodiment, a computer-implemented method includes partitioning a weight matrix into a plurality of blocks. The method includes identifying a first set of labels for each of the plurality of blocks with a score greater than a first threshold value. The method includes constructing a sparse approximation of a scoring vector for each of the plurality of blocks based on the first set of labels. The method includes determining a correction value for each sparse approximation of the scoring vector. The method includes determining an approximation of a loss or a gradient of a loss associated with the scoring function based on each sparse approximation of the scoring vector and the correction value associated with the sparse approximation of the scoring vector.
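
    A sketch of the blockwise sparse approximation described in the abstract, applied to a softmax-style log-sum-exp loss. The particular correction term used here (bounding every dropped score by the threshold) is an illustrative assumption, not necessarily the dual-decomposition correction of the patent.

```python
# Sketch: block-partitioned sparse scoring with a per-block correction value.
import numpy as np

def block_sparse_scores(W_block, x, threshold):
    """Return (kept_label_indices, kept_scores, correction) for one block."""
    scores = W_block @ x
    keep = scores > threshold
    # Assumed correction: dropped labels contribute at most exp(threshold) each
    # to the partition function, so bound their total contribution by that.
    correction = np.count_nonzero(~keep) * np.exp(threshold)
    return np.nonzero(keep)[0], scores[keep], correction

def approx_log_sum_exp(W, x, threshold, num_blocks=4):
    """Approximate the log-partition term of a softmax loss from sparse blocks."""
    total = 0.0
    for W_block in np.array_split(W, num_blocks, axis=0):
        _, kept_scores, corr = block_sparse_scores(W_block, x, threshold)
        total += np.exp(kept_scores).sum() + corr
    return np.log(total)
```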

    Systems and Methods for Evaluating a Loss Function or a Gradient of a Loss Function via Dual Decomposition

    Publication number: US20190378037A1

    Publication date: 2019-12-12

    Application number: US16434627

    Filing date: 2019-06-07

    Applicant: Google LLC

    Abstract: Systems and methods for evaluating a loss function or a gradient of the loss function. In one example embodiment, a computer-implemented method includes partitioning a weight matrix into a plurality of blocks. The method includes identifying a first set of labels for each of the plurality of blocks with a score greater than a first threshold value. The method includes constructing a sparse approximation of a scoring vector for each of the plurality of blocks based on the first set of labels. The method includes determining a correction value for each sparse approximation of the scoring vector. The method includes determining an approximation of a loss or a gradient of a loss associated with the scoring function based on each sparse approximation of the scoring vector and the correction value associated with the sparse approximation of the scoring vector.
