-
Publication No.: US20230153700A1
Publication date: 2023-05-18
Application No.: US17983130
Application date: 2022-11-08
Applicant: Google LLC
Inventor: Erik Michael Lindgren , Sashank Jakkam Reddi , Ruiqi Guo , Sanjiv Kumar
IPC: G06N20/20 , G06F12/0875 , G06F12/0891
CPC classification number: G06N20/20 , G06F12/0875 , G06F12/0891
Abstract: Provided are systems and methods which more efficiently train embedding models through the use of a cache of item embeddings for candidate items over a number of training iterations. The cached item embeddings can be “stale” embeddings that were generated by a previous version of the model at a previous training iteration. Specifically, at each iteration, the (potentially stale) item embeddings included in the cache can be used when generating similarity scores that are the basis for sampling a number of items to use as negatives in the current training iteration. For example, a Gumbel-Max sampling approach can be used to sample negative items that will enable an approximation of a true gradient. New embeddings can be generated for the sampled negative items and can be used to train the model at the current iteration.
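The sampling step described in this abstract can be illustrated with a minimal sketch. It assumes dot-product similarity and a plain dictionary as the cache; the names `gumbel_max`, `sample_negatives`, `refresh_cache`, and `embed_item` are hypothetical and do not come from the patent. Adding independent Gumbel(0, 1) noise to each score and taking the argmax draws one index from the softmax over the scores.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gumbel_max(scores, rng):
    """Return one index sampled from softmax(scores) via the Gumbel-Max trick."""
    best_idx, best_val = None, -math.inf
    for idx, score in enumerate(scores):
        # rng.random() is in [0, 1); clamp away from 0 to avoid log(0).
        u = max(rng.random(), 1e-12)
        perturbed = score - math.log(-math.log(u))
        if perturbed > best_val:
            best_idx, best_val = idx, perturbed
    return best_idx


def sample_negatives(query_emb, cache, positive_id, k, rng):
    """Sample k negatives (with replacement) using possibly stale cached embeddings."""
    ids = [item_id for item_id in cache if item_id != positive_id]
    scores = [dot(query_emb, cache[item_id]) for item_id in ids]
    return [ids[gumbel_max(scores, rng)] for _ in range(k)]


def refresh_cache(cache, item_ids, embed_item):
    """Overwrite the sampled entries with embeddings from the current model.

    embed_item is a hypothetical callable standing in for the item tower.
    """
    for item_id in item_ids:
        cache[item_id] = embed_item(item_id)
```

In a training loop, the fresh embeddings produced for the sampled negatives would feed the loss for the current iteration and could then be written back to the cache, so that only the sampled items are re-embedded at each step.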
-
Publication No.: US20230112862A1
Publication date: 2023-04-13
Application No.: US17960380
Application date: 2022-10-05
Applicant: Google LLC
Inventor: Venkata S. Bhojanapalli , Andreas Veit , Ayan Chakrabarti , Frederick Liu , Himanshu Jain , Michal Lukasik , Sanjiv Kumar , Yin-Wen Chang
IPC: G06N3/04
Abstract: Provided are systems and methods that improve the computational efficiency of Transformers or other attention-based neural networks or machine learning models by re-using a number of attention scores between layers and/or heads of the model. To reduce the computational cost of self-attention-based models while achieving comparable or even superior results, example aspects of the present disclosure propose a novel architecture that reuses attention scores computed in one layer in one or multiple subsequent layers.
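As a rough illustration of reusing attention scores across layers, the sketch below computes softmax attention scores once and applies the same score matrix in a second layer. It is a deliberately simplified single-head model with no learned projections, residual connections, or normalization; all function names are hypothetical, and the abstract does not specify which layers or heads share scores.

```python
import math


def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(queries, keys):
    """Scaled dot-product attention scores, one softmax row per query."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


def apply_scores(scores, values):
    """Weighted sum of value vectors using precomputed attention scores."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in scores
    ]


def two_layer_forward(x):
    scores = attention_scores(x, x)  # computed once, in layer 1
    h1 = apply_scores(scores, x)     # layer 1 output
    h2 = apply_scores(scores, h1)    # layer 2 reuses the layer 1 scores
    return scores, h1, h2
```

The saving comes from skipping the query-key products and softmax in the second layer, which are the quadratic-cost part of self-attention in the sequence length.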
-
Publication No.: US11354287B2
Publication date: 2022-06-07
Application No.: US16715620
Application date: 2019-12-16
Applicant: GOOGLE LLC
Inventor: Xiang Wu , David Morris Simcha , Sanjiv Kumar , Ruiqi Guo
IPC: G06F16/20 , G06F16/22 , G06F16/27 , G06F16/248 , G06F16/28 , G06F17/16 , G06F16/2458
Abstract: Techniques of indexing a database and processing a query involve decomposing the residual term according to a projection matrix that is based on a given direction v. For example, for each database element of a partition, the residual for that database element is split into a component parallel to a given direction and a component perpendicular to that direction. The parallel component lies in a one-dimensional subspace spanned by the direction and may be efficiently quantized with a scalar quantization. The perpendicular component is quantized using multiscale quantization techniques. The quantized residual components and the center elements of each partition define the indexed database. Upon receipt of a query from a user, the inner products of q with the residual may be computed efficiently using the quantized residual components. From these inner products, the database elements that are most similar to the query are selected and returned to the user.
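The parallel/perpendicular split of the residual can be sketched as follows. This is only an illustration under stated assumptions: `split_residual` and `scalar_quantize` are hypothetical names, the scalar quantizer is a plain uniform grid, and the multiscale quantization of the perpendicular component is not shown.

```python
import math


def split_residual(residual, direction):
    """Split residual into components parallel and perpendicular to direction.

    Returns (coeff, parallel, perpendicular), where coeff is the signed
    length of the parallel component along the unit direction.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    coeff = sum(r * u for r, u in zip(residual, unit))
    parallel = [coeff * u for u in unit]
    perpendicular = [r - p for r, p in zip(residual, parallel)]
    return coeff, parallel, perpendicular


def scalar_quantize(value, step):
    """Uniform scalar quantization of the one-dimensional parallel coefficient."""
    return round(value / step) * step
```

Because the two components sum to the residual, the inner product of a query q with the residual equals the sum of its inner products with each component, so the two quantized parts can be scored separately and added.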
-
Publication No.: US20210073639A1
Publication date: 2021-03-11
Application No.: US17100253
Application date: 2020-11-20
Applicant: Google LLC
Inventor: Sashank Jakkam Reddi , Sanjiv Kumar , Manzil Zaheer , Zachary Charles , Zach Garrett , Keith Rush , Jakub Konecny , Hugh Brendan McMahan
Abstract: A computing system and method can be used to implement a version of federated learning (FL) that incorporates adaptivity (e.g., leverages an adaptive learning rate). In particular, the present disclosure provides a general optimization framework in which (1) clients perform multiple epochs of training using a client optimizer to minimize loss on their local data and (2) a server system updates its global model by applying a gradient-based server optimizer to the average of the clients' model updates. This framework can seamlessly incorporate adaptivity by using adaptive optimizers as client and/or server optimizers. Building upon this general framework, the present disclosure also provides example specific adaptive optimization techniques for FL which use per-coordinate methods as server optimizers. By focusing on adaptive server optimization, the use of adaptive learning rates is enabled without increase in client storage or communication costs and compatibility with cross-device FL can be ensured.
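A minimal sketch of the server side of this framework: client model updates are averaged, and the average is treated as a pseudo-gradient by a per-coordinate adaptive optimizer. An Adam-style rule is used here purely as a familiar example of a per-coordinate method; the abstract does not name a specific one, and the function names and hyperparameter values are assumptions.

```python
import math


def average_updates(client_deltas):
    """Coordinate-wise mean of client updates (client weights minus global weights)."""
    n = len(client_deltas)
    return [sum(column) / n for column in zip(*client_deltas)]


def server_adaptive_step(weights, delta, m, v,
                         lr=0.1, beta1=0.9, beta2=0.99, eps=1e-3):
    """One Adam-style server update that moves the global model along delta.

    m and v are per-coordinate first- and second-moment estimates kept
    only on the server, so clients store and send nothing extra.
    """
    new_m = [beta1 * mi + (1 - beta1) * d for mi, d in zip(m, delta)]
    new_v = [beta2 * vi + (1 - beta2) * d * d for vi, d in zip(v, delta)]
    new_w = [
        w + lr * mi / (math.sqrt(vi) + eps)
        for w, mi, vi in zip(weights, new_m, new_v)
    ]
    return new_w, new_m, new_v
```

Keeping the optimizer state on the server is what allows adaptive learning rates without increasing client storage or communication, as the abstract notes.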
-
Publication No.: US20200175365A1
Publication date: 2020-06-04
Application No.: US16657356
Application date: 2019-10-18
Applicant: Google LLC
Inventor: Sashank Jakkam Reddi , Sanjiv Kumar , Manzil Zaheer , Satyen Chandrakant Kale
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as, for example, a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive effective learning rate while also ensuring that the effective learning rate is non-increasing.
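One well-known way to keep an adaptive effective learning rate from increasing is to divide by a running maximum of the second-moment estimate (an AMSGrad-style rule). The sketch below shows that idea for a single scalar parameter; it is an assumption chosen for illustration, not necessarily the mechanism claimed in the patent, and the function name and default hyperparameters are hypothetical.

```python
import math


def non_increasing_adaptive_step(w, grad, m, v, v_hat,
                                 lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update whose effective learning rate never grows between steps."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # The running maximum is non-decreasing, so lr / sqrt(v_hat) is non-increasing.
    v_hat = max(v_hat, v)
    effective_lr = lr / (math.sqrt(v_hat) + eps)
    w = w - effective_lr * m
    return w, m, v, v_hat, effective_lr
```

Without the running maximum, a stretch of small gradients would shrink `v` and let the effective learning rate grow again.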
-
Publication No.: US10586100B2
Publication date: 2020-03-10
Application No.: US16298160
Application date: 2019-03-11
Applicant: GOOGLE LLC
Inventor: Xiaohang Wang , Jeff Huber , Farhan Shamsi , Yakov Okshtein , Sanjiv Kumar , Henry Allan Rowley , Marcus Quintana Mitchell , Debra Lin Repenning
Abstract: Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.
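The aspect-ratio comparison can be sketched as below. This is an illustrative simplification: it averages opposite side lengths rather than correcting for perspective, it assumes the expected ratio is that of an ISO/IEC 7810 ID-1 card (85.60 mm by 53.98 mm), and the function names and the 10% tolerance are hypothetical.

```python
import math

# Width / height of an ISO/IEC 7810 ID-1 card (assumed expected ratio).
ID1_ASPECT = 85.60 / 53.98


def side(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])


def aspect_ratio(quad):
    """quad holds four (x, y) corners: top-left, top-right, bottom-right, bottom-left."""
    tl, tr, br, bl = quad
    width = (side(tl, tr) + side(bl, br)) / 2
    height = (side(tl, bl) + side(tr, br)) / 2
    return width / height


def best_match(quads, expected=ID1_ASPECT, tolerance=0.1):
    """Return the quadrilateral closest to the expected ratio, or None if none is within tolerance."""
    best, best_err = None, tolerance
    for quad in quads:
        err = abs(aspect_ratio(quad) - expected) / expected
        if err <= best_err:
            best, best_err = quad, err
    return best
```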
-
Publication No.: US10510021B1
Publication date: 2019-12-17
Application No.: US16434627
Application date: 2019-06-07
Applicant: Google LLC
Inventor: Satyen Chandrakant Kale , Daniel Holtmann-Rice , Sanjiv Kumar , Enxu Yan , Xinnan Yu
Abstract: Systems and methods for evaluating a loss function or a gradient of the loss function. In one example embodiment, a computer-implemented method includes partitioning a weight matrix into a plurality of blocks. The method includes identifying a first set of labels for each of the plurality of blocks with a score greater than a first threshold value. The method includes constructing a sparse approximation of a scoring vector for each of the plurality of blocks based on the first set of labels. The method includes determining a correction value for each sparse approximation of the scoring vector. The method includes determining an approximation of a loss or a gradient of a loss associated with the scoring function based on each sparse approximation of the scoring vector and the correction value associated with the sparse approximation of the scoring vector.
-
Publication No.: US20190378037A1
Publication date: 2019-12-12
Application No.: US16434627
Application date: 2019-06-07
Applicant: Google LLC
Inventor: Satyen Chandrakant Kale , Daniel Holtmann-Rice , Sanjiv Kumar , Enxu Yan , Xinnan Yu
Abstract: Systems and methods for evaluating a loss function or a gradient of the loss function. In one example embodiment, a computer-implemented method includes partitioning a weight matrix into a plurality of blocks. The method includes identifying a first set of labels for each of the plurality of blocks with a score greater than a first threshold value. The method includes constructing a sparse approximation of a scoring vector for each of the plurality of blocks based on the first set of labels. The method includes determining a correction value for each sparse approximation of the scoring vector. The method includes determining an approximation of a loss or a gradient of a loss associated with the scoring function based on each sparse approximation of the scoring vector and the correction value associated with the sparse approximation of the scoring vector.
-
Publication No.: US20190205639A1
Publication date: 2019-07-04
Application No.: US16298160
Application date: 2019-03-11
Applicant: GOOGLE LLC
Inventor: Xiaohang Wang , Jeff Huber , Farhan Shamsi , Yakov Okshtein , Sanjiv Kumar , Henry Allan Rowley , Marcus Quintana Mitchell , Debra Lin Repenning
CPC classification number: G06K9/00469 , G06K9/00463 , G06K9/2063 , G06K9/3283 , G06K9/6201 , G06K2209/01
Abstract: Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.
-
Publication No.: US10055663B2
Publication date: 2018-08-21
Application No.: US15229071
Application date: 2016-08-04
Applicant: GOOGLE LLC
Inventor: Xiaohang Wang , Farhan Shamsi , Sanjiv Kumar , Henry Allan Rowley , Marcus Quintana Mitchell
IPC: G06K9/00 , G06Q20/34 , G06K9/18 , G06K7/10 , G06Q20/32 , G06K9/22 , G06K9/62 , G06K9/20 , G06Q20/36 , H04N1/00
CPC classification number: G06K9/186 , G06K7/10 , G06K9/00469 , G06K9/03 , G06K9/18 , G06K9/2054 , G06K9/228 , G06K9/6202 , G06K2209/01 , G06Q20/32 , G06Q20/3223 , G06Q20/3276 , G06Q20/34 , G06Q20/36 , H04N1/00307
Abstract: Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.