-
Publication No.: US20180089590A1
Publication date: 2018-03-29
Application No.: US15708793
Filing date: 2017-09-19
Applicant: Google Inc.
Inventor: Ananda Theertha Suresh , Sanjiv Kumar , Hugh Brendan McMahan , Xinnan Yu
CPC classification number: G06N20/00 , G06F17/12 , G06F17/16 , G06F17/18 , G06N7/005 , H03M7/3059 , H03M7/3082 , H03M7/40 , H04L67/42
Abstract: The present disclosure provides systems and methods for communication-efficient distributed mean estimation. In particular, aspects of the present disclosure can be implemented by a system in which a number of vectors reside on a number of different clients, and a centralized server device seeks to estimate the mean of such vectors. According to one aspect of the present disclosure, a client computing device can rotate a vector by a random rotation matrix and then perform probabilistic quantization on the rotated vector. According to another aspect of the present disclosure, subsequent to quantization but prior to transmission, the client computing device can encode the quantized vector according to a variable length coding scheme (e.g., by computing variable length codes).
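The rotate-then-quantize step described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: it assumes the random rotation is a normalized Walsh-Hadamard transform applied after a random ±1 diagonal (one common structured choice), that the quantizer has two levels (the minimum and maximum of the rotated vector), and that client and server share the sign vector. The names `fwht`, `rotate`, `unrotate`, `quantize`, and `dequantize` are hypothetical, and the variable length coding step is omitted.

```python
import math
import random


def fwht(vec):
    """Normalized fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def rotate(vec, signs):
    """Apply the assumed rotation R = H * D, where D = diag(signs)."""
    return fwht([s * v for s, v in zip(signs, vec)])


def unrotate(vec, signs):
    """Invert the rotation: the normalized H and the sign matrix D are self-inverse."""
    return [s * v for s, v in zip(signs, fwht(vec))]


def quantize(vec, rng):
    """Two-level probabilistic quantization: each coordinate becomes hi with
    probability (v - lo) / (hi - lo), otherwise lo, so its expectation is v."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return lo, hi, [0] * len(vec)
    bits = [1 if rng.random() < (v - lo) / (hi - lo) else 0 for v in vec]
    return lo, hi, bits


def dequantize(lo, hi, bits):
    """Server side: map each bit back to its quantization level."""
    return [hi if b else lo for b in bits]
```

Under these assumptions a client would send `lo`, `hi`, and one bit per coordinate; the server would dequantize each client's message, average the results, and apply `unrotate` once to the average.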
-
Publication No.: US09904873B2
Publication date: 2018-02-27
Application No.: US15364222
Filing date: 2016-11-29
Applicant: GOOGLE INC.
Inventor: Sanjiv Kumar , Henry Allan Rowley , Xiaohang Wang , Jose Jeronimo Moreira Rodrigues
IPC: G06K9/00 , G06K9/62 , G06K9/18 , G06K9/66 , G06T3/00 , G06Q20/22 , G06Q20/34 , G07F7/08 , G06K9/32 , G06K9/46 , G06T7/11
CPC classification number: G06K9/6269 , G06K9/00469 , G06K9/00536 , G06K9/18 , G06K9/186 , G06K9/3233 , G06K9/3258 , G06K9/46 , G06K9/6202 , G06K9/6267 , G06K9/66 , G06K2009/4666 , G06K2209/01 , G06Q20/227 , G06Q20/34 , G06T3/0012 , G06T7/11 , G06T2207/20132 , G07F7/0893
Abstract: Embodiments herein provide computer-implemented techniques for allowing a user computing device to extract financial card information using optical character recognition (“OCR”). Extracting financial card information may be improved by applying various classifiers and other transformations to the image data. For example, applying a linear classifier to the image to determine digit locations before applying the OCR algorithm allows the user computing device to use less processing capacity to extract accurate card data. The OCR application may train a classifier to use the wear patterns of a card to improve OCR algorithm performance. The OCR application may apply a linear classifier and then a nonlinear classifier to improve the performance and the accuracy of the OCR algorithm. The OCR application uses the known digit patterns used by typical credit and debit cards to improve the accuracy of the OCR algorithm.
-
Publication No.: US09870199B2
Publication date: 2018-01-16
Application No.: US14710467
Filing date: 2015-05-12
Applicant: Google Inc.
Inventor: Sanjiv Kumar , Xinnan Yu
CPC classification number: G06F5/017 , G06K9/6232 , G06N99/005 , H03M7/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes receiving a plurality of high-dimensional data items; generating a circulant embedding matrix for the high-dimensional data items, wherein the circulant embedding matrix is a matrix that is fully specified by a single vector; for each high-dimensional data item, generating a compact representation of the high-dimensional data item, comprising computing a product of the circulant embedding matrix and the high dimensional data item by performing a circular convolution of the single vector that fully specifies the circulant embedding matrix and the high dimensional data item using a Fast Fourier Transform (FFT); and generating a compact representation of the high dimensional data item by computing a binary map of the computed product.
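As a rough illustration of the circulant embedding step, the sketch below computes the product of a circulant matrix and a data item as a circular convolution via the FFT, then thresholds the result. It rests on assumptions not stated in the abstract: the "binary map" is taken to be a sign threshold at zero, the single vector `r` is taken to be the first column of the circulant matrix, and the input length is a power of two. The helper names are hypothetical.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_convolve(r, x):
    """Circular convolution of r and x, i.e. circ(r) @ x, in O(n log n)."""
    n = len(r)
    product = [a * b for a, b in zip(fft(r), fft(x))]
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in fft(product, inverse=True)]


def circulant_binary_code(r, x):
    """Assumed binary map: 1 where the projection is non-negative, else 0."""
    return [1 if v >= 0 else 0 for v in circular_convolve(r, x)]
```

Because the matrix is fully specified by `r`, only that one vector needs to be stored, rather than a full n-by-n projection matrix.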
-
Publication No.: US09858922B2
Publication date: 2018-01-02
Application No.: US14311557
Filing date: 2014-06-23
Applicant: Google Inc.
Inventor: Eugene Weinstein , Sanjiv Kumar , Ignacio L. Moreno , Andrew W. Senior , Nikhil Prasad Bhat
CPC classification number: G10L15/08 , G10L15/285
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for caching speech recognition scores. In some implementations, one or more values comprising data about an utterance are received. An index value is determined for the one or more values. An acoustic model score for the one or more received values is selected, from a cache of acoustic model scores that were computed before receiving the one or more values, based on the index value. A transcription for the utterance is determined using the selected acoustic model score.
-
Publication No.: US20170091240A1
Publication date: 2017-03-30
Application No.: US14951909
Filing date: 2015-11-25
Applicant: Google Inc.
Inventor: Xinnan Yu , Sanjiv Kumar , Ruiqi Guo
IPC: G06F17/30
CPC classification number: G06F16/2237 , G06F16/3331 , G06F16/951
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently performing linear projections. In one aspect, a method includes actions for obtaining a plurality of content items from one or more content sources. Additional actions include extracting a plurality of features from each of the plurality of content items, generating a feature vector for each of the extracted features in order to create a search space, generating a series of element matrices based upon the generated feature vectors, transforming the series of element matrices into a structured matrix such that the transformation preserves one or more relationships associated with each element matrix of the series of element matrices, receiving a search object, searching the enhanced search space based on the received search object, and providing one or more links to a content item that are responsive to the search object.
-
Publication No.: US20170046668A1
Publication date: 2017-02-16
Application No.: US14827330
Filing date: 2015-08-16
Applicant: GOOGLE INC.
Inventor: Henry Allan Rowley , Sanjiv Kumar , Xiaofeng Liu , Brian Lin Chang , Daniel Niels Holtmann-Rice
CPC classification number: G06Q20/10 , G06K9/00442 , G06K9/03 , G06K9/2063 , G06K9/723 , G06K2209/01 , G06Q20/36
Abstract: An application extracts a user name from a financial card image using optical character recognition (“OCR”) and compares segments of the user name to names stored in user data to refine the extracted name. The application performs an OCR algorithm on a card image and compares an extracted name with user data. The application identifies likely matching names to the extracted name. The OCR application breaks the extracted name into one or more series of segments and compares the segments from the extracted name to segments from the stored names. The OCR application determines an edit distance between the extracted name and each potentially matching stored name. If the edit distance is below a configured threshold then the OCR application revises the extracted name to match the identified stored name. The refined name is presented to the user for verification.
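The threshold test on edit distance described in this abstract can be sketched as follows. This is an illustration only: it assumes Levenshtein distance as the edit distance, a case-insensitive comparison, and a hypothetical default threshold of 2. The segment-by-segment comparison mentioned in the abstract is not modeled, and the function names are not from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def refine_name(extracted, stored_names, max_distance=2):
    """Return the closest stored name if its distance is below the threshold;
    otherwise keep the OCR-extracted name unchanged."""
    best, best_dist = None, None
    for name in stored_names:
        dist = edit_distance(extracted.upper(), name.upper())
        if best_dist is None or dist < best_dist:
            best, best_dist = name, dist
    if best is not None and best_dist < max_distance:
        return best
    return extracted
```

For example, an OCR result such as `"J0HN SMITH"` is one substitution away from a stored `"JOHN SMITH"`, so under the assumed threshold it would be revised to the stored name before being shown to the user for verification.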
-
Publication No.: US20160342853A1
Publication date: 2016-11-24
Application No.: US15229071
Filing date: 2016-08-04
Applicant: GOOGLE INC.
Inventor: Xiaohang Wang , Farhan Shamsi , Sanjiv Kumar , Henry Allan Rowley , Marcus Quintana Mitchell
CPC classification number: G06K9/186 , G06K7/10 , G06K9/00469 , G06K9/18 , G06K9/2054 , G06K9/228 , G06K9/6202 , G06K2209/01 , G06Q20/32 , G06Q20/3223 , G06Q20/3276 , G06Q20/34 , G06Q20/36 , H04N1/00307
Abstract: Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.
-
Publication No.: US20160103842A1
Publication date: 2016-04-14
Application No.: US14512893
Filing date: 2014-10-13
Applicant: GOOGLE INC.
Inventor: Krzysztof Marcin Choromanski , Sanjiv Kumar
IPC: G06F17/30
CPC classification number: G06K9/6255 , G06K9/622
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for clustering data points. One of the methods includes maintaining data representing a respective ordered tuple of skeleton data points for each of a plurality of clusters. One or more intersecting clusters are determined for a new data point. An updated tuple of skeleton data points is generated for an updated cluster by selecting updated skeleton data points, including selecting the new data point or an existing jth skeleton data point of one of the one or more intersecting clusters according to which random value, of the jth random value for the new data point or the random value for the jth existing skeleton data point, is closest to a limiting value. The new data point is then assigned to the updated cluster.
-
Publication No.: US20150186787A1
Publication date: 2015-07-02
Application No.: US14143710
Filing date: 2013-12-30
Applicant: Google Inc.
Inventor: Sanjiv Kumar , Brian Kernighan
CPC classification number: G06N99/005 , G06F17/30011 , G06Q10/107
Abstract: Plagiarism may be detected, as disclosed herein, utilizing a database that stores documents for one or more courses. The database may restrict sharing of content between documents. A feature extraction module may receive edits and timestamp the edits to the document. A writing pattern for a particular user or group of users may be discerned from the temporal data and the documents for the particular user or group of users. A feature vector may be generated that represents the writing pattern. A machine learning technique may be applied to the feature vector to determine whether or not a document is plagiarized.
-
Publication No.: US20150161174A1
Publication date: 2015-06-11
Application No.: US14330195
Filing date: 2014-07-14
Applicant: Google Inc.
Inventor: Sanjiv Kumar , Henry Allan Rowley , Ameesh Makadia
IPC: G06F17/30
CPC classification number: G06F17/30274 , G06F17/30247 , G06K9/6215 , G06K9/6224 , G06K2209/27
Abstract: Methods, systems, and apparatus, including computer program products, for ranking search results for queries. The method includes calculating a visual similarity score for one or more pairs of images in a plurality of images based on visual features of images in each of the one or more pairs; building a graph of images by linking each of one or more images in the plurality of images to one or more nearest neighbor images based on the visual similarity scores; associating a respective score with each of one or more images in the graph based on data indicative of user behavior relative to the image as a search result for a query; and determining a new score for each of one or more images in the graph based on the respective score of the image, and the respective scores of one or more nearest neighbors to the image.
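The nearest-neighbor graph and score update in this abstract might look roughly like the sketch below. The abstract does not specify the similarity measure or the update rule, so the sketch assumes cosine similarity over feature vectors and a new score that blends an image's own score with the mean score of its neighbors using a hypothetical weight `alpha`. All function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_graph(features, k):
    """Link each image to its k most visually similar images."""
    graph = {}
    for name, vec in features.items():
        others = sorted(
            (other for other in features if other != name),
            key=lambda other: cosine(vec, features[other]),
            reverse=True,
        )
        graph[name] = others[:k]
    return graph


def smooth_scores(graph, scores, alpha=0.5):
    """Assumed update: alpha * own score + (1 - alpha) * mean neighbor score."""
    new_scores = {}
    for name, neighbors in graph.items():
        if neighbors:
            neighbor_mean = sum(scores[n] for n in neighbors) / len(neighbors)
        else:
            neighbor_mean = scores[name]
        new_scores[name] = alpha * scores[name] + (1 - alpha) * neighbor_mean
    return new_scores
```

Under these assumptions, an image with little user-behavior data of its own picks up part of the score of visually similar images that users did interact with for the same query.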