Quantization-based fast inner product search

    Publication No.: US10255323B1

    Publication date: 2019-04-09

    Application No.: US14878357

    Filing date: 2015-10-08

    Applicant: GOOGLE INC.

    Abstract: Implementations provide an improved system for efficiently calculating inner products between a query item and a database of items. An example method includes generating a plurality of subspaces from search items in a database, the search items being represented as vectors of elements, a subspace being a block of elements from each search item that occur at the same vector position, generating a codebook for each subspace within soft constraints that are based on example queries, assigning each subspace of each search item an entry in the codebook for the subspace, the assignments for all subspaces of a search item representing a quantized search item, and storing the codebooks and the quantized search items. Generating a codebook for a particular subspace can include clustering the search item subspaces that correspond to the particular subspace, finding a cluster center for each cluster, and storing the cluster center as the codebook entry.
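    The quantization scheme in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: it uses plain k-means and omits the query-based soft constraints entirely. The names `kmeans`, `build_index`, and `approx_inner_products` are hypothetical, and the vector length is assumed to divide evenly into subspaces.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumption; no query-based constraints)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)

def build_index(items, num_subspaces, codebook_size):
    """Cluster each subspace (block of vector positions) independently."""
    codebooks, codes = [], []
    for block in np.split(items, num_subspaces, axis=1):
        centers, assign = kmeans(block, codebook_size)
        codebooks.append(centers)   # cluster centers are the codebook entries
        codes.append(assign)        # one codebook index per item
    return codebooks, np.stack(codes, axis=1)  # codes: (n_items, num_subspaces)

def approx_inner_products(query, codebooks, codes):
    """Approximate <query, item> by summing per-subspace table lookups."""
    scores = np.zeros(codes.shape[0])
    q_blocks = np.split(query, len(codebooks))
    for s, (q, centers) in enumerate(zip(q_blocks, codebooks)):
        table = centers @ q            # inner product with each codebook entry
        scores += table[codes[:, s]]   # look up each item's assigned entry
    return scores

rng = np.random.default_rng(1)
items = rng.normal(size=(200, 8))
query = rng.normal(size=8)
codebooks, codes = build_index(items, num_subspaces=4, codebook_size=16)
scores = approx_inner_products(query, codebooks, codes)
```

    In this sketch each query costs one small table per subspace plus one lookup per item and subspace, rather than a full inner product per item.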

    33. EXTRACTING CARD DATA FROM MULTIPLE CARDS
    Type: patent application
    Status: active

    Publication No.: US20170039420A1

    Publication date: 2017-02-09

    Application No.: US15297127

    Filing date: 2016-10-18

    Applicant: GOOGLE INC.

    Abstract: Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of each of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.

    34. GENERATING COMPACT REPRESENTATIONS OF HIGH-DIMENSIONAL DATA
    Type: patent application
    Status: active

    Publication No.: US20160335053A1

    Publication date: 2016-11-17

    Application No.: US14710467

    Filing date: 2015-05-12

    Applicant: Google Inc.

    CPC classification number: G06F5/017 G06K9/6232 G06N99/005 H03M7/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes receiving a plurality of high-dimensional data items; generating a circulant embedding matrix for the high-dimensional data items, wherein the circulant embedding matrix is a matrix that is fully specified by a single vector; for each high-dimensional data item, generating a compact representation of the high-dimensional data item, comprising computing a product of the circulant embedding matrix and the high dimensional data item by performing a circular convolution of the single vector that fully specifies the circulant embedding matrix and the high dimensional data item using a Fast Fourier Transform (FFT); and generating a compact representation of the high dimensional data item by computing a binary map of the computed product.
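    The FFT step in this abstract rests on a standard identity: multiplying by a circulant matrix equals circular convolution with the vector that specifies it. The sketch below illustrates that identity under stated assumptions. The function names are hypothetical, the specifying vector `r` is taken to be the first column of the matrix, and the binary map is taken to be a sign threshold at zero; the abstract does not fix either choice.

```python
import numpy as np

def circulant_product(r, x):
    """Compute circ(r) @ x as a circular convolution via FFT, O(d log d)."""
    return np.fft.ifft(np.fft.fft(r) * np.fft.fft(x)).real

def circulant_binary_embedding(r, x):
    """Binary code: 1 where the projected coordinate is non-negative."""
    return (circulant_product(r, x) >= 0).astype(np.uint8)

def circulant_matrix(r):
    """Dense circulant matrix with first column r (for checking only)."""
    return np.stack([np.roll(r, j) for j in range(len(r))], axis=1)

rng = np.random.default_rng(0)
d = 16
r = rng.normal(size=d)   # the single vector that specifies the matrix
x = rng.normal(size=d)   # one high-dimensional data item
code = circulant_binary_embedding(r, x)
```

    The dense matrix is built here only to check the identity; the point of the FFT route is that it never has to be materialized.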

    35. Data reduction in nearest neighbor classification
    Type: granted patent
    Status: active

    Publication No.: US09378466B2

    Publication date: 2016-06-28

    Application No.: US14145519

    Filing date: 2013-12-31

    Applicant: Google Inc.

    CPC classification number: G06N99/005

    Abstract: A set S is initialized. Initially, S is empty, but as the disclosed process is performed, items are added to it. It may contain one or more samples (e.g., items) from each class. One or more labeled samples for one or more classes may be obtained. A series of operations may be performed, iteratively, until a stopping criterion is reached to obtain the reduced set. For each class of the one or more classes, a point may be generated based on at least one sample in the class having a nearest neighbor in a set S with a different class label than the sample. The point may be added to the set S. The process may be repeated unless a stopping criterion is reached. A nearest neighbor for a submitted point in the set S may be identified and a candidate nearest neighbor may be output for the submitted point.
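    One possible reading of this iterative procedure is sketched below. Several choices are assumptions that the abstract leaves open: the generated point is the mean of the class's currently misclassified samples, an empty S misclassifies everything, and the stopping criterion is "no point added in a full round" or a round limit. The names `nearest_label` and `reduce_set` are hypothetical.

```python
import numpy as np

def nearest_label(S_points, S_labels, x):
    """Label of x's nearest neighbor in S, or None while S is empty."""
    if not S_points:
        return None
    dists = np.linalg.norm(np.asarray(S_points) - x, axis=1)
    return S_labels[int(dists.argmin())]

def reduce_set(X, y, max_rounds=50):
    """Grow a reduced set S from labeled samples (X, y)."""
    S_points, S_labels = [], []
    for _ in range(max_rounds):
        added = False
        for c in np.unique(y):
            members = X[y == c]
            # Samples of class c whose nearest neighbor in S has another label.
            wrong = [x for x in members
                     if nearest_label(S_points, S_labels, x) != c]
            if wrong:
                # Assumed rule: add the mean of those samples, labeled c.
                S_points.append(np.mean(wrong, axis=0))
                S_labels.append(c)
                added = True
        if not added:  # assumed stopping criterion
            break
    return np.asarray(S_points), np.asarray(S_labels)

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], float)
y = np.array([0, 0, 0, 1, 1, 1])
S_points, S_labels = reduce_set(X, y)
```

    On this toy data the two well-separated classes are each summarized by a single point, so nearest-neighbor queries run against two items instead of six.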

    37. EXTRACTING CARD DATA FROM MULTIPLE CARDS
    Type: patent application
    Status: active

    Publication No.: US20150371086A1

    Publication date: 2015-12-24

    Application No.: US14837605

    Filing date: 2015-08-27

    Applicant: GOOGLE INC.

    Abstract: Extracting financial card information with relaxed alignment comprises a method to receive an image of a card, determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones. The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of each of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio. The method then identifies a quadrilateral that matches the expected aspect ratio and performs an optical character recognition algorithm on the rectified model. A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.

    38. Shape-Gain Sketches for Fast Image Similarity Search
    Type: patent application
    Status: under examination (published)

    Publication No.: US20150169644A1

    Publication date: 2015-06-18

    Application No.: US13733335

    Filing date: 2013-01-03

    Applicant: Google Inc.

    CPC classification number: G06F16/532

    Abstract: Separately optimizing angle error and magnitude error of a search query entered into a query database may be referred to as the “shape-gain” separation quantization. Each of a direction and a magnitude for each of a plurality of database vectors may be separately encoded. A query vector may be received. The query vector may include a query direction and a query magnitude. The separately encoded query direction, query magnitude, and each of the separately encoded direction and magnitude for each of the plurality of database vectors may be combined. Distances between the query vector and each of the plurality of database vectors may be determined. At least one of the plurality of database vectors that is similar to the query vector may be identified based on the determined distances.
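    The separate encoding of direction ("shape") and magnitude ("gain") can be illustrated with a small sketch. The codebooks here are hand-picked for illustration and the function names are hypothetical; the abstract does not say how the codebooks are trained. Distances are recombined from the two parts with the identity ||q - g s||^2 = ||q||^2 + g^2 - 2 ||q|| g cos(q, s), which assumes unit-length shape codewords.

```python
import numpy as np

def encode_shape_gain(vectors, shape_codebook, gain_codebook):
    """Encode direction and magnitude of each vector separately."""
    gains = np.linalg.norm(vectors, axis=1)
    shapes = vectors / np.maximum(gains, 1e-12)[:, None]
    shape_idx = (shapes @ shape_codebook.T).argmax(axis=1)  # largest cosine
    gain_idx = np.abs(gains[:, None] - gain_codebook[None, :]).argmin(axis=1)
    return shape_idx, gain_idx

def approx_distances(query, shape_idx, gain_idx, shape_codebook, gain_codebook):
    """Distance from the query to each encoded vector, from shape and gain."""
    q_gain = np.linalg.norm(query)
    q_shape = query / max(q_gain, 1e-12)
    cos = shape_codebook @ q_shape          # one cosine per shape codeword
    g = gain_codebook[gain_idx]
    sq = q_gain ** 2 + g ** 2 - 2.0 * q_gain * g * cos[shape_idx]
    return np.sqrt(np.maximum(sq, 0.0))     # clamp tiny negative round-off

# Illustrative codebooks: four unit directions and three magnitudes.
shape_codebook = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
gain_codebook = np.array([1.0, 2.0, 4.0])
database = np.array([[2.0, 0.0], [0.0, 4.1], [-0.9, 0.0]])
shape_idx, gain_idx = encode_shape_gain(database, shape_codebook, gain_codebook)

query = np.array([0.0, 3.5])
dists = approx_distances(query, shape_idx, gain_idx, shape_codebook, gain_codebook)
best = int(dists.argmin())
```

    Because the cosines are computed once per shape codeword, the per-item work in this sketch is a lookup and a few scalar operations.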

    39. EXTRACTING CARD DATA USING IIN DATABASE
    Type: patent application
    Status: active

    Publication No.: US20150086069A1

    Publication date: 2015-03-26

    Application No.: US14559888

    Filing date: 2014-12-03

    Applicant: GOOGLE INC.

    Abstract: Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.

    40. EXTRACTING CARD DATA USING CARD ART
    Type: patent application
    Status: under examination (published)

    Publication No.: US20150006362A1

    Publication date: 2015-01-01

    Application No.: US14062655

    Filing date: 2013-10-24

    Applicant: GOOGLE INC.

    Abstract: Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.
