Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for efficiently performing linear projections. In one aspect, a method includes actions for obtaining a plurality of content items from one or more content sources. Additional actions include extracting a plurality of features from each of the plurality of content items, generating a feature vector for each of the extracted features in order to create a search space, generating a series of element matrices based upon the generated feature vectors, transforming the series of element matrices into a structured matrix such that the transformation preserves one or more relationships associated with each element matrix of the series of element matrices, receiving a search object, searching the enhanced search space based on the received search object, and providing one or more links to a content item that is responsive to the search object.
Abstract:
An application extracts a user name from a financial card image using optical character recognition ("OCR") and compares segments of the user name to names stored in user data to refine the extracted name. The application performs an OCR algorithm on a card image and compares an extracted name with user data. The application identifies stored names that likely match the extracted name. The OCR application breaks the extracted name into one or more series of segments and compares the segments from the extracted name to segments from the stored names. The OCR application determines an edit distance between the extracted name and each potentially matching stored name. If the edit distance is below a configured threshold, then the OCR application revises the extracted name to match the identified stored name. The refined name is presented to the user for verification.
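The edit-distance step described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes Levenshtein distance as the edit distance, it omits the segment-by-segment comparison, and the names `edit_distance`, `refine_name`, and the default `threshold` of 3 are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def refine_name(extracted: str, stored_names: list[str], threshold: int = 3) -> str:
    """Return the closest stored name if its distance is below the threshold.

    Comparison is case-insensitive (an assumption); otherwise the extracted
    name is returned unchanged.
    """
    def dist(name: str) -> int:
        return edit_distance(extracted.upper(), name.upper())

    best = min(stored_names, key=dist, default=None)
    if best is not None and dist(best) < threshold:
        return best
    return extracted
```

For example, an OCR result such as `"J0HN SM1TH"` differs from a stored `"John Smith"` by two substitutions, so with a threshold of 3 the stored spelling would replace it before being shown to the user for verification.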
Abstract:
Methods, systems, and apparatus, including computer program products, for generating data for annotating images automatically. In one aspect, a method includes receiving an input image, identifying one or more nearest neighbor images of the input image from among a collection of images, in which each of the one or more nearest neighbor images is associated with a respective one or more image labels, assigning a plurality of image labels to the input image, in which the plurality of image labels are selected from the image labels associated with the one or more nearest neighbor images, and storing in a data repository the input image having the assigned plurality of image labels. In another aspect, a method includes assigning a single image label to the input image, in which the single image label is selected from labels associated with multiple ranked nearest neighbor images.
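The label-transfer idea above can be illustrated with a small sketch. It assumes each image is already represented by a numeric feature vector, uses Euclidean distance to find neighbors, and picks labels by how often they occur among the neighbors; all three choices, and the names `nearest_neighbors` and `assign_labels`, are assumptions made for illustration rather than details given in the abstract.

```python
import math
from collections import Counter


def nearest_neighbors(query, collection, k):
    """Return the k entries of collection closest to query.

    collection is a list of (feature_vector, labels) pairs.
    """
    return sorted(collection, key=lambda item: math.dist(query, item[0]))[:k]


def assign_labels(query, collection, k=3, n_labels=2):
    """Assign the n_labels most frequent labels among the k nearest neighbors."""
    neighbors = nearest_neighbors(query, collection, k)
    counts = Counter(label for _, labels in neighbors for label in labels)
    return [label for label, _ in counts.most_common(n_labels)]
```

Calling `assign_labels` with `n_labels=1` gives the single-label variant, where one label is selected from the ranked neighbors.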
Abstract:
Comparing extracted card data from a continuous scan comprises receiving, by one or more computing devices, a digital scan of a card; obtaining a plurality of images of the card from the digital scan of the physical card; performing an optical character recognition algorithm on each of the plurality of images; comparing results of the application of the optical character recognition algorithm for each of the plurality of images; determining if a configured threshold of the results for each of the plurality of images match each other; and verifying the results when the results for each of the plurality of images match each other. A threshold confidence level for the extracted card data can be employed to determine the accuracy of the extraction. Data is further extracted from blended images and three-dimensional models of the card. Embossed text and holograms in the images may be used to prevent fraud.
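One way to read the comparison step is as an agreement check across frames. The sketch below assumes the OCR output of each image is available as a string and treats the "configured threshold" as the minimum fraction of frames that must agree; that interpretation, the function name `verify_results`, and the default of 0.8 are assumptions.

```python
from collections import Counter


def verify_results(ocr_results, threshold=0.8):
    """Check whether enough per-frame OCR results agree.

    Returns (most_common_value, verified), where verified is True when the
    share of frames yielding the most common value meets the threshold.
    """
    if not ocr_results:
        return None, False
    value, count = Counter(ocr_results).most_common(1)[0]
    return value, count / len(ocr_results) >= threshold
```

A caller might keep scanning and appending frames until `verified` becomes true, then present the agreed value.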
Abstract:
Embodiments herein provide computer-implemented techniques for allowing a user computing device to extract financial card information using optical character recognition ("OCR"). Extracting financial card information may be improved by applying various classifiers and other transformations to the image data. For example, applying a linear classifier to the image to determine digit locations before applying the OCR algorithm allows the user computing device to use less processing capacity to extract accurate card data. The OCR application may train a classifier to use the wear patterns of a card to improve OCR algorithm performance. The OCR application may apply a linear classifier and then a nonlinear classifier to improve the performance and the accuracy of the OCR algorithm. The OCR application uses the known digit patterns used by typical credit and debit cards to improve the accuracy of the OCR algorithm.
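As one illustration of using known digit patterns to check OCR output, the sketch below applies the Luhn checksum, which payment card numbers commonly satisfy. The abstract does not name this check, so its use here is an assumption, and `luhn_valid` is a hypothetical helper.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum.

    Non-digit characters such as spaces are ignored.
    """
    digits = [int(c) for c in number if c in "0123456789"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

An OCR pipeline could use such a check to reject a candidate reading in which a single digit was misrecognized, since any single-digit error changes the checksum.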
Abstract:
A video demographics analysis system selects a training set of videos to use to correlate viewer demographics and video content data. The video demographics analysis system extracts demographic data from viewer profiles related to videos in the training set and creates a set of demographic distributions, and also extracts video data from videos in the training set. The video demographics analysis system correlates the viewer demographics with the video data of videos viewed by that viewer. Using the prediction model produced by the machine learning process, a new video about which there is no a priori knowledge can be associated with a predicted demographic distribution specifying probabilities of the video appealing to different types of people within a given demographic category, such as people of different ages within an age demographic category.
Abstract:
Providing improved card art for display comprises receiving, by one or more computing devices, an image of a card and performing an image recognition algorithm on the image. The computing device identifies images represented on the card image and compares the identified images to an image database. The computing device determines a standard card art image associated with the identified image based at least in part on the comparison and associates the standard card art image with an account of a user, the account being associated with the card in the image. The computing device displays the standard card art as a representation of the account.
Abstract:
Extracting financial card information with relaxed alignment comprises a method to receive an image of a card (205), determine one or more edge finder zones in locations of the image, and identify lines in the one or more edge finder zones (210). The method further identifies one or more quadrilaterals formed by intersections of extrapolations of the identified lines, determines an aspect ratio of the one or more quadrilaterals, and compares the determined aspect ratios of the quadrilaterals to an expected aspect ratio (215). The method then identifies a quadrilateral that matches the expected aspect ratio (220) and performs an optical character recognition algorithm on the rectified model (230). A similar method is performed on multiple cards in an image. The results of the analysis of each of the cards are compared to improve accuracy of the data.
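The aspect-ratio comparison can be sketched as below. The expected ratio is taken from the ISO/IEC 7810 ID-1 card size (85.60 mm by 53.98 mm), which is an assumption about what "expected aspect ratio" refers to. The sketch also ignores perspective distortion by averaging opposite sides, and the names `aspect_ratio`, `best_match`, and the `tolerance` value are hypothetical.

```python
import math

# ISO/IEC 7810 ID-1 card dimensions in millimetres (assumed expected shape).
ID1_ASPECT = 85.60 / 53.98


def aspect_ratio(quad):
    """Long side over short side for a quadrilateral.

    quad holds four (x, y) corners ordered top-left, top-right,
    bottom-right, bottom-left. Opposite sides are averaged.
    """
    tl, tr, br, bl = quad
    width = (math.dist(tl, tr) + math.dist(bl, br)) / 2
    height = (math.dist(tl, bl) + math.dist(tr, br)) / 2
    short_side = min(width, height)
    if short_side == 0:
        return math.inf  # degenerate quadrilateral
    return max(width, height) / short_side


def best_match(quads, expected=ID1_ASPECT, tolerance=0.1):
    """Return the quadrilateral closest to the expected ratio, or None."""
    best = min(quads, key=lambda q: abs(aspect_ratio(q) - expected), default=None)
    if best is not None and abs(aspect_ratio(best) - expected) <= tolerance:
        return best
    return None
```

A fuller treatment would rectify each candidate with a perspective transform before measuring it; the averaging here is only a rough stand-in.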
Abstract:
Extracting card data comprises receiving, by one or more computing devices, a digital image of a card; performing an image recognition process on the digital representation of the card; identifying an image in the digital representation of the card; comparing the identified image to an image database comprising a plurality of images and determining that the identified image matches a stored image in the image database; determining a card type associated with the stored image and associating the card type with the card based on the determination that the identified image matches the stored image; and performing a particular optical character recognition algorithm on the digital representation of the card, the particular optical character recognition algorithm being based on the determined card type. Another example uses an issuer identification number to improve data extraction. Another example compares extracted data with user data to improve accuracy.
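The issuer identification number example can be sketched as a prefix lookup that then selects type-specific expectations for the OCR step. The prefixes below are the commonly published ranges for three networks and the digit groupings are typical layouts; which networks to cover, the function name `card_type_from_iin`, and the `EXPECTED_GROUPS` table are assumptions made for illustration.

```python
# Typical digit groupings printed on the card face, per card type (assumed).
EXPECTED_GROUPS = {
    "visa": (4, 4, 4, 4),
    "mastercard": (4, 4, 4, 4),
    "amex": (4, 6, 5),
}


def card_type_from_iin(number: str) -> str:
    """Guess the card type from the leading digits of a card number."""
    digits = "".join(c for c in number if c in "0123456789")
    if digits.startswith("4"):
        return "visa"
    if digits[:2] in {"34", "37"}:
        return "amex"
    if len(digits) >= 2 and 51 <= int(digits[:2]) <= 55:
        return "mastercard"
    if len(digits) >= 4 and 2221 <= int(digits[:4]) <= 2720:
        return "mastercard"
    return "unknown"


def expected_length(card_type: str):
    """Total digit count implied by the grouping, or None if unknown."""
    groups = EXPECTED_GROUPS.get(card_type)
    return sum(groups) if groups else None
```

Once the first digits are read, the expected grouping and length can constrain how the remaining characters are segmented and recognized.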