Accurate text classification through selective use of image data
    1.
    Granted patent
    Accurate text classification through selective use of image data
    Legal status: in force

    Publication No.: US08768050B2

    Publication date: 2014-07-01

    Application No.: US13158484

    Filing date: 2011-06-13

    IPC classification: G06K9/62

    Abstract: Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text-and-image “training sets” to improve automated classifiers, including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest.
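
The selective use of image signals described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `classify_product`, the `margin` threshold, and the convention that a three-way image classifier returns one of the two confused categories or the string "other" are all assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Mapping, Sequence

# Hypothetical three-way image classifier: returns one of the two
# confused categories, or "other" when the image supports neither.
ImageClassifier = Callable[[Sequence[float]], str]


def classify_product(
    text_scores: Mapping[str, float],
    image_features: Sequence[float],
    confusing_pairs: Mapping[FrozenSet[str], ImageClassifier],
    margin: float = 0.15,
) -> str:
    """Return a category, consulting an image classifier only when the
    text classifier is torn between a known pair of confusing categories.

    Assumes text_scores holds at least two categories.
    """
    ranked = sorted(text_scores, key=text_scores.get, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    pair = frozenset((top, runner_up))
    image_clf = confusing_pairs.get(pair)

    # Text is decisive, or no image classifier was trained for this pair.
    if image_clf is None or text_scores[top] - text_scores[runner_up] > margin:
        return top

    verdict = image_clf(image_features)
    # An "other" verdict gives no help, so fall back to the text decision.
    return verdict if verdict in pair else top
```

In this sketch the image classifier is consulted only for category pairs listed in `confusing_pairs`, so the cost of image analysis is paid only where the text signal is ambiguous.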

    ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA
    2.
    Patent application
    ACCURATE TEXT CLASSIFICATION THROUGH SELECTIVE USE OF IMAGE DATA
    Legal status: in force

    Publication No.: US20120314941A1

    Publication date: 2012-12-13

    Application No.: US13158484

    Filing date: 2011-06-13

    IPC classification: G06K9/62

    Abstract: Product images are used in conjunction with textual descriptions to improve classifications of product offerings. By combining cues from both text and image descriptions associated with products, implementations enhance both the precision and recall of product description classifications within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text-and-image “training sets” to improve automated classifiers, including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals to improve upon those specific areas where text-only classification is weakest.

    MATCHING TEXT TO IMAGES
    3.
    Patent application
    MATCHING TEXT TO IMAGES
    Legal status: in force

    Publication No.: US20120163707A1

    Publication date: 2012-06-28

    Application No.: US12979375

    Filing date: 2010-12-28

    IPC classification: G06K9/62

    Abstract: Text in web pages or other text documents may be classified based on the images or other objects within the webpage. A system for identifying and classifying text related to an object may identify one or more web pages containing the image or similar images, determine topics from the text of the document, and develop a set of training phrases for a classifier. The classifier may be trained and then used to analyze the text in the documents. The training set may include both positive examples and negative examples of text taken from the set of documents. A positive example may include captions or other elements directly associated with the object, while negative examples may include text taken from the documents but located far from the object. In some cases, the system may iterate on the classification process to refine the results.
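
As a rough illustration of how the positive and negative training examples described above might be collected, the sketch below labels text blocks by their distance from the image. The `TextBlock` type, the integer `distance` measure, and the `near` and `far` thresholds are hypothetical; the abstract does not specify how distance is measured.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class TextBlock:
    text: str
    distance: int             # assumed: layout elements between the block and the image
    is_caption: bool = False  # assumed: block is directly attached to the image


def build_training_set(
    blocks: Iterable[TextBlock], near: int = 1, far: int = 10
) -> Tuple[List[str], List[str]]:
    """Split text blocks into positive and negative training phrases.

    Captions and text adjacent to the image become positives; text far
    from the image becomes negatives; everything in between is skipped
    because its relation to the image is ambiguous.
    """
    positives: List[str] = []
    negatives: List[str] = []
    for block in blocks:
        if block.is_caption or block.distance <= near:
            positives.append(block.text)
        elif block.distance >= far:
            negatives.append(block.text)
    return positives, negatives
```

The resulting lists could then be fed to any text classifier, and the split could be recomputed on later iterations as the classifier's output refines which blocks count as related.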

    Matching text to images
    5.
    Granted patent

    Publication No.: US08503769B2

    Publication date: 2013-08-06

    Application No.: US12979375

    Filing date: 2010-12-28

    IPC classification: G06K9/62 G06F17/00

    Abstract: Text in web pages or other text documents may be classified based on the images or other objects within the webpage. A system for identifying and classifying text related to an object may identify one or more web pages containing the image or similar images, determine topics from the text of the document, and develop a set of training phrases for a classifier. The classifier may be trained and then used to analyze the text in the documents. The training set may include both positive examples and negative examples of text taken from the set of documents. A positive example may include captions or other elements directly associated with the object, while negative examples may include text taken from the documents but located far from the object. In some cases, the system may iterate on the classification process to refine the results.

    Vouching for user account using social networking relationship
    6.
    Granted patent
    Vouching for user account using social networking relationship
    Legal status: in force

    Publication No.: US08745738B2

    Publication date: 2014-06-03

    Application No.: US13350806

    Filing date: 2012-01-15

    Abstract: Trusted user accounts of an application provider are determined. Graphs, such as trees, are created with each node corresponding to a trusted account. Each of the nodes is associated with a vouching quota, or the nodes may share a vouching quota. Untrusted user accounts are determined. For each of these untrusted accounts, a trusted user account that has a social networking relationship with it is determined. If the node corresponding to the trusted user account has enough vouching quota to vouch for the untrusted user account, then the quota is debited, a node for the untrusted user account is added to the graph, and the untrusted user account is vouched for. If not, available vouching quota may be borrowed from other nodes in the graph.
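
The quota bookkeeping described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the `VouchGraph` class, the fixed `cost` of one unit per vouch, the zero starting quota for newly vouched accounts, and the rule of borrowing from any other node in insertion order are all hypothetical.

```python
from typing import Dict, Optional


class VouchGraph:
    """Tree of accounts in which each node carries a vouching quota."""

    def __init__(self) -> None:
        self.quota: Dict[str, int] = {}
        self.parent: Dict[str, Optional[str]] = {}

    def add_trusted(self, account: str, quota: int) -> None:
        """Register a root-level trusted account with its own quota."""
        self.quota[account] = quota
        self.parent[account] = None

    def vouch(self, trusted: str, untrusted: str, cost: int = 1) -> bool:
        """Try to vouch for `untrusted` through its trusted contact.

        Returns True and adds a child node on success, False otherwise.
        """
        if trusted not in self.quota or untrusted in self.quota:
            return False

        shortfall = cost - self.quota[trusted]
        if shortfall > 0:
            donors = [a for a in self.quota if a != trusted]
            if sum(self.quota[a] for a in donors) < shortfall:
                return False  # not enough quota anywhere in the graph
            # Borrow from other nodes until the shortfall is covered.
            for donor in donors:
                take = min(self.quota[donor], shortfall)
                self.quota[donor] -= take
                self.quota[trusted] += take
                shortfall -= take
                if shortfall == 0:
                    break

        self.quota[trusted] -= cost
        self.quota[untrusted] = 0  # assumption: new accounts start with no quota
        self.parent[untrusted] = trusted
        return True
```

Because every successful vouch debits quota somewhere in the graph, the total number of accounts that can be vouched for is bounded by the quota initially granted to the trusted accounts.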

    OPTIMIZING DATA PARTITIONING FOR DATA-PARALLEL COMPUTING
    7.
    Patent application
    OPTIMIZING DATA PARTITIONING FOR DATA-PARALLEL COMPUTING
    Legal status: in force

    Publication No.: US20130152057A1

    Publication date: 2013-06-13

    Application No.: US13325049

    Filing date: 2011-12-13

    IPC classification: G06F9/44

    CPC classification: G06F8/453

    Abstract: Given a data-parallel program and a large input dataset, a data partitioning plan is automatically generated, without first running the program on the input dataset, that substantially optimizes the performance of a distributed execution system. The system explicitly measures and infers various properties of both data and computation to perform cost estimation and optimization. Estimation may comprise inferring the cost of a candidate data partitioning plan, and optimization may comprise generating an optimal partitioning plan based on the estimated costs of computation and input/output.
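
The estimate-then-optimize loop in this abstract can be illustrated with a toy cost model. Everything below is an assumption made for illustration: the modulo partitioner, the integer sample keys, the per-record and per-partition cost constants, and the function names are not taken from the patent.

```python
from collections import Counter
from typing import Sequence


def estimate_cost(
    sample_keys: Sequence[int],
    num_partitions: int,
    cpu_cost_per_record: float = 1.0,
    io_cost_per_partition: float = 10.0,
) -> float:
    """Estimate the cost of hash-partitioning a non-empty key sample.

    Computation cost is driven by the most heavily loaded partition
    (the straggler); I/O cost grows with the number of partitions.
    """
    loads = Counter(key % num_partitions for key in sample_keys)
    compute = max(loads.values()) * cpu_cost_per_record
    io = num_partitions * io_cost_per_partition
    return compute + io


def best_partition_count(sample_keys: Sequence[int], candidates: Sequence[int]) -> int:
    """Pick the candidate partition count with the lowest estimated cost."""
    return min(candidates, key=lambda n: estimate_cost(sample_keys, n))
```

The point of the sketch is that candidate plans are compared using a sample and a model, so no candidate has to be executed on the full dataset before one is chosen.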

    Partition min-hash for partial-duplicate image determination
    8.
    Granted patent
    Partition min-hash for partial-duplicate image determination
    Legal status: in force

    Publication No.: US08452106B2

    Publication date: 2013-05-28

    Application No.: US12729250

    Filing date: 2010-03-23

    IPC classification: G06K9/66

    CPC classification: G06K9/6202 G06K9/4642

    Abstract: Images in a database or collection of images are each divided into multiple partitions, with each partition corresponding to an area of an image. The partitions in an image may overlap with each other. Min-hash sketches are generated for each of the partitions and stored with the images. A user may submit an image and request that an image that is a partial match for the submitted image be located in the image collection. The submitted image is similarly divided into partitions, and min-hash sketches are generated from the partitions. The min-hash sketches are compared with the stored min-hash sketches for matches, and images having partitions whose sketches match are returned as partial matching images.
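
A minimal sketch of the partition-then-sketch idea follows. It assumes each image has already been reduced to features of the form (x, y, visual word id) with coordinates in [0, 1), uses a non-overlapping grid for brevity even though the abstract allows overlapping partitions, and uses simple linear hash functions; all names and parameters are hypothetical.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple

Feature = Tuple[float, float, int]  # (x, y, visual_word_id), x and y in [0, 1)
Sketch = Tuple[int, ...]
PRIME = 2_147_483_647  # modulus for the linear hash functions


def make_hashes(k: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Create k hash functions h(w) = (a * w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]


def partition_sketches(
    features: Iterable[Feature], grid: int, hashes: List[Tuple[int, int]]
) -> Set[Sketch]:
    """Split an image into grid x grid cells and min-hash each cell."""
    cells: Dict[Tuple[int, int], Set[int]] = {}
    for x, y, word in features:
        cell = (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        cells.setdefault(cell, set()).add(word)
    # One sketch per non-empty cell: the minimum of each hash over its words.
    return {
        tuple(min((a * w + b) % PRIME for w in words) for a, b in hashes)
        for words in cells.values()
    }


def is_partial_duplicate(query: Set[Sketch], stored: Set[Sketch]) -> bool:
    """Two images partially match if any pair of partition sketches is equal."""
    return not query.isdisjoint(stored)
```

Sketching each partition separately means that a shared region can be detected even when the rest of the two images differs, which a single whole-image sketch would tend to miss.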

    User interface for three-dimensional navigation
    9.
    Granted patent
    User interface for three-dimensional navigation
    Legal status: in force

    Publication No.: US08276088B2

    Publication date: 2012-09-25

    Application No.: US11827530

    Filing date: 2007-07-11

    IPC classification: G06F3/048

    Abstract: The present invention uses invisible junctions, which are a set of local features unique to every page of the electronic document, to match the captured image to a part of an electronic document. The present invention includes an image capture device, a feature extraction and recognition system, and a database. When an electronic document is printed, the feature extraction and recognition system captures an image of the document page. The features in the captured image are then extracted, indexed, and stored in the database. Given a query image, usually a small patch of some document page captured by a low-resolution image capture device, the features in the query image are extracted and compared against those stored in the database to identify the query image. The present invention also includes methods for recognizing and tracking the viewing region and look-at point corresponding to the input query image. This information is combined with a rendering of the original input document to generate a new graphical user interface for the user. This user interface can be displayed on a conventional browser or even on the display of an image capture device.

    Synthetic image and video generation from ground truth data
    10.
    Granted patent
    Synthetic image and video generation from ground truth data
    Legal status: in force

    Publication No.: US08238609B2

    Publication date: 2012-08-07

    Application No.: US13168638

    Filing date: 2011-06-24

    IPC classification: G06K9/00

    Abstract: A system and a method are disclosed for generating video. Object information is received. A path of motion of the object relative to a reference point is generated. A series of images and ground for a reference frame are generated from the ground truth and the generated path. A system and a method are disclosed for generating an image. Object information is received. Image data and ground truth may be generated using position, the image description, the camera characteristics, and image distortion parameters. A positional relationship between the document and a reference point is determined. An image of the document and ground truth are generated from the object information and the positional relationship and in response to a user-specified environment of the document.
