Thread-Based Incremental Web Forum Crawling
    41.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20100205168A1

    Publication date: 2010-08-12

    Application No.: US12368768

    Filing date: 2009-02-10

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: The incremental web forum crawling technique described herein employs a thread-wise strategy that takes into account thread-level statistics, for example, the number of replies and the frequency of replies, to estimate the activity trend of each thread. To extract such statistical information, the technique employs a simple yet very robust approach to extract the timestamp of each post in a discussion thread. It also employs a regression model to predict the time of the next post for each thread.
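
    The abstract does not say which regression model or features are used. As a minimal sketch only, assuming post timestamps have already been extracted, a least-squares line can be fitted to the gaps between consecutive posts and extrapolated one step ahead. The function name `predict_next_post` and the choice of a linear fit are illustrative assumptions, not the patented method.

```python
from datetime import datetime, timedelta


def predict_next_post(timestamps):
    """Estimate when a thread's next post will appear.

    Hypothetical model: ordinary least squares of the inter-post gap
    (in seconds) against the gap index, extrapolated by one step.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two posts to estimate a gap")
    ts = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    n = len(gaps)
    if n == 1:
        # A single gap gives no trend, so reuse it as the next gap.
        next_gap = gaps[0]
    else:
        xs = range(n)
        mean_x = sum(xs) / n
        mean_y = sum(gaps) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, gaps))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        next_gap = slope * n + intercept
    # A speeding-up thread could extrapolate below zero; clamp to "now".
    next_gap = max(next_gap, 0.0)
    return ts[-1] + timedelta(seconds=next_gap)
```

    A crawler could revisit threads in order of the predicted time, so that active threads are fetched more often than threads whose reply gaps keep growing.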

    User interface for viewing clusters of images
    42.
    Granted patent
    Legal status: In force

    Publication No.: US07644373B2

    Publication date: 2010-01-05

    Application No.: US11337945

    Filing date: 2006-01-23

    IPC classification: G06F3/048 G06F7/00

    Abstract: A method and system for providing a user interface for presenting images of clusters of an image search result are provided. The user interface system displays the search result in a cluster/view form using a cluster panel and a view panel. The cluster panel contains a cluster area for each cluster. The view panel may contain thumbnails of images of the search result in a list view or a mix view. When a user selects a cluster area from the cluster panel, the user interface system displays a list view of thumbnails for that cluster in the view panel. The user interface system may display a thumbnail list near a cluster area of the cluster panel. The thumbnail list contains mini-thumbnails of the images of the selected cluster. The user interface system may also display a detail view of an image in the view panel when a user selects an image.

    Systems and methods for indexing and retrieving images
    43.
    Patent application
    Legal status: Lapsed

    Publication No.: US20050100221A1

    Publication date: 2005-05-12

    Application No.: US10703300

    Filing date: 2003-11-07

    Abstract: Systems and methods for indexing and retrieving images are described herein. The systems and methods analyze an image to determine its texture moments. The pixels of the image are converted to gray scale. Textural attributes of the pixels are determined. The textural attributes are associated with the local texture of the pixels and are derived from coefficients of the discrete Fourier transform associated with the pixels. Statistical values associated with the textural attributes of the pixels are calculated. The texture moments of the image are determined from the statistical values.
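
    The pipeline in the abstract (gray scale, per-pixel DFT attributes, statistics) can be sketched as follows. The abstract does not define the neighbourhood, the number of coefficients, or the statistics, so the 8-neighbour ring, the five coefficient magnitudes, and the mean/variance pair used here are assumptions made for illustration, as are all function names.

```python
import cmath

# Offsets (dy, dx) of the 8 neighbours, walked clockwise from the top-left.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def to_gray(rgb):
    """Convert an (r, g, b) pixel to a luma value (ITU-R BT.601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def texture_moments(image, n_coeffs=5):
    """Return one (mean, variance) pair per DFT coefficient magnitude.

    image is a list of rows of (r, g, b) tuples. Border pixels are skipped
    because they lack a full ring of neighbours.
    """
    gray = [[to_gray(p) for p in row] for row in image]
    height, width = len(gray), len(gray[0])
    sums = [0.0] * n_coeffs
    squares = [0.0] * n_coeffs
    count = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ring = [gray[y + dy][x + dx] for dy, dx in RING]
            for k in range(n_coeffs):
                # k-th coefficient of the 8-point DFT of the neighbour ring.
                coeff = sum(
                    v * cmath.exp(-2j * cmath.pi * k * n / 8)
                    for n, v in enumerate(ring)
                ) / 8
                magnitude = abs(coeff)
                sums[k] += magnitude
                squares[k] += magnitude * magnitude
            count += 1
    if count == 0:
        raise ValueError("image must be at least 3x3 pixels")
    moments = []
    for k in range(n_coeffs):
        mean = sums[k] / count
        variance = max(squares[k] / count - mean * mean, 0.0)
        moments.append((mean, variance))
    return moments
```

    The resulting list of pairs can serve as a fixed-length feature vector, so two images can be compared by any vector distance when indexing or retrieving.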

    Text to image translation
    44.
    Granted patent

    Publication No.: US09678992B2

    Publication date: 2017-06-13

    Application No.: US13110282

    Filing date: 2011-05-18

    IPC classification: G06F17/30 G06F15/18

    Abstract: Techniques are described for online, real-time text-to-image translation suitable for virtually any submitted query. Semantic classes and associated analogous items for each of the semantic classes are determined for the submitted query. One or more requests are formulated that are associated with analogous items. The requests are used to obtain web-based images and associated surrounding text. The web-based images are used to obtain associated near-duplicate images. The surrounding text of images is analyzed to create high-quality text associated with each semantic class of the submitted query. One or more query-dependent classifiers are trained online in real time to remove noisy images. A scoring function is used to score the images. The images with the highest score are returned as a query response.

    Identification of duplicates within an image space
    45.
    Granted patent
    Legal status: In force

    Publication No.: US08995771B2

    Publication date: 2015-03-31

    Application No.: US13459777

    Filing date: 2012-04-30

    Abstract: Implementations for identifying duplicate images in an image space are described. An image space is partitioned into a plurality of coarse clusters based on signatures of the images within the image space. The signatures are determined from compact descriptors of the images. Refined clusters that include one or more images of an individual coarse cluster are created based on pair-wise comparisons of the compact descriptors of images in the coarse cluster, and the refined clusters are identified as sets of duplicate images. The refined clusters are grown by searching in similar coarse clusters for images to add to the refined clusters.
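
    The coarse-then-refined clustering can be illustrated with a small sketch. The abstract does not specify how signatures or descriptors are computed, so this example assumes descriptors are short lists of floats in [0, 1], derives a hypothetical signature by thresholding the first few values, and refines each coarse cluster with pair-wise Euclidean distance. The final growth step across similar coarse clusters is omitted.

```python
from collections import defaultdict
from itertools import combinations


def signature(descriptor, bits=4):
    """Hypothetical coarse signature: threshold the first few values at 0.5."""
    return tuple(1 if v >= 0.5 else 0 for v in descriptor[:bits])


def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def find_duplicates(descriptors, threshold=0.1):
    """Return sets of image ids judged to be duplicates.

    descriptors maps image id -> compact descriptor (list of floats).
    """
    # Coarse step: bucket images by signature so that pair-wise
    # comparison only happens inside each bucket.
    coarse = defaultdict(list)
    for image_id, desc in descriptors.items():
        coarse[signature(desc)].append(image_id)

    # Refined step: union-find over pairs closer than the threshold.
    parent = {image_id: image_id for image_id in descriptors}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for members in coarse.values():
        for a, b in combinations(members, 2):
            if distance(descriptors[a], descriptors[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for image_id in descriptors:
        groups[find(image_id)].add(image_id)
    return [group for group in groups.values() if len(group) > 1]
```

    Bucketing first keeps the number of pair-wise comparisons proportional to the bucket sizes rather than to the square of the whole collection, which is the practical reason for a two-stage design.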

    Text to Image Translation
    46.
    Patent application
    Legal status: In force

    Publication No.: US20120296897A1

    Publication date: 2012-11-22

    Application No.: US13110282

    Filing date: 2011-05-18

    IPC classification: G06F17/30

    Abstract: Techniques are described for online, real-time text-to-image translation suitable for virtually any submitted query. Semantic classes and associated analogous items for each of the semantic classes are determined for the submitted query. One or more requests are formulated that are associated with analogous items. The requests are used to obtain web-based images and associated surrounding text. The web-based images are used to obtain associated near-duplicate images. The surrounding text of images is analyzed to create high-quality text associated with each semantic class of the submitted query. One or more query-dependent classifiers are trained online in real time to remove noisy images. A scoring function is used to score the images. The images with the highest score are returned as a query response.

    WEB FORUM CRAWLING USING SKELETAL LINKS
    47.
    Patent application
    Legal status: In force

    Publication No.: US20120117052A1

    Publication date: 2012-05-10

    Application No.: US13351952

    Filing date: 2012-01-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A method and system for identifying informative links of a web site for use in crawling the web site are provided. A forum crawler analyzes sample web pages of a web forum to identify informative links and then crawls the web forum by following links determined to be informative and not following other links. The forum crawler system determines whether links are informative based on whether they are part of the overall structure of the web site or are used to select sequential information that has been split onto multiple web pages.

    BUILDING A PERSON PROFILE DATABASE
    48.
    Patent application
    Legal status: In force

    Publication No.: US20120114197A1

    Publication date: 2012-05-10

    Application No.: US12942284

    Filing date: 2010-11-09

    IPC classification: G06K9/00

    Abstract: Names of entities, such as people, in an image may be identified automatically. Visually similar images of entities are retrieved, including text proximate to the visually similar images. The collected text is mined for names of entities, and the detected names are analyzed. A name may be associated with the entity in the image, based on the analysis.
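
    The name-mining step can be sketched as a simple vote over the text collected around visually similar images. The abstract does not describe the name detector or the analysis, so the capitalised-word regular expression and the frequency vote below are stand-in assumptions; `most_likely_name` and `min_support` are hypothetical names.

```python
import re
from collections import Counter

# Crude stand-in for a name detector: two or more capitalised words in a row.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def most_likely_name(surrounding_texts, min_support=2):
    """Pick the candidate name that appears near the most similar images.

    surrounding_texts holds one string per visually similar image.
    Returns None when no candidate reaches min_support images.
    """
    counts = Counter()
    for text in surrounding_texts:
        # Count each candidate once per image so one page cannot dominate.
        counts.update(set(NAME_PATTERN.findall(text)))
    if not counts:
        return None
    name, support = counts.most_common(1)[0]
    return name if support >= min_support else None
```

    Requiring support from several images is one way to keep a single mislabelled page from assigning the wrong name to the query image.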

    Advertising Method for Image Search
    50.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20100169178A1

    Publication date: 2010-07-01

    Application No.: US12344295

    Filing date: 2008-12-26

    IPC classification: G06Q30/00 G06F17/30

    Abstract: A method for advertising in response to an image search is described. One or more keywords may be received. The keywords may be for searching one or more images on the network. The images may be retrieved based on the keywords. One or more advertisements may be selected based on a first visual content of the images and a second visual content of the one or more advertisements. One or more of the selected advertisements may be displayed.
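
    Selecting advertisements by comparing visual content can be sketched as a nearest-neighbour ranking over feature vectors. The abstract does not name the visual features or the similarity measure, so this example assumes each image and advertisement is already represented by a numeric vector and uses cosine similarity; `select_ads` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_ads(image_features, ad_features, top_k=1):
    """Rank ads by their best visual match to any retrieved image.

    image_features is a list of vectors for the retrieved images;
    ad_features maps ad id -> vector for the ad's visual content.
    """
    scores = {
        ad_id: max((cosine(vec, img) for img in image_features), default=0.0)
        for ad_id, vec in ad_features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

    Scoring each advertisement by its best match, rather than its average match, favours ads that closely resemble at least one result image; averaging would be an equally reasonable choice.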