-
Publication No.: US08458115B2
Publication date: 2013-06-04
Application No.: US12796303
Filing date: 2010-06-08
Applicant: Rui Cai, Qiang Hao, Changhu Wang, Rong Xiao, Lei Zhang
Inventor: Rui Cai, Qiang Hao, Changhu Wang, Rong Xiao, Lei Zhang
CPC classification: G06F17/30707
Abstract: Described herein is a technology that facilitates efficient automated mining of topic-related aspects of user generated content based on automated analysis of the user generated content. Locations are automatically learned based on dividing documents into document segments, and decomposing the segments into local topics and global topics. Techniques described herein include, for example, computer annotating travelogues with learned tags, performing topic learning to obtain an interest model, and performing location matching based on the interest model.
-
Publication No.: US08954425B2
Publication date: 2015-02-10
Application No.: US12796345
Filing date: 2010-06-08
Applicant: Rong Xiao, Qiang Hao, Changhu Wang, Rui Cai, Lei Zhang
Inventor: Rong Xiao, Qiang Hao, Changhu Wang, Rui Cai, Lei Zhang
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/30241
Abstract: Described herein is a technology that facilitates efficient automated mining of topic-related aspects of user-generated content based on automated analysis of the user-generated content. Locations are automatically learned based on dividing documents into document segments, and decomposing the segments into local topics and global topics. Techniques are described that facilitate automatically extracting snippets. These techniques include, for example, computer annotating travelogues with learned tags and images, performing topic learning to obtain an interest model, performing location matching based on the interest model, calculating geographic and semantic relevance scores, ranking snippets based on the geographic and semantic relevance scores, and searching snippets with a “location+context term” query.
-
Publication No.: US20110302162A1
Publication date: 2011-12-08
Application No.: US12796345
Filing date: 2010-06-08
Applicant: Rong Xiao, Qiang Hao, Changhu Wang, Rui Cai, Lei Zhang
Inventor: Rong Xiao, Qiang Hao, Changhu Wang, Rui Cai, Lei Zhang
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/30241
Abstract: Described herein is a technology that facilitates efficient automated mining of topic-related aspects of user-generated content based on automated analysis of the user-generated content. Locations are automatically learned based on dividing documents into document segments, and decomposing the segments into local topics and global topics. Techniques are described that facilitate automatically extracting snippets. These techniques include, for example, computer annotating travelogues with learned tags and images, performing topic learning to obtain an interest model, performing location matching based on the interest model, calculating geographic and semantic relevance scores, ranking snippets based on the geographic and semantic relevance scores, and searching snippets with a “location+context term” query.
-
Publication No.: US20110302124A1
Publication date: 2011-12-08
Application No.: US12796303
Filing date: 2010-06-08
Applicant: Rui Cai, Qiang Hao, Changhu Wang, Rong Xiao, Lei Zhang
Inventor: Rui Cai, Qiang Hao, Changhu Wang, Rong Xiao, Lei Zhang
CPC classification: G06F17/30707
Abstract: Described herein is a technology that facilitates efficient automated mining of topic-related aspects of user generated content based on automated analysis of the user generated content. Locations are automatically learned based on dividing documents into document segments, and decomposing the segments into local topics and global topics. Techniques described herein include, for example, computer annotating travelogues with learned tags, performing topic learning to obtain an interest model, and performing location matching based on the interest model.
-
Publication No.: US08856129B2
Publication date: 2014-10-07
Application No.: US13237142
Filing date: 2011-09-20
IPC classification: G06F17/30
CPC classification: G06F17/30864, G06F17/3071
Abstract: This document describes techniques that label text nodes of a seed site for each of a plurality of verticals. Once a seed site is labeled for a given vertical, the techniques extract features from the labeled text nodes of the seed site. The techniques learn vertical knowledge for the seed site based on the human labels and the extracted features, and adapt the learned vertical knowledge to a new web site to automatically and accurately identify attributes and extract attribute values targeted within a given vertical for structured web data extraction.
-
Publication No.: US20140029856A1
Publication date: 2014-01-30
Application No.: US13561718
Filing date: 2012-07-30
IPC classification: G06K9/46
CPC classification: G06K9/46, G06K9/4676, G06K9/469
Abstract: The techniques discussed herein discover three-dimensional (3-D) visual phrases for an object based on a 3-D model of the object. The techniques then describe the 3-D visual phrases. Once described, the techniques use the 3-D visual phrases to detect the object in an image (e.g., object recognition).
-
Publication No.: US20130073514A1
Publication date: 2013-03-21
Application No.: US13237142
Filing date: 2011-09-20
IPC classification: G06F17/30
CPC classification: G06F17/30864, G06F17/3071
Abstract: This document describes techniques that label text nodes of a seed site for each of a plurality of verticals. Once a seed site is labeled for a given vertical, the techniques extract features from the labeled text nodes of the seed site. The techniques learn vertical knowledge for the seed site based on the human labels and the extracted features, and adapt the learned vertical knowledge to a new web site to automatically and accurately identify attributes and extract attribute values targeted within a given vertical for structured web data extraction.
-
Publication No.: US20120301014A1
Publication date: 2012-11-29
Application No.: US13118282
Filing date: 2011-05-27
CPC classification: G06K9/4676, G06F16/583
Abstract: Tools and techniques for learning to rank local interest points from images using a data-driven scale-invariant feature transform (SIFT) approach termed “Rank-SIFT” are described herein. Rank-SIFT provides a flexible framework to select stable local interest points using supervised learning. A Rank-SIFT application detects interest points, learns differential features, and implements ranking model training in the Gaussian scale space (GSS). In various implementations, a stability score is calculated for ranking the local interest points by extracting features from the GSS and characterizing the local interest points based on the features being extracted from the GSS across images containing the same visual objects.
-
Publication No.: US08983201B2
Publication date: 2015-03-17
Application No.: US13561718
Filing date: 2012-07-30
CPC classification: G06K9/46, G06K9/4676, G06K9/469
Abstract: The techniques discussed herein discover three-dimensional (3-D) visual phrases for an object based on a 3-D model of the object. The techniques then describe the 3-D visual phrases. Once described, the techniques use the 3-D visual phrases to detect the object in an image (e.g., object recognition).
-
Publication No.: US09495453B2
Publication date: 2016-11-15
Application No.: US13114643
Filing date: 2011-05-24
Applicant: Rui Cai, Xiaodong Fan, Lei Zhang
Inventor: Rui Cai, Xiaodong Fan, Lei Zhang
CPC classification: G06F17/30864, G06F17/30705, G06F17/30861, G06F17/3089, G06F17/30899
Abstract: Web crawling policies are generated based on user web browsing statistics. User browsing statistics are aggregated at the granularity of resource identifier patterns (such as URL patterns) that denote groups of resources within a particular domain or website that share syntax at a certain level of granularity. The web crawl policies rank the resource identifier patterns according to their associated aggregated user browsing statistics. A crawl ordering defined by the web crawl policies is used to download and discover new resources within a domain or website.
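Illustrative note (not part of the patent record): the idea in the abstract of US09495453B2, aggregating browsing statistics per URL pattern and crawling in pattern-rank order, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented method: the pattern rule (replacing digit runs in the path with `*`), the visit counts, and the function names `url_pattern`, `rank_patterns`, and `crawl_order` are all hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse


def url_pattern(url: str) -> str:
    """Collapse a URL into a coarse pattern (assumed rule: digit runs in the path become '*')."""
    parsed = urlparse(url)
    path = re.sub(r"\d+", "*", parsed.path)
    return f"{parsed.netloc}{path}"


def rank_patterns(visits):
    """Sum (url, visit_count) pairs per pattern and return patterns, highest total first."""
    totals = defaultdict(int)
    for url, count in visits:
        totals[url_pattern(url)] += count
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


def crawl_order(candidates, ranked):
    """Order candidate URLs by the rank of their pattern; unseen patterns go last."""
    position = {pattern: i for i, (pattern, _) in enumerate(ranked)}
    return sorted(candidates, key=lambda u: position.get(url_pattern(u), len(position)))


if __name__ == "__main__":
    # Hypothetical browsing statistics: (URL, number of user visits).
    visits = [
        ("http://example.com/item/12", 5),
        ("http://example.com/item/99", 7),
        ("http://example.com/about", 3),
        ("http://example.com/tag/4", 1),
    ]
    ranked = rank_patterns(visits)
    print(ranked)
    print(crawl_order(
        ["http://example.com/tag/8", "http://example.com/new/1", "http://example.com/item/3"],
        ranked,
    ))
```

In this sketch, URLs whose pattern has attracted more user visits are downloaded first, and URLs matching no known pattern are left for last; a real crawl policy would likely use richer statistics and pattern definitions than these.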