-
Publication No.: US07594013B2
Publication date: 2009-09-22
Application No.: US11136029
Filing date: 2005-05-24
Applicant: Jian Wang, Hua-Jun Zeng, Chenxi Lin, Zheng Chen, Benyu Zhang, Bing Sun
Inventor: Jian Wang, Hua-Jun Zeng, Chenxi Lin, Zheng Chen, Benyu Zhang, Bing Sun
IPC classification: G06F15/173
CPC classification: G06F17/3089
Abstract: A method of creating a personal home page containing information of interest assembled from various web sites. The method includes the partitioning of web pages into web blocks. Users may collect various web blocks from different web pages and utilize those web blocks to define the dynamic personal homepage. In addition, the web blocks may be tracked to update content in the personal home page based on corresponding changes in the original web page.
-
Publication No.: US20070005649A1
Publication date: 2007-01-04
Application No.: US11173098
Filing date: 2005-07-01
Applicant: Jian Wang, Fengping Zeng, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Chenxi Lin, Bing Sun
Inventor: Jian Wang, Fengping Zeng, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Chenxi Lin, Bing Sun
IPC classification: G06F17/00
CPC classification: G06F16/957
Abstract: The invention provides a method of creating contextual titles for web pages or documents. The method includes extracting phrases from a web page or document. The phrases are evaluated for use as contextual titles for the web page or document. Users utilize the contextual title to access the web page or document.
-
Publication No.: US20060271834A1
Publication date: 2006-11-30
Application No.: US11136029
Filing date: 2005-05-24
Applicant: Jian Wang, Hua-Jun Zeng, Chenxi Lin, Zheng Chen, Benyu Zhang, Bing Sun
Inventor: Jian Wang, Hua-Jun Zeng, Chenxi Lin, Zheng Chen, Benyu Zhang, Bing Sun
IPC classification: G06F17/00
CPC classification: G06F17/3089
Abstract: The invention provides a method of creating a personal home page containing information of interest assembled from various web sites. The method includes the partitioning of web pages into web blocks. Users may collect various web blocks from different web pages and utilize those web blocks to define the dynamic personal homepage. In addition, the web blocks may be tracked to update content in the personal home page based on corresponding changes in the original web page.
-
Publication No.: US07844449B2
Publication date: 2010-11-30
Application No.: US11392763
Filing date: 2006-03-30
Applicant: Chenxi Lin, Jie Han, Guirong Xue, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventor: Chenxi Lin, Jie Han, Guirong Xue, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
IPC classification: G06F17/27
CPC classification: G06F17/2785
Abstract: A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
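The two passes described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the group assignment `group_of` is supplied by the caller (any clustering method could produce it), topics are seeded from the group word counts plus a small hypothetical `smoothing` constant, and the second pass is read as a "fold-in" step that keeps the pass-one word distributions fixed. All function and parameter names are hypothetical.

```python
def em(counts, p_w_z, n_iter=30, update_words=True):
    """Run PLSA EM on counts = {obj: {word: n}}; return (p_w_z, p_z_d)."""
    k = len(p_w_z)
    p_z_d = {d: [1.0 / k] * k for d in counts}
    for _ in range(n_iter):
        acc_w = [{} for _ in range(k)]
        acc_d = {d: [0.0] * k for d in counts}
        for d, row in counts.items():
            for w, n in row.items():
                # E-step: posterior over latent classes for this (obj, word).
                post = [p_z_d[d][z] * p_w_z[z].get(w, 0.0) for z in range(k)]
                total = sum(post)
                if total == 0.0:
                    continue  # word has no mass under any class
                for z in range(k):
                    share = n * post[z] / total
                    acc_d[d][z] += share
                    acc_w[z][w] = acc_w[z].get(w, 0.0) + share
        # M-step: renormalise P(z|obj) and, unless frozen, P(word|z).
        for d in counts:
            s = sum(acc_d[d])
            if s > 0:
                p_z_d[d] = [v / s for v in acc_d[d]]
        if update_words:
            for z in range(k):
                s = sum(acc_w[z].values())
                if s > 0:
                    p_w_z[z] = {w: v / s for w, v in acc_w[z].items()}
    return p_w_z, p_z_d


def two_pass_plsa(counts, group_of, k, smoothing=0.01):
    # Pass 1: merge each group's objects into one pseudo-object, train on groups.
    grouped = {}
    for d, row in counts.items():
        merged = grouped.setdefault(group_of[d], {})
        for w, n in row.items():
            merged[w] = merged.get(w, 0) + n
    vocab = sorted({w for row in grouped.values() for w in row})
    seeds = list(grouped.values())
    init = []
    for z in range(k):  # assumed seeding: class z starts from group z's counts
        raw = {w: seeds[z % len(seeds)].get(w, 0) + smoothing for w in vocab}
        total = sum(raw.values())
        init.append({w: v / total for w, v in raw.items()})
    p_w_z, _ = em(grouped, init)
    # Pass 2: estimate P(z|object) per original object with P(word|z) frozen.
    _, p_z_d = em(counts, p_w_z, update_words=False)
    labels = {d: probs.index(max(probs)) for d, probs in p_z_d.items()}
    return labels, p_z_d


counts = {
    "d1": {"cat": 3, "dog": 2},
    "d2": {"dog": 4, "cat": 1},
    "d3": {"stock": 3, "bond": 2},
    "d4": {"bond": 4, "stock": 1},
}
group_of = {"d1": "g0", "d2": "g0", "d3": "g1", "d4": "g1"}
labels, p_z_d = two_pass_plsa(counts, group_of, k=2)
```

Training on the merged groups means the expensive EM loop that updates the word distributions runs over far fewer rows than there are objects; the per-object pass only re-estimates the small class vector for each object.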
-
Publication No.: US20070239554A1
Publication date: 2007-10-11
Application No.: US11377480
Filing date: 2006-03-16
Applicant: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
Inventor: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
IPC classification: G06Q30/00
CPC classification: G06F17/30867, G06Q30/0212, G06Q30/0601, G06Q30/0631
Abstract: Methods for determining a predictive rating are disclosed. In an embodiment, an active user is compared to a set of clusters. One or more of the clusters are determined to be most similar to the active user. From the one or more clusters, K users are determined to be most similar to the active user. Prior ratings for an item by the K users may be used to predict a rating for the item for the active user.
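The cluster-then-neighbour selection in this abstract might look like the sketch below. It is an illustration only: the abstract does not name a similarity measure, so the mean-absolute-gap `similarity`, the cluster `centroid`, the similarity-weighted average, and all parameter names (`n_clusters`, `k`) are assumptions.

```python
def similarity(a, b):
    """Similarity in [0, 1] from the mean absolute rating gap on shared items."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    gap = sum(abs(a[i] - b[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def centroid(members, ratings):
    """Mean rating per item across the users in one cluster."""
    totals, counts = {}, {}
    for user in members:
        for item, value in ratings[user].items():
            totals[item] = totals.get(item, 0.0) + value
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


def predict(active, item, ratings, clusters, n_clusters=1, k=2):
    """Predict the active user's rating for `item`, or None if no data."""
    # Step 1: keep the clusters whose centroid is closest to the active user.
    ranked = sorted(clusters,
                    key=lambda c: similarity(active, centroid(c, ratings)),
                    reverse=True)
    # Step 2: within those clusters, pick the K most similar users who rated the item.
    candidates = [u for c in ranked[:n_clusters] for u in c if item in ratings[u]]
    neighbours = sorted(candidates,
                        key=lambda u: similarity(active, ratings[u]),
                        reverse=True)[:k]
    # Step 3: similarity-weighted average of the neighbours' prior ratings.
    weights = [similarity(active, ratings[u]) for u in neighbours]
    if sum(weights) == 0:
        return None
    return sum(w * ratings[u][item] for w, u in zip(weights, neighbours)) / sum(weights)


ratings = {
    "ann": {"A": 5, "B": 4, "C": 5},
    "bob": {"A": 4, "B": 5, "C": 4},
    "cat": {"A": 1, "B": 2, "C": 1},
    "dan": {"A": 2, "B": 1, "C": 2},
}
clusters = [["ann", "bob"], ["cat", "dan"]]
active = {"A": 5, "B": 5}
prediction = predict(active, "C", ratings, clusters)
```

Comparing the active user against a handful of cluster centroids first keeps the per-user similarity search confined to the most promising clusters rather than the whole user base.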
-
Publication No.: US20070239431A1
Publication date: 2007-10-11
Application No.: US11392763
Filing date: 2006-03-30
Applicant: Chenxi Lin, Jie Han, Guirong Xue, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventor: Chenxi Lin, Jie Han, Guirong Xue, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
IPC classification: G06F17/27
CPC classification: G06F17/2785
Abstract: A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
-
Publication No.: US20070239553A1
Publication date: 2007-10-11
Application No.: US11377130
Filing date: 2006-03-16
Applicant: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
Inventor: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
IPC classification: G06Q30/00
CPC classification: G06Q30/0623, G06F16/9535, G06Q30/0242, G06Q30/0274
Abstract: In an embodiment, a method of predicting an active user's rating for an item is disclosed. A database of users may be sorted into clusters. The data associated with the users in each cluster may be smoothed to fill in ratings for items that the users have not personally rated. An active user may then be compared to a set of users, where the set may be all or some portion of the database, to determine the K users that are most similar to the active user. The ratings of the K users regarding the item may be used to predict the active user's rating for the item. In an embodiment, the rating of each of the K users is assigned a confidence value that reflects whether the user personally rated the item or the rating was generated by the data smoothing process.
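A rough sketch of the smoothing and confidence weighting described in this abstract is given below. The details are assumptions made for illustration: missing ratings are filled with the cluster mean for the item, the similarity measure is a mean absolute gap, and the confidence values `conf_own=1.0` and `conf_smoothed=0.5` are hypothetical; the abstract does not specify any of them.

```python
def smooth(ratings, clusters):
    """Fill each user's unrated items with the cluster mean for that item.

    Returns {user: {item: (rating, personally_rated)}}.
    """
    smoothed = {}
    for members in clusters:
        items = {i for u in members for i in ratings[u]}
        means = {}
        for i in items:
            values = [ratings[u][i] for u in members if i in ratings[u]]
            means[i] = sum(values) / len(values)
        for u in members:
            smoothed[u] = {
                i: (ratings[u][i], True) if i in ratings[u] else (means[i], False)
                for i in items
            }
    return smoothed


def similarity(active, profile):
    """Similarity in [0, 1] from the mean absolute gap on shared items."""
    shared = active.keys() & profile.keys()
    if not shared:
        return 0.0
    gap = sum(abs(active[i] - profile[i][0]) for i in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict(active, item, smoothed, k=2, conf_own=1.0, conf_smoothed=0.5):
    """Confidence-weighted average over the K most similar users."""
    candidates = [u for u in smoothed if item in smoothed[u]]
    top = sorted(candidates,
                 key=lambda u: similarity(active, smoothed[u]),
                 reverse=True)[:k]
    num = den = 0.0
    for u in top:
        rating, own = smoothed[u][item]
        # Smoothed ratings count for less than personally entered ones.
        weight = similarity(active, smoothed[u]) * (conf_own if own else conf_smoothed)
        num += weight * rating
        den += weight
    return num / den if den else None


ratings = {
    "ann": {"A": 5, "B": 4, "C": 5},
    "bob": {"A": 4, "B": 5},            # bob never rated C
    "eve": {"A": 5, "B": 5, "C": 3},
    "cat": {"A": 1, "B": 2, "C": 1},
}
clusters = [["ann", "bob", "eve"], ["cat"]]
smoothed = smooth(ratings, clusters)
prediction = predict({"A": 4, "B": 5}, "C", smoothed)
```

In this example bob is the closest neighbour, but his rating for C was generated by smoothing, so it is down-weighted relative to eve's personally entered rating.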
-
Publication No.: US20080016087A1
Publication date: 2008-01-17
Application No.: US11456753
Filing date: 2006-07-11
Applicant: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
Inventor: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
IPC classification: G06F7/00
CPC classification: G06F17/30864, Y10S707/99935
Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.
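The abstract does not say which tree matching algorithm is used. As one possible illustration, the sketch below applies the classic Simple Tree Matching dynamic program to tag trees and reports every subtree whose normalised score against a user-selected template reaches a threshold. The `(tag, children)` tuple representation, the `threshold` value, and the function names are assumptions; extraction of text content is omitted.

```python
def simple_tree_matching(a, b):
    """Size of the largest order-preserving match between two tag trees.

    Each tree is a (tag, [children]) tuple.
    """
    if a[0] != b[0]:
        return 0
    kids_a, kids_b = a[1], b[1]
    m = [[0] * (len(kids_b) + 1) for _ in range(len(kids_a) + 1)]
    for i in range(1, len(kids_a) + 1):
        for j in range(1, len(kids_b) + 1):
            m[i][j] = max(
                m[i - 1][j],
                m[i][j - 1],
                m[i - 1][j - 1] + simple_tree_matching(kids_a[i - 1], kids_b[j - 1]),
            )
    return m[len(kids_a)][len(kids_b)] + 1  # +1 for the matched roots


def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree[1])


def find_matches(page, template, threshold=0.7):
    """Return the subtrees of `page` that resemble `template`."""
    hits = []

    def walk(node):
        score = simple_tree_matching(node, template) / max(size(node), size(template))
        if score >= threshold:
            hits.append(node)
        else:
            for child in node[1]:
                walk(child)

    walk(page)
    return hits


# Template built from one record the user selected: <li><a/><span/></li>
template = ("li", [("a", []), ("span", [])])
page = ("ul", [
    ("li", [("a", []), ("span", [])]),               # same shape as the template
    ("li", [("a", []), ("span", []), ("img", [])]),  # one extra node
    ("li", [("p", [])]),                             # different shape
])
matches = find_matches(page, template)
```

Normalising by the larger tree's size lets a record with one extra node still count as similar, while a structurally different record is rejected.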
-
Publication No.: US08738467B2
Publication date: 2014-05-27
Application No.: US11377480
Filing date: 2006-03-16
Applicant: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
Inventor: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
IPC classification: G06Q30/00
CPC classification: G06F17/30867, G06Q30/0212, G06Q30/0601, G06Q30/0631
Abstract: Methods for determining a predictive rating are disclosed. In an embodiment, an active user is compared to a set of clusters. One or more of the clusters are determined to be most similar to the active user. From the one or more clusters, K users are determined to be most similar to the active user. Prior ratings for an item by the K users may be used to predict a rating for the item for the active user.
-
Publication No.: US07555480B2
Publication date: 2009-06-30
Application No.: US11456753
Filing date: 2006-07-11
Applicant: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
Inventor: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
CPC classification: G06F17/30864, Y10S707/99935
Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.