Clustering based text classification
    1.
    Patent application
    Clustering based text classification (status: in force)

    Publication number: US20050234955A1

    Publication date: 2005-10-20

    Application number: US10921477

    Filing date: 2004-08-16

    IPC classification: G06F17/30

    CPC classification: G06F17/3071

    Abstract: Systems and methods for clustering-based text classification are described. In one aspect, text is clustered as a function of labeled data to generate one or more clusters. The text includes the labeled data and unlabeled data. Expanded labeled data is then generated as a function of the clusters. The expanded labeled data includes the labeled data and at least a portion of the unlabeled data. One or more discriminative classifiers are then trained based on the expanded labeled data and the remaining unlabeled data.
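
The pipeline in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the bag-of-words vectors, the one-centroid-per-class clustering, the `min_sim` threshold, and the perceptron (standing in for the discriminative classifier) are all assumptions. The sketch returns the remaining unlabeled documents but does not use them in training, which the abstract says the real method does.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def expand_labeled(labeled, unlabeled, min_sim=0.3):
    """Cluster around the labeled data, then label nearby unlabeled docs."""
    # Assumption: one cluster centroid per class, seeded by the labeled docs.
    centroids = {}
    for text, label in labeled:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    expanded, remaining = list(labeled), []
    for text in unlabeled:
        vec = vectorize(text)
        label, sim = max(((lab, cosine(vec, c)) for lab, c in centroids.items()),
                         key=lambda pair: pair[1])
        if sim >= min_sim:
            expanded.append((text, label))   # close enough: adopt the cluster label
        else:
            remaining.append(text)           # stays unlabeled
    return expanded, remaining

def predict(weights, vec):
    """Return the label whose weight vector scores the document highest."""
    return max(sorted(weights),
               key=lambda lab: sum(weights[lab][t] * v for t, v in vec.items()))

def train_perceptron(examples, epochs=10):
    """Multi-class perceptron, a simple stand-in discriminative classifier."""
    weights = {label: Counter() for _, label in examples}
    for _ in range(epochs):
        for text, label in examples:
            vec = vectorize(text)
            guess = predict(weights, vec)
            if guess != label:
                for term, count in vec.items():
                    weights[label][term] += count
                    weights[guess][term] -= count
    return weights

labeled = [("stock market shares rise", "finance"),
           ("football match goal score", "sports")]
unlabeled = ["market shares fall", "goal in the football final",
             "recipe for tomato soup"]
expanded, remaining = expand_labeled(labeled, unlabeled)
weights = train_perceptron(expanded)
print(predict(weights, vectorize("football goal")))
```

Making use of `remaining` during training would need a semi-supervised learner, which is outside the scope of this sketch.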

    Web searching
    2.
    Granted patent
    Web searching (status: lapsed)

    Publication number: US07836058B2

    Publication date: 2010-11-16

    Application number: US12056302

    Filing date: 2008-03-27

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Mislabeled URLs are identified and corrected based upon a click relevance ranking computed from user data comprising user click information. The click relevance ranking is formed by applying a set of relevance ordering rules to user log data aggregated by query and URL, and by mapping the results of the relevance ordering rules into a linear ordering. For a given query, the aggregated user log data comprises a relative total number of impressions, a relative total number of clicks received, and the rank associated with the query/URL pair at the time those impressions and clicks were recorded. The click relevance ranking is used to identify and correct mislabeled query/URL pairs of other rankings according to a number of disclosed methods.
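
One way to picture "ordering rules mapped into a linear ordering" is the sketch below. The abstract does not state the actual rules, so the single rule in `prefers`, the `LogRow` fields, and the win-count tie-breaking are hypothetical illustrations only.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class LogRow:
    """Log data aggregated for one query/URL pair (hypothetical layout)."""
    url: str
    impressions: int
    clicks: int
    rank: int  # position at which the URL was shown, 1 = top

def ctr(row):
    """Click-through rate, guarding against zero impressions."""
    return row.clicks / row.impressions if row.impressions else 0.0

def prefers(a, b):
    """Hypothetical rule: a beats b if it earns a higher click-through rate
    despite being shown at the same or a lower position."""
    return ctr(a) > ctr(b) and a.rank >= b.rank

def click_relevance_ranking(rows):
    """Turn pairwise rule outcomes into one linear ordering of URLs."""
    wins = {row.url: 0 for row in rows}
    for a, b in combinations(rows, 2):
        if prefers(a, b):
            wins[a.url] += 1
        elif prefers(b, a):
            wins[b.url] += 1
    by_url = {row.url: row for row in rows}
    # Order by rule wins, then by click-through rate, then by URL for stability.
    return sorted(wins, key=lambda u: (-wins[u], -ctr(by_url[u]), u))

rows = [LogRow("a.example", 1000, 100, 1),
        LogRow("b.example", 1000, 300, 2),
        LogRow("c.example", 1000, 50, 3)]
print(click_relevance_ranking(rows))
```

Here `b.example` moves to the front because it attracts more clicks from a lower position than `a.example` does from the top slot.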

    Comparative web search
    3.
    Granted patent
    Comparative web search (status: in force)

    Publication number: US07571162B2

    Publication date: 2009-08-04

    Application number: US11365961

    Filing date: 2006-03-01

    IPC classification: G06F17/30, G06F15/16

    Abstract: Methods and systems are provided for performing a comparative search. In one example, the comparative search is performed over a network, such as the web, or over a database. In one exemplary implementation, a user transmits a plurality of queries representing the topics that the user wants to compare, and a computing system automatically retrieves and ranks web pages or documents based on both their relevance to the queries and the comparative content they contain. In one such example, the comparative pages are displayed in pairs or in another form of grouping. In another example, comparative results having similar content may be clustered into meaningful themes.
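
A toy version of ranking by "relevance to the queries plus comparative content" might look like this. The scoring formula, the `COMPARATIVE_CUES` word list, and the `cue_bonus` parameter are assumptions made for illustration; the abstract does not say how either signal is computed.

```python
# Hypothetical cue words that suggest a page compares two things.
COMPARATIVE_CUES = {"vs", "versus", "compare", "compared", "comparison",
                    "better", "worse", "than"}

def tokens(text):
    """Lowercased words with surrounding punctuation stripped."""
    return [w.strip(".,;:!?()").lower() for w in text.split()]

def relevance(query, token_set):
    """Fraction of the query's words that appear on the page."""
    q = tokens(query)
    return sum(1 for w in q if w in token_set) / len(q) if q else 0.0

def comparative_score(queries, page_text, cue_bonus=0.5):
    """Coverage of the weakest-matched query, plus a bonus for cue words."""
    token_set = set(tokens(page_text))
    coverage = min(relevance(q, token_set) for q in queries)
    has_cue = bool(token_set & COMPARATIVE_CUES)
    return coverage + (cue_bonus if has_cue else 0.0)

def rank_pages(queries, pages):
    """Page ids ordered by descending comparative score, ties by id."""
    return sorted(pages, key=lambda pid: (-comparative_score(queries, pages[pid]), pid))

pages = {
    "p1": "Python tutorial for beginners",
    "p2": "Python vs Ruby: which is better",
    "p3": "Ruby and Python news",
}
print(rank_pages(["python", "ruby"], pages))
```

Taking the minimum over the queries rewards pages that cover every topic being compared, so a page about only one topic scores zero on coverage.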

    Comparative web search
    4.
    Patent application
    Comparative web search (status: in force)

    Publication number: US20070208701A1

    Publication date: 2007-09-06

    Application number: US11365961

    Filing date: 2006-03-01

    IPC classification: G06F17/30

    Abstract: Methods and systems are provided for performing a comparative search. In one example, the comparative search is performed over a network, such as the web, or over a database. In one exemplary implementation, a user transmits a plurality of queries representing the topics that the user wants to compare, and a computing system automatically retrieves and ranks web pages or documents based on both their relevance to the queries and the comparative content they contain. In one such example, the comparative pages are displayed in pairs or in another form of grouping. In another example, comparative results having similar content may be clustered into meaningful themes.

    Mining broad hidden query aspects from user search sessions
    5.
    Granted patent
    Mining broad hidden query aspects from user search sessions (status: in force)

    Publication number: US09305051B2

    Publication date: 2016-04-05

    Application number: US12332187

    Filing date: 2008-12-10

    IPC classification: G06F17/30

    Abstract: An optimization-based framework is used to extract broad query aspects from query reformulations performed by users in historical user session logs. Objective functions are optimized to yield the query aspects. At run time, the best broad but unspecified query aspects relevant to a user query are presented along with the results of that query.
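
As a rough illustration of mining aspects from reformulations, the sketch below treats terms that users append to a query as candidate aspects and picks a small set greedily. The session format, the "appended term" heuristic, and the maximum-coverage objective are all assumptions; the abstract does not disclose the actual objective functions.

```python
from collections import defaultdict

def candidate_aspects(sessions, query):
    """Map each appended term to the sessions in which a user reformulated
    `query` by adding that term (hypothetical notion of an aspect)."""
    base = query.lower().split()
    support = defaultdict(set)
    for index, session in enumerate(sessions):
        for previous, following in zip(session, session[1:]):
            prev_words = previous.lower().split()
            next_words = following.lower().split()
            extends = next_words[:len(base)] == base and len(next_words) > len(base)
            if prev_words == base and extends:
                for term in next_words[len(base):]:
                    support[term].add(index)
    return support

def select_aspects(support, k):
    """Greedy maximum coverage: repeatedly take the aspect that covers the
    most sessions not yet covered by earlier picks."""
    covered, chosen = set(), []
    for _ in range(k):
        best = max(sorted(support),
                   key=lambda term: len(support[term] - covered),
                   default=None)
        if best is None or not support[best] - covered:
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= support[best]
    return chosen

sessions = [
    ["jaguar", "jaguar car"],
    ["jaguar", "jaguar animal"],
    ["jaguar", "jaguar car price"],
    ["jaguar", "jaguar car"],
    ["python", "python tutorial"],
]
support = candidate_aspects(sessions, "jaguar")
print(select_aspects(support, k=2))
```

The coverage objective favors aspects that explain many distinct sessions, which is one plausible reading of "broad"; other objectives would select differently.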

    Web searching
    6.
    Granted patent
    Web searching (status: in force)

    Publication number: US08768919B2

    Publication date: 2014-07-01

    Application number: US13599543

    Filing date: 2012-08-30

    IPC classification: G06F7/00, G06F17/30

    CPC classification: G06F17/30864

    Abstract: A human (hand-labeled) ranking of URL results for a search query is compared against actual click data for the respective query/URL pairs (e.g., which URLs users actually clicked when the URLs were presented for the search query in the real world). The human ranking or ordering of the URL results (e.g., the pre-existing relevance ranking) for the query can then be adjusted, if necessary, based upon the real-world click data (e.g., the click relevance ranking). The modified pre-existing relevance ranking can be used in providing future search results.
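
The adjustment step could be sketched as below, assuming click data has already been reduced to a click-through rate per URL. The adjacent-swap strategy and the `min_gap` threshold are hypothetical choices; the abstract only says the human ranking is adjusted "if necessary".

```python
def adjust_ranking(human_ranking, click_rate, min_gap=0.1):
    """Swap adjacent URLs only when click data disagrees strongly.

    human_ranking: URLs in hand-labeled order, best first.
    click_rate: URL -> observed click-through rate (assumed precomputed).
    min_gap: how much higher the lower-ranked URL's rate must be to swap.
    """
    if min_gap <= 0:
        raise ValueError("min_gap must be positive so the loop terminates")
    ranking = list(human_ranking)
    changed = True
    while changed:
        changed = False
        for i in range(len(ranking) - 1):
            upper, lower = ranking[i], ranking[i + 1]
            if click_rate.get(lower, 0.0) - click_rate.get(upper, 0.0) >= min_gap:
                ranking[i], ranking[i + 1] = lower, upper
                changed = True
    return ranking

human = ["a.example", "b.example", "c.example"]
rates = {"a.example": 0.10, "b.example": 0.35, "c.example": 0.12}
print(adjust_ranking(human, rates))
```

A positive threshold keeps small, noisy differences in click rate from overriding the hand labels, so only clear disagreements change the order.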

    Related news articles
    7.
    Granted patent
    Related news articles (status: in force)

    Publication number: US08713028B2

    Publication date: 2014-04-29

    Application number: US13298932

    Filing date: 2011-11-17

    IPC classification: G06F17/30

    Abstract: Methods, systems, and computer programs are presented for providing internet content, such as related news articles. One method includes an operation for defining a plurality of candidates based on a seed. For each candidate, scores are calculated for relevance, novelty, connection clarity, and transition smoothness. The score for connection clarity is based on a relevance score of the intersection between the words in the seed and the words in the candidate. The score for transition smoothness measures the interest in reading the candidate when transitioning from the seed to the candidate. For each candidate, a relatedness score is calculated from the scores for relevance, novelty, connection clarity, and transition smoothness. At least one of the candidates is then selected, based on the relatedness scores, for presentation to the user.
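
The combination of the four scores can be illustrated with the sketch below. Only connection clarity is computed here, as a weighted sum over the words shared by seed and candidate; the `term_weight` table, the weighted-sum combination, and the precomputed relevance, novelty, and smoothness values are all assumptions, since the abstract does not give the formulas.

```python
def words(text):
    """Set of lowercased words in a text."""
    return set(text.lower().split())

def connection_clarity(seed, candidate, term_weight):
    """Score the words shared by seed and candidate (hypothetical weighting)."""
    shared = words(seed) & words(candidate)
    return sum(term_weight.get(word, 0.0) for word in shared)

def relatedness(seed, candidate, term_weight, weights):
    """Weighted sum of the four component scores named in the abstract."""
    components = {
        "relevance": candidate["relevance"],      # assumed precomputed
        "novelty": candidate["novelty"],          # assumed precomputed
        "smoothness": candidate["smoothness"],    # assumed precomputed
        "clarity": connection_clarity(seed, candidate["text"], term_weight),
    }
    return sum(weights[name] * value for name, value in components.items())

def pick_related(seed, candidates, term_weight, weights):
    """Return the id of the candidate with the highest relatedness score."""
    return max(sorted(candidates),
               key=lambda cid: relatedness(seed, candidates[cid], term_weight, weights))

seed = "earthquake hits coastal city"
term_weight = {"earthquake": 2.0, "city": 0.5}
weights = {"relevance": 1.0, "novelty": 1.0, "smoothness": 1.0, "clarity": 1.0}
candidates = {
    "c1": {"text": "earthquake relief effort begins",
           "relevance": 0.8, "novelty": 0.6, "smoothness": 0.7},
    "c2": {"text": "city council election results",
           "relevance": 0.3, "novelty": 0.9, "smoothness": 0.2},
}
print(pick_related(seed, candidates, term_weight, weights))
```

In this toy data `c1` wins because it shares the heavily weighted word "earthquake" with the seed, even though `c2` is more novel.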

    WEB SEARCHING
    8.
    Patent application
    WEB SEARCHING (status: lapsed)

    Publication number: US20110016116A1

    Publication date: 2011-01-20

    Application number: US12893107

    Filing date: 2010-09-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A human (hand-labeled) ranking of URL results for a search query is compared against actual click data for the respective query/URL pairs (e.g., which URLs users actually clicked when the URLs were presented for the search query in the real world). The human ranking or ordering of the URL results (e.g., the pre-existing relevance ranking) for the query can then be adjusted, if necessary, based upon the real-world click data (e.g., the click relevance ranking). The modified pre-existing relevance ranking can be used in providing future search results.

    WEB SEARCHING
    9.
    Patent application
    WEB SEARCHING (status: lapsed)

    Publication number: US20090248657A1

    Publication date: 2009-10-01

    Application number: US12056302

    Filing date: 2008-03-27

    IPC classification: G06F7/06, G06F17/30

    CPC classification: G06F17/30864

    Abstract: Mislabeled URLs are identified and corrected based upon a click relevance ranking computed from user data comprising user click information. The click relevance ranking is formed by applying a set of relevance ordering rules to user log data aggregated by query and URL, and by mapping the results of the relevance ordering rules into a linear ordering. For a given query, the aggregated user log data comprises a relative total number of impressions, a relative total number of clicks received, and the rank associated with the query/URL pair at the time those impressions and clicks were recorded. The click relevance ranking is used to identify and correct mislabeled query/URL pairs of other rankings according to a number of disclosed methods.

    SEARCHING HETEROGENEOUS INTERRELATED ENTITIES
    10.
    Patent application
    SEARCHING HETEROGENEOUS INTERRELATED ENTITIES (status: in force)

    Publication number: US20080215565A1

    Publication date: 2008-09-04

    Application number: US11853613

    Filing date: 2007-09-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: Systems and methods for searching heterogeneous interrelated entities in response to a heterogeneous entities search query are disclosed herein. A user may enter the heterogeneous entities search query. The search retrieves and returns multiple types of heterogeneous entities. The heterogeneous interrelated entities are searched in a unified matrix that represents relationships among the heterogeneous entities. The retrieved entities may have one or more entity types. The set of retrieved interrelated entities may also be ranked based on the similarity between each entity and the search query. Feedback may also be incorporated into the system to improve search accuracy.
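
A small sketch of searching a unified matrix follows. The entity list, the relationship weights, the indicator-style query vector, and the use of cosine similarity are all illustrative assumptions; the abstract does not specify how the matrix is built or how similarity is measured, and the feedback step is not modeled.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def search(entities, matrix, query_vector, top_k=3):
    """Rank entities of every type by similarity of their matrix row to the query."""
    scored = [(cosine(row, query_vector), entity)
              for entity, row in zip(entities, matrix)]
    scored = [(score, entity) for score, entity in scored if score > 0]
    scored.sort(key=lambda pair: -pair[0])
    return [entity for _, entity in scored[:top_k]]

# Rows and columns share one index over all entity types (hypothetical data).
entities = [("author", "Ann"), ("paper", "P1"), ("paper", "P2"), ("venue", "V1")]
matrix = [
    [2, 1, 1, 0],  # Ann wrote P1 and P2
    [1, 2, 0, 1],  # P1 was written by Ann and appeared at V1
    [1, 0, 2, 0],  # P2 was written by Ann
    [0, 1, 0, 2],  # V1 published P1
]
query_vector = [0, 1, 0, 0]  # assumption: the query text matched paper P1
print(search(entities, matrix, query_vector))
```

Because every entity type lives in the same matrix, a single query returns a paper, a venue, and an author in one ranked list.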
