-
Publication No.: US20100318546A1
Publication date: 2010-12-16
Application No.: US12485058
Application date: 2009-06-16
IPC classification: G06F17/30
CPC classification: G06F17/30867
Abstract: Described is a technique for releasing output data representing a search log, in which the data is suitable for most data mining/analysis applications but is safe to publish because it preserves user privacy. The search log is processed such that a query is included only if a sufficient count of that query is present; noise may be added. User contributions that are considered may be limited to a maximum number of queries. The output may indicate how often each query appeared (possibly plus noise). Other output may comprise a query-action graph, a query-inaction graph and/or a query-reformulation graph, with nodes representing queries and nodes representing actions, inactions or reformulations (e.g., clicked URLs, skipped URLs, or selected related queries), and edges between nodes representing action, skip or selection counts (possibly plus noise). The output may correspond to the top results/related queries returned from a search.
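The processing described in this abstract (cap each user's contribution, count queries, add noise, publish only sufficiently frequent queries) can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the parameters `max_per_user`, `threshold` and `noise_scale`, the choice of Laplace noise, and the use of two independent noise draws (one for the inclusion test, one for the published count) are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def laplace(rng, scale):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale <= 0:
        return 0.0  # no noise requested
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_query_counts(log, max_per_user, threshold, noise_scale, seed=None):
    """Return {query: noisy count} for queries whose noisy count reaches threshold.

    log is an iterable of (user_id, query) pairs. All parameter names are
    hypothetical; the abstract does not specify them.
    """
    rng = random.Random(seed)
    kept = defaultdict(int)   # queries already counted per user
    counts = Counter()        # per-query counts after capping
    for user, query in log:
        if kept[user] < max_per_user:
            kept[user] += 1
            counts[query] += 1

    released = {}
    for query, count in counts.items():
        # Inclusion test on a noisy count (assumed design choice).
        if count + laplace(rng, noise_scale) >= threshold:
            # Published count uses a fresh, independent noise draw.
            released[query] = count + laplace(rng, noise_scale)
    return released
```

With `noise_scale=0.0` the function reduces to plain thresholding on capped counts, which makes it easy to check by hand; any real deployment would have to choose the cap, threshold and noise scale according to the privacy guarantee it targets.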
-
Publication No.: US08601024B2
Publication date: 2013-12-03
Application No.: US12485058
Application date: 2009-06-16
IPC classification: G06F17/30
CPC classification: G06F17/30867
Abstract: Described is a technique for releasing output data representing a search log, in which the data is suitable for most data mining/analysis applications but is safe to publish because it preserves user privacy. The search log is processed such that a query is included only if a sufficient count of that query is present; noise may be added. User contributions that are considered may be limited to a maximum number of queries. The output may indicate how often each query appeared (possibly plus noise). Other output may comprise a query-action graph, a query-inaction graph and/or a query-reformulation graph, with nodes representing queries and nodes representing actions, inactions or reformulations (e.g., clicked URLs, skipped URLs, or selected related queries), and edges between nodes representing action, skip or selection counts (possibly plus noise). The output may correspond to the top results/related queries returned from a search.
-
Publication No.: US08788498B2
Publication date: 2014-07-22
Application No.: US12484255
Application date: 2009-06-15
IPC classification: G06F17/30
CPC classification: G06Q10/10
Abstract: Described is a technology for obtaining labeled sample data. Labeling guidelines are converted into binary yes/no questions regarding data samples. The questions and data samples are provided to judges, who then answer the questions for each sample. The answers are input to a label assignment algorithm that associates a label with each sample based upon the answers. If the guidelines are modified and previous answers to the binary questions are maintained, at least some of the previous answers may be used in re-labeling the samples in view of the modification.
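The reuse of stored yes/no answers after a guideline change can be sketched in Python as follows. This is only an illustration under assumptions: the abstract does not say how the label assignment algorithm works, so the ordered rule list, the question identifiers (`is_spam`, `mentions_product`, `has_opinion`) and both function names are hypothetical.

```python
def assign_label(answers, rules, default=None):
    """Return the label of the first rule whose conditions all match.

    answers: {question_id: bool} collected from judges for one sample.
    rules:   ordered list of (label, {question_id: expected_bool}).
    A question with no stored answer never satisfies a condition.
    """
    for label, conditions in rules:
        if all(answers.get(q) == expected for q, expected in conditions.items()):
            return label
    return default


def missing_questions(answers, rules):
    """Questions the rules refer to that have no stored answer yet."""
    needed = {q for _, conditions in rules for q in conditions}
    return needed - set(answers)


# Stored answers for one sample under the original guidelines.
answers = {"is_spam": False, "mentions_product": True}
rules_v1 = [("spam", {"is_spam": True}),
            ("commercial", {"mentions_product": True})]

# Modified guidelines add a "review" label that needs one new question.
rules_v2 = [("spam", {"is_spam": True}),
            ("review", {"mentions_product": True, "has_opinion": True}),
            ("commercial", {"mentions_product": True})]

label_v1 = assign_label(answers, rules_v1, default="other")
to_ask = missing_questions(answers, rules_v2)   # only the new question
answers["has_opinion"] = True                   # judge answers it
label_v2 = assign_label(answers, rules_v2, default="other")
```

In this sketch only the newly introduced question has to go back to the judges; the two earlier answers are reused unchanged when the sample is re-labeled.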
-
Publication No.: US08219539B2
Publication date: 2012-07-10
Application No.: US12419363
Application date: 2009-04-07
Applicant: Alan Dale Halverson, Krishnaram Kenthapadi, Nina Mishra, Aleksandrs Slivkins, Umar Ali Syed
Inventor: Alan Dale Halverson, Krishnaram Kenthapadi, Nina Mishra, Aleksandrs Slivkins, Umar Ali Syed
CPC classification: G06F17/30864
Abstract: Techniques and systems are disclosed for returning temporally-aware results from an Internet-based search query. To determine whether a query is temporally based, one or more query features are collected and input into a trained classifier, yielding a temporal classification for the query. Further, if a query is classified as temporal, the query results are shifted by determining an alternate set of results for the query and returning one or more alternate results to one or more users. Based on user interactions with the one or more alternate results, the classifier can be updated, for example, by changing the query to a non-temporal query if the user interactions identify it as such.
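The classify, shift and update loop in this abstract can be sketched in Python. Everything concrete here is an assumption for illustration: the abstract names neither the query features nor the classifier type, so the logistic scorer, the feature name `volume_spike`, the click-through cutoff `min_ctr` and all function names are hypothetical.

```python
import math


class TemporalQueryClassifier:
    """Toy stand-in for a trained classifier (weights are assumed given)."""

    def __init__(self, weights, bias=0.0, threshold=0.5):
        self.weights = weights
        self.bias = bias
        self.threshold = threshold
        self.overrides = {}  # query -> forced classification from feedback

    def score(self, features):
        """Logistic score in (0, 1) from a {feature_name: value} dict."""
        z = self.bias + sum(self.weights.get(k, 0.0) * v
                            for k, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def is_temporal(self, query, features):
        if query in self.overrides:
            return self.overrides[query]
        return self.score(features) >= self.threshold

    def record_feedback(self, query, alt_clicks, alt_impressions, min_ctr=0.1):
        """Mark the query non-temporal if users ignore the alternate results."""
        if alt_impressions > 0 and alt_clicks / alt_impressions < min_ctr:
            self.overrides[query] = False


def results_for(query, features, clf, default_results, alternate_results, k=1):
    """Put up to k alternate results first when the query looks temporal."""
    if clf.is_temporal(query, features):
        return alternate_results[:k] + default_results
    return default_results
```

A per-query override is the simplest way to express "changing the query to a non-temporal query"; a fuller system might instead retrain the classifier on the interaction data.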
-
Publication No.: US20100318539A1
Publication date: 2010-12-16
Application No.: US12484255
Application date: 2009-06-15
IPC classification: G06F17/30
CPC classification: G06Q10/10
Abstract: Described is a technology for obtaining labeled sample data. Labeling guidelines are converted into binary yes/no questions regarding data samples. The questions and data samples are provided to judges, who then answer the questions for each sample. The answers are input to a label assignment algorithm that associates a label with each sample based upon the answers. If the guidelines are modified and previous answers to the binary questions are maintained, at least some of the previous answers may be used in re-labeling the samples in view of the modification.
-
Publication No.: US20100257164A1
Publication date: 2010-10-07
Application No.: US12419363
Application date: 2009-04-07
Applicant: Alan Dale Halverson, Krishnaram Kenthapadi, Nina Mishra, Aleksandrs Slivkins, Umar Ali Syed
Inventor: Alan Dale Halverson, Krishnaram Kenthapadi, Nina Mishra, Aleksandrs Slivkins, Umar Ali Syed
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: Techniques and systems are disclosed for returning temporally-aware results from an Internet-based search query. To determine whether a query is temporally based, one or more query features are collected and input into a trained classifier, yielding a temporal classification for the query. Further, if a query is classified as temporal, the query results are shifted by determining an alternate set of results for the query and returning one or more alternate results to one or more users. Based on user interactions with the one or more alternate results, the classifier can be updated, for example, by changing the query to a non-temporal query if the user interactions identify it as such.