-
Publication No.: US20120226661A1
Publication date: 2012-09-06
Application No.: US13040261
Filing date: 2011-03-03
Applicant: Krishnaram N. G. Kenthapadi, Shuai Ding, Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas
Inventor: Krishnaram N. G. Kenthapadi, Shuai Ding, Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas
IPC classification: G06F17/30
CPC classification: G06F17/30017, G06F17/30864
Abstract: Documents are replicated among servers comprising a search engine based on the value of each document by approximating its value as one of the top search results for one or more exemplary queries. Documents are allocated among servers comprising a search engine by calculating a relevance value for each document and then distributing the documents evenly to the servers. A subset of servers is selected from among a plurality of servers comprising a search engine using term-based, server-specific histograms reflecting the number of instances of the term in each document allocated to each server, and servers are then selected to service a query based on the documents on those servers.
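The allocation step in this abstract (score each document, then spread documents evenly across servers) can be illustrated with a minimal sketch. The abstract does not say how "evenly" is achieved; the round-robin deal in descending relevance order, the function name `allocate_evenly`, and the input shape are all assumptions made for illustration.

```python
def allocate_evenly(doc_scores, num_servers):
    """Deal documents to servers round-robin, highest relevance first.

    doc_scores: {doc_id: relevance value} (hypothetical input shape).
    Returns a list with one list of doc_ids per server.
    """
    # Sort by descending relevance; break ties by doc_id for determinism.
    ranked = sorted(doc_scores, key=lambda d: (-doc_scores[d], d))
    servers = [[] for _ in range(num_servers)]
    for i, doc in enumerate(ranked):
        servers[i % num_servers].append(doc)
    return servers
```

Dealing in rank order means each server receives a similar number of documents and a similar share of high-value documents, which is one plausible reading of "evenly".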
-
Publication No.: US08458130B2
Publication date: 2013-06-04
Application No.: US13040261
Filing date: 2011-03-03
Applicant: Krishnaram N. G. Kenthapadi, Shuai Ding, Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas
Inventor: Krishnaram N. G. Kenthapadi, Shuai Ding, Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas
IPC classification: G06F17/30
CPC classification: G06F17/30017, G06F17/30864
Abstract: Documents are replicated among servers comprising a search engine based on the value of each document by approximating its value as one of the top search results for one or more exemplary queries. Documents are allocated among servers comprising a search engine by calculating a relevance value for each document and then distributing the documents evenly to the servers. A subset of servers is selected from among a plurality of servers comprising a search engine using term-based, server-specific histograms reflecting the number of instances of the term in each document allocated to each server, and servers are then selected to service a query based on the documents on those servers.
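The server-selection step in this abstract relies on term-based, server-specific histograms. The sketch below is one possible reading, not the patented method: each histogram maps a term frequency to the number of documents on that server with that frequency, and servers are ranked by the total occurrences of the query terms. The names `build_histograms` and `select_servers`, the scoring rule, and the parameter `k` are assumptions.

```python
from collections import Counter, defaultdict

def build_histograms(server_docs):
    """server_docs: {server: [token list per document]}.

    Returns {server: {term: Counter(term_frequency -> document count)}}.
    """
    hists = {}
    for server, docs in server_docs.items():
        per_term = defaultdict(Counter)
        for tokens in docs:
            for term, tf in Counter(tokens).items():
                per_term[term][tf] += 1
        hists[server] = per_term
    return hists

def select_servers(hists, query_terms, k):
    """Pick the k servers whose histograms show the most query-term occurrences."""
    def score(server):
        return sum(
            tf * n
            for term in query_terms
            for tf, n in hists[server].get(term, {}).items()
        )
    # Highest score first; ties broken by server name.
    return sorted(hists, key=lambda s: (-score(s), s))[:k]
```

Because only the compact histograms are consulted, a query can be routed to a subset of servers without scanning the documents themselves.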
-
Publication No.: US20090313286A1
Publication date: 2009-12-17
Application No.: US12140272
Filing date: 2008-06-17
Applicant: Nina Mishra, Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, Krishnaram N. G. Kenthapadi, Rina Panigrahy, John C. Shafer, Panayiotis Tsaparas
Inventor: Nina Mishra, Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, Krishnaram N. G. Kenthapadi, Rina Panigrahy, John C. Shafer, Panayiotis Tsaparas
CPC classification: G06F16/9535
Abstract: Data from a click log may be used to generate training data for a search engine. The pages clicked as well as the pages skipped by a user may be used to assess the relevance of a page to a query. Labels for training data may be generated based on data from the click log. The labels may pertain to the relevance of a page to a query.
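A minimal sketch of deriving relevance labels from clicks and skips follows. The abstract does not define a skip or a labeling rule, so this sketch assumes a page is "skipped" when it is ranked above the lowest clicked result but not clicked, and labels a (query, page) pair by its click rate against a threshold. The function name, the input tuple shape, and the threshold are hypothetical.

```python
from collections import defaultdict

def label_from_click_log(impressions, threshold=0.5):
    """impressions: iterable of (query, ranked_pages, clicked_pages).

    Returns {(query, page): "relevant" | "not_relevant"} for every page
    that was clicked or skipped at least once.
    """
    clicks = defaultdict(int)
    skips = defaultdict(int)
    for query, results, clicked in impressions:
        positions = [i for i, page in enumerate(results) if page in clicked]
        if not positions:
            continue  # no click, so no evidence of what was examined
        last = max(positions)
        # Pages at or above the lowest click are assumed to have been seen.
        for page in results[: last + 1]:
            if page in clicked:
                clicks[(query, page)] += 1
            else:
                skips[(query, page)] += 1
    labels = {}
    for key in set(clicks) | set(skips):
        rate = clicks[key] / (clicks[key] + skips[key])
        labels[key] = "relevant" if rate >= threshold else "not_relevant"
    return labels
```

Pages ranked below the last click receive no label, since the log gives no sign that the user looked at them.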
-
Publication No.: US08612432B2
Publication date: 2013-12-17
Application No.: US12816389
Filing date: 2010-06-16
CPC classification: G06F17/30979
Abstract: A tree structure has a node associated with each category of a hierarchy of item categories. Child nodes of the tree are associated with sub-categories of the categories associated with parent nodes. Training data including received queries and indicators of a selected item category for each received query is combined with the tree structure by associating each query with the node corresponding to the selected category of the query. When a query is received, a classifier is applied to the nodes to generate a probability that the query is intended to match an item of the category associated with the node. The classifier is applied until the probability is below a threshold. One or more categories associated with the nodes that are closest to the intent of the received query are selected and indicators of items of those categories that match the received query are output.
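The threshold-bounded descent described in this abstract can be sketched as a walk down the category tree. The classifier is left as a caller-supplied function because the abstract does not specify it; `select_categories`, the `tree` dictionary shape, and the default threshold are assumptions for illustration.

```python
def select_categories(tree, root, query, classify, threshold=0.5):
    """Descend the category tree while the classifier stays above threshold.

    tree: {category: [sub-categories]}; leaves may be absent from the dict.
    classify: hypothetical callable (query, category) -> probability.
    Returns the deepest categories reached, sorted by name.
    """
    selected = []
    stack = [root]
    while stack:
        node = stack.pop()
        passing = [
            child for child in tree.get(node, [])
            if classify(query, child) >= threshold
        ]
        if passing:
            stack.extend(passing)   # keep descending along confident branches
        else:
            selected.append(node)   # no child is confident enough; stop here
    return sorted(selected)
```

Stopping at the last node whose probability clears the threshold returns the most specific category the classifier is still confident about, rather than forcing a choice among leaves.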
-
Publication No.: US20110314012A1
Publication date: 2011-12-22
Application No.: US12816389
Filing date: 2010-06-16
IPC classification: G06F17/30
CPC classification: G06F17/30979
Abstract: A tree structure has a node associated with each category of a hierarchy of item categories. Child nodes of the tree are associated with sub-categories of the categories associated with parent nodes. Training data including received queries and indicators of a selected item category for each received query is combined with the tree structure by associating each query with the node corresponding to the selected category of the query. When a query is received, a classifier is applied to the nodes to generate a probability that the query is intended to match an item of the category associated with the node. The classifier is applied until the probability is below a threshold. One or more categories associated with the nodes that are closest to the intent of the received query are selected and indicators of items of those categories that match the received query are output.
-
Publication No.: US20110145227A1
Publication date: 2011-06-16
Application No.: US12639021
Filing date: 2009-12-16
IPC classification: G06F17/30
CPC classification: G06F17/30867
Abstract: A query may be received at a computing device through a network. One or more attribute values that are preferences for a subset of the one or more terms of the query may be identified by the computing device. One or more products or services having associated attributes that have values that match a subset of the identified attribute values may be identified by the computing device, and a subset of the identified products or services may be presented by the computing device through the network. Implementations may also identify latent preferences, that is, preferences that are found for a query even where such a preference is not explicitly part of a term or token of the query.
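The flow in this abstract (map query terms to preferred attribute values, then match products against them) can be sketched as below. How preferences are learned is not stated in the abstract, so the sketch assumes a precomputed lookup table `term_preferences`; the table, both function names, and the ranking by number of matched attributes are hypothetical. A latent preference appears as a table entry whose attribute value is not itself a word in the query.

```python
def identify_preferences(query, term_preferences):
    """Collect attribute-value preferences implied by the query's terms."""
    prefs = {}
    for term in query.lower().split():
        prefs.update(term_preferences.get(term, {}))
    return prefs

def match_products(products, prefs, min_matches=1):
    """Return product names matching at least min_matches preferred values,
    best match first (ties broken by name)."""
    scored = []
    for name, attrs in products.items():
        hits = sum(1 for key, value in prefs.items() if attrs.get(key) == value)
        if hits >= min_matches:
            scored.append((-hits, name))
    return [name for _, name in sorted(scored)]
```

For example, a table entry mapping the term "hiking" to `{"waterproof": True}` would let a query that never mentions waterproofing still favor waterproof products.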
-
Publication No.: US20090306996A1
Publication date: 2009-12-10
Application No.: US12133370
Filing date: 2008-06-05
IPC classification: G06Q99/00
Abstract: A social network may be used to determine a rating of a user with no prior history. The ratings for unrated nodes may be inferred from the existing ratings of users associated with the unrated node in either or both the underlying social network or other social networks. Additionally, in some implementations, the effect of the rating of a rated node on an unrated node diminishes as the strength of their relationship decreases. In some cases, a social network may be modeled as an electrical network: ratings may be modeled as voltages on the nodes of the social network, relationships in the social network may be modeled as connections in the electrical network, and in some cases the strength of relationships may be modeled as conductance of the connections. Ratings for nodes may be determined using Kirchhoff's Law and in some cases by solving a set of linear equations or by propagating positive and negative ratings using a random walk with absorbing states.
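In the electrical-network model from this abstract, Kirchhoff's current law at an unrated node means its rating (voltage) equals the conductance-weighted average of its neighbours' ratings. The sketch below solves the resulting linear system by simple iteration (Gauss-Seidel) rather than a direct solver; the function name, the input shapes, and the fixed iteration count are assumptions for illustration.

```python
def infer_ratings(conductance, known, iterations=200):
    """Infer ratings for unrated nodes from rated neighbours.

    conductance: {node: {neighbour: relationship strength}}, symmetric,
                 with every node present as a key.
    known: {node: rating} for nodes that already have a rating.
    """
    ratings = {node: known.get(node, 0.0) for node in conductance}
    for _ in range(iterations):
        for node, neighbours in conductance.items():
            if node in known or not neighbours:
                continue  # rated nodes stay fixed, like voltage sources
            total = sum(neighbours.values())
            # Kirchhoff's current law: net current into the node is zero.
            ratings[node] = sum(
                w * ratings[v] for v, w in neighbours.items()
            ) / total
    return ratings
```

With a rated node at 1.0 connected at strength 3 and another at 0.0 connected at strength 1, the unrated node in between settles at 0.75, so the stronger relationship has the larger influence.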
-
Publication No.: US08612472B2
Publication date: 2013-12-17
Application No.: US12639021
Filing date: 2009-12-16
IPC classification: G06F17/30
CPC classification: G06F17/30867
Abstract: A query may be received at a computing device through a network. One or more attribute values that are preferences for a subset of the one or more terms of the query may be identified by the computing device. One or more products or services having associated attributes that have values that match a subset of the identified attribute values may be identified by the computing device, and a subset of the identified products or services may be presented by the computing device through the network. Implementations may also identify latent preferences, that is, preferences that are found for a query even where such a preference is not explicitly part of a term or token of the query.