Publication No.: US20110040769A1
Publication date: 2011-02-17
Application No.: US12541063
Filing date: 2009-08-13
Applicant: Huihsin Tseng, Longbin Chen, Yumao Lu, Fachun Peng
Inventor: Huihsin Tseng, Longbin Chen, Yumao Lu, Fachun Peng
IPC classification: G06F17/30
CPC classification: G06F16/951
Abstract: In one embodiment, one or more pairs of a search query and a clicked Uniform Resource Locator (URL) are accessed. For each pair, the search query is segmented into one or more query segments and the clicked URL into one or more URL segments. One or more query-URL n-grams are then constructed, each comprising a query part with at least one of the query segments and a URL part with at least one of the URL segments. Finally, one or more association scores are calculated, each for one of the query-URL n-grams. Each score represents a similarity between the query part and the URL part of its n-gram and is based on a first frequency of the query part and the URL part, a second frequency of the query part, and a third frequency of the URL part.
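The steps in the abstract can be illustrated with a minimal sketch. The abstract does not say how segmentation is done or how the three frequencies are combined, so everything concrete here is an assumption made for illustration: the helper names `segment_query`, `segment_url`, and `association_scores` are hypothetical, segmentation is simple tokenization, each n-gram is limited to one query segment and one URL segment, and the score is pointwise mutual information (PMI), which happens to use exactly those three frequencies.

```python
import math
import re
from collections import Counter
from itertools import product


def segment_query(query):
    # Assumption: whitespace tokenization stands in for query segmentation.
    return query.lower().split()


def segment_url(url):
    # Assumption: drop the scheme, then split on non-alphanumeric characters.
    url = re.sub(r"^[a-z]+://", "", url.lower())
    return [seg for seg in re.split(r"[^a-z0-9]+", url) if seg]


def association_scores(pairs):
    """Score (query segment, URL segment) pairs with PMI.

    Assumption: PMI is one plausible score built from the joint frequency,
    the query-part frequency, and the URL-part frequency; the abstract does
    not name a formula.
    """
    joint = Counter()   # first frequency: query part and URL part co-occur
    q_freq = Counter()  # second frequency: query part
    u_freq = Counter()  # third frequency: URL part
    for query, url in pairs:
        q_segs = set(segment_query(query))
        u_segs = set(segment_url(url))
        q_freq.update(q_segs)
        u_freq.update(u_segs)
        joint.update(product(q_segs, u_segs))
    n = len(pairs)
    # PMI = log( P(q, u) / (P(q) * P(u)) ) with probabilities count / n.
    return {
        (q, u): math.log((count * n) / (q_freq[q] * u_freq[u]))
        for (q, u), count in joint.items()
    }


# Made-up click log, for illustration only.
pairs = [
    ("yahoo mail", "http://mail.yahoo.com"),
    ("yahoo news", "http://news.yahoo.com"),
    ("cnn news", "http://www.cnn.com/news"),
]
scores = association_scores(pairs)
```

With this toy data, a segment that appears in every clicked URL (such as `com`) scores zero against `yahoo`, while `mail` in the query scores higher against `mail` in the URL, which is the kind of contrast an association score is meant to surface.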