Abstract:
User-generated tags from viewing web-based content are collected over a predetermined period of time. A subset of distinct (unique) tags is identified from among the collected tags. A z-score, which measures the statistical significance of a tag, is calculated for each identified distinct tag. The distinct tags are then sorted by their corresponding z-scores. Distinct tags whose z-score is lower than a predetermined threshold are rejected, and the remaining distinct tags, whose z-scores are higher than the threshold, are used to infer the user's interests. Inferring a user's interests from the remaining distinct tags may thus benefit web-based applications by leveraging user-generated content tags and keywords to predict users' interests with a high degree of accuracy.
Abstract:
Methods and systems are provided for inferring a user's interests from user-generated tags of web-based content. In accordance with the invention, user-generated tags from viewing web-based content are collected over a predetermined period of time. A subset of distinct (unique) tags is identified from among the collected tags. A z-score, which measures the statistical significance of a tag, is calculated for each identified distinct tag. The distinct tags are then sorted by their corresponding z-scores. Distinct tags whose z-score is lower than a predetermined threshold are rejected, and the remaining distinct tags, whose z-scores are higher than the threshold, are used to infer the user's interests. Inferring a user's interests from the remaining distinct tags may thus benefit web-based applications by leveraging user-generated content tags and keywords to predict users' interests with a high degree of accuracy.
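The collect, score, sort, and reject steps described above can be sketched as follows. The abstract does not say how the z-score is computed, so this sketch assumes it is each distinct tag's frequency standardized against the mean and standard deviation of all distinct-tag frequencies. The function name `significant_tags` and the default threshold are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def significant_tags(collected_tags, threshold=1.0):
    """Return (tag, z) pairs with z above the threshold, highest z first.

    Assumption: z = (tag frequency - mean frequency) / std of frequencies,
    computed over the distinct tags. The source does not define the formula.
    """
    counts = Counter(collected_tags)        # distinct tags with their frequencies
    freqs = list(counts.values())
    mu, sigma = mean(freqs), pstdev(freqs)
    if sigma == 0:                          # every tag equally frequent: none stands out
        return []
    scored = [(tag, (n - mu) / sigma) for tag, n in counts.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)   # sort by z-score
    return [(tag, z) for tag, z in scored if z > threshold]


# Tags gathered over some collection window (made-up data).
tags = ["jazz"] * 6 + ["golf", "news", "cars"]
kept = significant_tags(tags)
```

Here only `jazz` survives the threshold, since the other three tags fall below the mean frequency.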
Abstract:
In one embodiment, access one or more query chains and the actual click-through information associated with each query chain, wherein each query chain comprises two or more search queries, {q1, ..., qn}, that are recency-sensitive, relate to the same subject matter, and are issued to a search engine sequentially; and smooth each query chain using the actual click-through information associated with it. Smoothing a query chain comprises, for each search query qj in the query chain, where 2≦j≦n: if one of the network resources identified for qj has actually been clicked in connection with qj by the corresponding network user, then presume that the network resource has been clicked in connection with one or more search queries qk in the query chain, where 1≦k
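A minimal sketch of the smoothing step follows. The abstract is cut off before it gives the upper bound on k, so the sketch assumes that k ranges over the queries issued before qj, and that a click is propagated to all of them. Both points are assumptions, and the name `smooth_chain` is hypothetical.

```python
def smooth_chain(clicks):
    """Propagate actual clicks backwards along one query chain.

    clicks[j] is the set of resources actually clicked for the j-th query
    (0-based). Assumption: a click on q_j is presumed for every earlier
    query q_k, k < j; the source text is truncated before stating the bound.
    """
    smoothed = [set(c) for c in clicks]     # start from the actual clicks
    for j in range(1, len(clicks)):         # q2..qn in the abstract's 1-based terms
        for resource in clicks[j]:
            for k in range(j):              # assumed range: queries before q_j
                smoothed[k].add(resource)
    return smoothed


# Three sequential queries; nothing clicked for the first one.
presumed = smooth_chain([set(), {"a"}, {"b"}])
```

The input list is left unchanged, so the actual and presumed click sets can be compared afterwards.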
Abstract:
In one embodiment, access one or more query-resource pairs, together with a resource-view count and a resource-click count associated with each pair, wherein each query-resource pair comprises one of one or more search queries and one of one or more network resources, the search query is recency-sensitive with respect to a particular time period, and the network resource is identified for the search query; and construct one or more first click features using the resource-view counts and the resource-click counts associated with the query-resource pairs. Constructing one of the first click features for a query-resource pair comprises determining an only-resource-click count associated with the pair, and calculating the ratio of the only-resource-click count to the resource-view count associated with the pair as the first click feature.
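The ratio feature can be illustrated with a short sketch. The abstract does not define the only-resource-click count, so the sketch assumes it counts the sessions in which the resource was the only result clicked for the query. The session tuple layout and the name `only_click_features` are hypothetical.

```python
from collections import defaultdict


def only_click_features(sessions):
    """Map (query, resource) -> only-resource-click count / resource-view count.

    Each session is (query, shown_resources, clicked_resources).
    Assumption: an "only-resource-click" is a session in which exactly one
    resource was clicked; the source does not define the term.
    """
    views = defaultdict(int)
    only_clicks = defaultdict(int)
    for query, shown, clicked in sessions:
        for resource in shown:
            views[(query, resource)] += 1           # resource-view count
        if len(clicked) == 1:
            (resource,) = clicked
            only_clicks[(query, resource)] += 1     # only-resource-click count
    # Every pair in `views` was shown at least once, so the divisor is never zero.
    return {pair: only_clicks[pair] / n for pair, n in views.items()}


features = only_click_features([
    ("quake", ["u1", "u2"], ["u1"]),
    ("quake", ["u1", "u2"], ["u1", "u2"]),
    ("quake", ["u1"], []),
])
```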
Abstract:
The present invention provides for the detection of abnormal user behavior in a query session of an electronic search engine. A query session is initiated upon receipt of a user search request that includes one or more search terms. The search engine, in accordance with known search technology, generates a search results page that includes various hyperlinks, for example web content hyperlinks, page navigation hyperlinks, and advertising hyperlinks. Tracking user activities generates the clickstream associated with the search results page. The present invention determines a probability score for the clickstream and then normalizes this score. Comparing the normalized probability score with the normalized probability scores of similar query sessions determines the normalcy of the query session.
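One way to picture the score, normalize, and compare steps is sketched below. The abstract names neither the probability model, the normalization, nor the comparison, so the sketch makes three assumptions: per-click probabilities are supplied by some external model, normalization is the average log-probability per click, and comparison is a z-score against similar sessions. All names and the `max_z` cutoff are hypothetical.

```python
import math
from statistics import mean, pstdev


def normalized_log_score(click_probs):
    """Average log-probability per click (assumed length normalization)."""
    return sum(math.log(p) for p in click_probs) / len(click_probs)


def is_abnormal(session_probs, peer_sessions, max_z=2.0):
    """Flag a session whose normalized score is far from its peers' scores.

    Assumption: "comparison" means a z-score against similar sessions.
    """
    peers = [normalized_log_score(p) for p in peer_sessions]
    mu, sigma = mean(peers), pstdev(peers)
    score = normalized_log_score(session_probs)
    if sigma == 0:                      # peers identical: any deviation is unusual
        return not math.isclose(score, mu)
    return abs(score - mu) / sigma > max_z


# Made-up per-click probabilities for three similar sessions.
peers = [[0.5, 0.5], [0.4, 0.6], [0.5, 0.4, 0.6]]
typical = is_abnormal([0.5, 0.5], peers)
suspicious = is_abnormal([0.001, 0.001], peers)
```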
Abstract:
A server determines a plurality of immediate candidate items for a first web page to recommend to a user. For each particular immediate candidate item of the plurality of immediate candidate items, the server determines a separate sequence of two or more subsequent possible candidate items for subsequent web pages to recommend to the user in the event that the user selects the particular immediate candidate item. Further, the server selects a particular immediate candidate item from the plurality of immediate candidate items for the first web page to recommend to the user. The first web page that recommends the plurality of immediate candidate items is generated and sent over the Internet to the user.
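The lookahead idea can be sketched as follows. The abstract does not state how the server chooses among the immediate candidates, so the sketch assumes each item has an estimated value and that a candidate is scored by its own value plus the discounted values of its follow-up sequence. The names `pick_immediate`, `value`, and `discount` are hypothetical.

```python
def pick_immediate(candidates, follow_ups, value, discount=0.5):
    """Choose the immediate candidate with the best lookahead score.

    follow_ups[item] is the sequence of subsequent candidate items the server
    would recommend if the user selected `item`. Assumption: the score is the
    item's value plus geometrically discounted follow-up values; the source
    does not give the selection rule.
    """
    def score(item):
        total = value[item]
        for step, nxt in enumerate(follow_ups[item], start=1):
            total += (discount ** step) * value[nxt]
        return total

    return max(candidates, key=score)


# Made-up item values: "a" looks best now, "b" leads to better follow-ups.
value = {"a": 1.0, "b": 0.8, "c": 0.0, "d": 0.0, "e": 1.0, "f": 1.0}
follow_ups = {"a": ["c", "d"], "b": ["e", "f"]}
choice = pick_immediate(["a", "b"], follow_ups, value)
```

In this example a purely greedy choice would pick `a`, while the lookahead score favors `b`.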