Abstract:
Many search engines attempt to understand and predict a user's search intent after the submission of search queries. Predicting search intent allows search engines to tailor search results to particular information needs of the user. Unfortunately, current techniques passively predict search intent after a query is submitted. Accordingly, one or more systems and/or techniques for actively predicting search intent from user browsing behavior data are disclosed herein. For example, search patterns of a user browsing a web page and shortly thereafter performing a query may be extracted from user browsing behavior. Queries within the search patterns may be ranked based upon a search trigger likelihood that content of the web page motivated the user to perform the query. In this way, query suggestions having a high search trigger likelihood and a diverse range of topics may be generated and/or presented to users of the web page.
Abstract:
An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
Abstract:
The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of a EMC and views it as transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of transition probability and mean staying time, which is actually the stationary distribution of MSMP. This is regarded as page importance.
Abstract:
An anti-spam technique for protecting search engine ranking is based on mining search engine optimization (SEO) forums. The anti-spam technique collects webpages such as SEO forum posts from a list of suspect spam websites, and extracts suspicious link exchange URLs and corresponding link formation from the collected webpages. A search engine ranking penalty is then applied to the suspicious link exchange URLs. The penalty is at least partially determined by the link information associated with the respective suspicious link exchange URL. To detect more suspicious link exchange URLs, the technique may propagate one or more levels from a seed set of suspicious link exchange URLs generated by mining SEO forums.
Abstract:
A clustering system generates an original Laplacian matrix representing objects and their relationships. The clustering system initially applies an eigenvalue decomposition solver to the original Laplacian matrix for a number of iterations. The clustering system then identifies the elements of the resultant eigenvector that are stable. The clustering system then aggregates the elements of the original Laplacian matrix corresponding to the identified stable elements and forms a new Laplacian matrix that is a compressed form of the original Laplacian matrix. The clustering system repeats the applying of the eigenvalue decomposition solver and the generating of new compressed Laplacian matrices until the new Laplacian matrix is small enough so that a final solution can be generated in a reasonable amount of time.
Abstract:
A method and system for determining whether a web site is a spam web site based on analysis of changes in link information over time is provided. A spam detection system collects link information for a web site at various times. The spam detection system extracts one or more features from the link information that relate to changes in the link information over time. The spam detection system then generates an indication of whether the web site is a spam web site using a classifier that has been trained to detect whether the extracted feature indicates that the web site is likely to be spam.
Abstract:
Some implementations provide techniques for selecting web pages for inclusion in an index. For example, some implementations apply regularization to select a subset of the crawled web pages for indexing based on link relationships between the crawled web pages, features extracted from the crawled web pages, and user behavior information determined for at least some of the crawled web pages. Further, in some implementations, the user behavior information may be used to sort a training set of crawled web pages into a plurality of labeled groups. The labeled groups may be represented in a directed graph that indicates relative priorities for being selected for indexing.
Abstract:
An advertisement perception predictor may forecast the effectiveness of an online advertisement in a web page by predicting whether the online advertisement may be perceived by a consumer. The advertisement perception predictor may use a perception model that is trained for determining perception probability values of online advertisements. The perception model may be applied to an online advertisement to determine a perception probability value for the online advertisement. The perception probability value may indicate the likelihood that a consumer is likely to view the online advertisement.
Abstract:
A method and system for accessing a cellular mobile communication network, the method includes: after a terminal and a base station complete a ranging process, the terminal carrying out a basic capability negotiation with the base station, the base station and the terminal carrying out a WAPI access authentication process; and the terminal carrying out a subsequent access flow to access the cellular mobile communication network; wherein the WAPI access authentication process includes: the terminal sending an access authentication request packet, including a certificate and a signature of the terminal, to the base station; the base station authenticating the signature of the terminal, including the certificate into a certificate authentication request packet to send to an authentication server to perform validation; the base station sending an access authentication response packet to the terminal, and carrying out a unicast session key negotiation with the terminal to obtain a unicast session key.
Abstract:
The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of a EMC and views it as transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of transition probability and mean staying time, which is actually the stationary distribution of MSMP. This is regarded as page importance.