Abstract:
Techniques are provided for representing queries and determining the similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.
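The coefficient comparison described above can be illustrated with a minimal sketch. It assumes the ARIMA coefficients of each query have already been fitted and are available as equal-length lists; the function name `pearson` and the coefficient values are hypothetical, and Pearson correlation is only one possible choice of "correlation metric".

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length coefficient vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant vector; treat as dissimilar.
        return 0.0
    return cov / sqrt(var_a * var_b)


# Hypothetical ARIMA coefficient vectors (illustrative values, not from the source).
coeffs_q1 = [0.62, -0.21, 0.08, 0.35]
coeffs_q2 = [0.60, -0.18, 0.10, 0.33]
coeffs_q3 = [-0.40, 0.55, -0.30, 0.05]

sim_12 = pearson(coeffs_q1, coeffs_q2)  # close to 1: similar temporal behavior
sim_13 = pearson(coeffs_q1, coeffs_q3)  # negative: dissimilar temporal behavior
```

Under this sketch, two queries would be treated as similar when the correlation of their coefficient vectors exceeds some chosen threshold; the threshold itself is not specified in the abstract.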
Abstract:
Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. The query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecast.
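A rough sketch of event detection and of measuring how often one query's events precede another's is given below. The abstract does not say how a "significant increase" is defined, so the trailing-window threshold (mean plus `k` standard deviations), the lag window, and all names and data here are assumptions made for illustration only.

```python
from statistics import mean, stdev


def find_events(freqs, window=4, k=2.0):
    """Return indices whose frequency exceeds the trailing-window mean
    by more than k standard deviations (assumed definition of an event)."""
    events = []
    for i in range(window, len(freqs)):
        past = freqs[i - window:i]
        threshold = mean(past) + k * stdev(past)
        if freqs[i] > threshold:
            events.append(i)
    return events


def precedence_rate(events_a, events_b, max_lag=2):
    """Fraction of events in B that follow an event in A by 1..max_lag intervals."""
    if not events_b:
        return 0.0
    a = set(events_a)
    hits = sum(
        1 for t in events_b
        if any((t - lag) in a for lag in range(1, max_lag + 1))
    )
    return hits / len(events_b)


# Hypothetical per-interval frequencies for two queries.
freq_a = [10, 11, 9, 10, 40, 12, 10, 11, 10, 45, 11, 10]
freq_b = [5, 6, 5, 5, 6, 30, 7, 5, 6, 5, 28, 6]

events_a = find_events(freq_a)
events_b = find_events(freq_b)
a_precedes_b = precedence_rate(events_a, events_b)
b_precedes_a = precedence_rate(events_b, events_a)
```

In this toy data every spike of the second query follows a spike of the first by one interval, so the first query would be a candidate predictor for the second. Temporal precedence alone does not establish causation; it only shortlists candidates.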
Abstract:
Techniques are described for generating structured information from semi-structured web pages, and retrieving the structured knowledge in response to a user query that indicates a query intent. The structured information is automatically extracted offline from semi-structured web pages, through the use of an auto wrapper solution that is noise tolerant, scalable, and automatic. The structured information is stored in a knowledge base, and provided in response to a user search query that indicates a query intent. Extraction of structured information may also include clustering of pages based on their measured similarities. The clusters may be determined based on similar elements in the tag path text data of the pages. A minimum size threshold may be applied to the clusters.
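The clustering step can be sketched as follows. The abstract does not name a similarity measure or a clustering algorithm, so Jaccard similarity over sets of tag paths and a greedy single-pass grouping are assumptions here, as are the function names, thresholds, and sample pages.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of tag paths."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, min_similarity=0.5, min_cluster_size=2):
    """Greedy single-pass clustering: each page joins the first cluster whose
    representative (its first page) is similar enough, else starts a new one.
    Clusters smaller than min_cluster_size are dropped."""
    clusters = []  # each cluster is a list of page ids
    for page_id, paths in pages.items():
        for cluster in clusters:
            representative = pages[cluster[0]]
            if jaccard(paths, representative) >= min_similarity:
                cluster.append(page_id)
                break
        else:
            clusters.append([page_id])
    return [c for c in clusters if len(c) >= min_cluster_size]


# Hypothetical pages, each described by the set of tag paths it contains.
pages = {
    "product-1": {"html/body/div/h1", "html/body/div/span.price", "html/body/div/ul/li"},
    "product-2": {"html/body/div/h1", "html/body/div/span.price", "html/body/div/p"},
    "about": {"html/body/article/h2", "html/body/article/p"},
}

clusters = cluster_pages(pages)
```

Here the two product pages share two of four distinct tag paths and are grouped together, while the singleton cluster for the "about" page falls below the minimum size threshold and is discarded.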
Abstract:
A smart user-centric information aggregation system allows a user to define a region of content displayed in a display of a device and performs information aggregation on behalf of the user. The smart user-centric information aggregation system searches, aggregates and groups information related to content included in the region of content for the user while the user can continue to perform his/her original course of actions without interruption. After finding information related to the desired content, the smart user-centric information aggregation system may notify the user and present the found information to the user upon receiving confirmation from the user. The smart user-centric information aggregation system may continue to find new related information and update the presentation with the newly found information periodically, in some instances without user intervention or input.
Abstract:
Embodiments facilitate greater flexibility in definition of user segments for targeted advertising, by employing indexed semantic user profiles. Semantic user profiles are built through extraction of online user behavior data such as user search queries and page views, and include user interest information that is inferred based on user behavior. Semantic user profiles are then indexed to facilitate search for a set of users that fit specified semantic search terms. Search results for semantic profiles are ranked according to a ranking model developed through machine learning. In some embodiments, building and indexing of semantic profiles and learning of the ranking model is performed offline to facilitate more efficient online processing of queries.
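One way to picture the indexing step is an inverted index from interest terms to users. The sketch below is illustrative only: the profile layout, the function names, and the interest weights are hypothetical, and the summed-weight score is a simple stand-in for the machine-learned ranking model, which the abstract does not describe.

```python
from collections import defaultdict


def build_index(profiles):
    """Map each interest term to the users whose profile contains it."""
    index = defaultdict(dict)
    for user_id, interests in profiles.items():
        for term, weight in interests.items():
            index[term][user_id] = weight
    return index


def search_users(index, terms):
    """Return users matching all terms, ranked by summed interest weight
    (a placeholder for a learned ranking model)."""
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    matched = set(postings[0])
    for posting in postings[1:]:
        matched &= set(posting)
    scored = [(sum(p[user] for p in postings), user) for user in matched]
    return [user for _, user in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical semantic profiles: inferred interest term -> strength.
profiles = {
    "u1": {"hiking": 0.9, "cameras": 0.4},
    "u2": {"hiking": 0.3, "cameras": 0.8, "travel": 0.5},
    "u3": {"cooking": 0.7},
}

index = build_index(profiles)
segment = search_users(index, ["hiking", "cameras"])
```

Building `index` corresponds to the offline stage, and `search_users` to the online query for a user segment.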
Abstract:
Various technologies pertaining to provision of graphical data to a client computing device responsive to receipt of a positional selection on a web page by a user of a client computing device are described herein. A computer-executable application executing on the client computing device detects that the user has selected a certain position on the web page, wherein this application is not called by code of the web page. The position is transmitted to an ad server, which conducts an auction for display space on the client computing device based at least in part upon the detection of the certain position.
Abstract:
A classification process may reduce the computational resources and time required to collect and classify training data utilized to enable a user to effectively access online information. According to some implementations, training data is established by defining one or more seed queries and query patterns. A bi-partite graph may be constructed using the seed query and query pattern information. A traversal of the bi-partite graph can be performed to expand the training data to encompass sufficient data to perform classification of the present search task.
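The graph expansion can be sketched as a breadth-first traversal that alternates between queries and the patterns they match. The edge list, the hop limit, and the function names below are hypothetical; the abstract does not specify the traversal order or a stopping rule.

```python
from collections import defaultdict, deque


def build_bipartite(edges):
    """Build both adjacency maps of a query/pattern bipartite graph."""
    query_to_patterns = defaultdict(set)
    pattern_to_queries = defaultdict(set)
    for query, pattern in edges:
        query_to_patterns[query].add(pattern)
        pattern_to_queries[pattern].add(query)
    return query_to_patterns, pattern_to_queries


def expand_seeds(seeds, edges, max_hops=2):
    """Breadth-first expansion from seed queries.
    One hop is a query -> pattern -> query step."""
    q2p, p2q = build_bipartite(edges)
    reached = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        query, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for pattern in q2p.get(query, ()):
            for neighbor in p2q[pattern]:
                if neighbor not in reached:
                    reached.add(neighbor)
                    frontier.append((neighbor, hops + 1))
    return reached


# Hypothetical (query, pattern) edges.
edges = [
    ("cheap flights to paris", "cheap flights to <city>"),
    ("cheap flights to rome", "cheap flights to <city>"),
    ("cheap flights to rome", "<service> to rome"),
    ("train tickets to rome", "<service> to rome"),
    ("weather in oslo", "weather in <city>"),
]

one_hop = expand_seeds({"cheap flights to paris"}, edges, max_hops=1)
two_hops = expand_seeds({"cheap flights to paris"}, edges, max_hops=2)
```

Limiting the number of hops is one way to keep the expanded training set close to the seed queries; queries in a disconnected part of the graph, such as the weather query above, are never reached.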
Abstract:
The related links recommendation technique described herein employs combined collaborative filtering to recommend related web pages to users. The technique creates multiple collaborative filters and combines them into a single combined collaborative filter that recommends web pages similar to a given web page to a user. A query-based collaborative filter is created by using query search clicks (e.g., user input device selection actions on search results returned in response to a search query). A user-behavior-based collaborative filter is created by using query search clicks and user clicks while browsing websites (e.g., user input device selection actions while a user is browsing websites). Lastly, a content-based collaborative filter is created by finding web pages with similar content.
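The combination of the three filters can be sketched as a weighted sum of per-filter scores. The abstract does not state how the filters are combined, so the weighted sum, the weights, the function names, and the candidate scores below are all assumptions for illustration.

```python
def combine_filters(filters, weights):
    """Weighted sum of per-filter similarity scores for each candidate page."""
    combined = {}
    for name, scores in filters.items():
        weight = weights.get(name, 0.0)
        for page, score in scores.items():
            combined[page] = combined.get(page, 0.0) + weight * score
    return combined


def recommend(filters, weights, top_n=3):
    """Return the top_n candidate pages by combined score (ties broken by name)."""
    combined = combine_filters(filters, weights)
    return sorted(combined, key=lambda page: (-combined[page], page))[:top_n]


# Hypothetical similarity scores of candidate pages to one given page,
# as produced by each of the three filters.
filters = {
    "query": {"a.example/x": 0.8, "b.example/y": 0.2},
    "behavior": {"b.example/y": 0.9, "c.example/z": 0.4},
    "content": {"a.example/x": 0.5, "c.example/z": 0.6},
}
weights = {"query": 0.5, "behavior": 0.3, "content": 0.2}

related = recommend(filters, weights)
```

A page missing from one filter simply contributes nothing for that filter, so candidates surfaced by only one signal can still be recommended.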