Abstract:
Systems and methods are described for managing annotations in pen-based computing systems. The systems and methods described herein provide ways to collect, manage, search and share personal information entered by way of handwritten annotations. Annotations are used to drive applications, serve as gestures, find related information and further manage information. Context information is obtained when a user enters an annotation, and is used to help determine and locate relevant content in which the user may be interested, whether in the same document or in a different document located on a local computer, on the Internet or on another network.
Abstract:
Described herein is technology for, among other things, mining similar user clusters based on user advertisement click behaviors. The technology involves methods and systems for mining similar user clusters based on log data available on an online advertising platform. By building a user linkage representation based on one or more attributes from the log data, the similar user clusters can be harvested in a more efficient manner.
Abstract:
Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, and displaying one or more highly semantically relevant keyword sets after filtering by a threshold.
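The segmentation step can be sketched as follows. The abstract does not say how sessions are delimited, so this minimal example assumes a session ends after 30 minutes of inactivity; the function name `segment_sessions`, the `gap` parameter and the sample log are all hypothetical.

```python
from datetime import datetime, timedelta


def segment_sessions(log, gap=timedelta(minutes=30)):
    """Split (timestamp, query) records into sessions.

    Assumption (not from the abstract): a new session starts whenever
    the time since the previous query exceeds `gap`.
    """
    sessions = []
    for ts, query in sorted(log):
        # Compare against the timestamp of the last record in the
        # most recent session, if there is one.
        if sessions and ts - sessions[-1][-1][0] <= gap:
            sessions[-1].append((ts, query))
        else:
            sessions.append([(ts, query)])
    return sessions


# Hypothetical log for one user.
example_log = [
    (datetime(2024, 1, 1, 9, 0), "cheap flights"),
    (datetime(2024, 1, 1, 9, 10), "flights to rome"),
    (datetime(2024, 1, 1, 11, 0), "pasta recipe"),
]
example_sessions = segment_sessions(example_log)
```

With the sample log, the first two queries fall in one session and the third starts a new one. Keyword sets would then be extracted per session before any relevance scoring.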
Abstract:
Representing queries and determining similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model is provided. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.
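As an illustration of the correlation metric mentioned above, the sketch below compares two queries by the Pearson correlation of their coefficient vectors. It assumes the ARIMA coefficients have already been estimated elsewhere and that both vectors have the same length; the function name `coefficient_similarity` and the sample coefficient values are hypothetical.

```python
import math


def coefficient_similarity(a, b):
    """Pearson correlation between two equal-length coefficient vectors.

    Assumption: `a` and `b` hold already-fitted ARIMA coefficients for
    two queries; fitting the model is outside this sketch.
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical coefficient vectors for two queries.
query_a = [0.6, -0.2, 0.1]
query_b = [0.5, -0.3, 0.2]
similarity = coefficient_similarity(query_a, query_b)
```

A value near 1 would suggest that the two queries follow similar temporal dynamics. Correlation is only one of the metrics the abstract allows for.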
Abstract:
Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
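The notion of a query event can be illustrated with a small sketch. The abstract does not define what counts as a "significant increase", so the rule below is an assumption: an interval is flagged when its frequency is at least `ratio` times the mean of the preceding `window` intervals. The function name `find_events` and both parameters are hypothetical.

```python
def find_events(freqs, window=3, ratio=2.0):
    """Return indices of intervals where the frequency jumps sharply.

    Assumption (not from the abstract): a jump is "significant" when
    the frequency is at least `ratio` times the mean of the previous
    `window` intervals.
    """
    events = []
    for i in range(window, len(freqs)):
        baseline = sum(freqs[i - window:i]) / window
        # Skip intervals with an empty baseline to avoid flagging
        # every nonzero value after a run of zeros.
        if baseline > 0 and freqs[i] >= ratio * baseline:
            events.append(i)
    return events


# Hypothetical daily frequencies for one query.
daily_counts = [10, 11, 9, 10, 40, 12, 11]
event_days = find_events(daily_counts)
```

Event indices found this way for several queries could then be compared to see which queries' events tend to precede those of the query being forecast.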
Abstract:
A method for merging Really Simple Syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter and each cluster frequency.
Abstract:
Embodiments of the claimed subject matter provide a method and system for predicting bidding keyword monetization. The claimed subject matter provides a method and system with which the value of a keyword for the purpose of relevant online advertisement may be evaluated according to various metrics to determine a bidding landscape for use in advertising campaigns. The value of the keyword considers certain attributes related to the monetization of the keyword. One embodiment of the claimed subject matter is implemented as a method for predicting keyword monetization for one or more keyword-advertisement relationships. Historical data for the one or more keyword-advertisement relationships is referenced and used to generate a global model of the one or more keyword-advertisement relationships. The relationships are then evaluated according to a time-series analysis, which parses the data from the historical data and the global model to create predictions for the keyword monetization according to the keyword-advertisement relationships.
Abstract:
Computer-readable media having computer-executable instructions and apparatuses categorize a document or a corpus of documents. A Tensor Space Model (TSM), which models the text by a higher-order tensor, represents a document or a corpus of documents. Supported by techniques of multilinear algebra, TSM provides a framework for analyzing the multifactor structures. TSM is further supported by operations and presented tools, such as the High-Order Singular Value Decomposition (HOSVD) for a reduction of the dimensions of the higher-order tensor. The dimensionally reduced tensor is compared with tensors that represent possible categories. Consequently, a category is selected for the document or corpus of documents. Experimental results on the 20 Newsgroups dataset suggest that TSM has advantages over a Vector Space Model (VSM) for text classification.
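For reference, the standard textbook form of the HOSVD is given below. The abstract does not state it, so the symbols are assumptions of this note: \mathcal{A} is an N-th order tensor, \mathcal{S} is the core tensor, and each U^{(n)} is the orthogonal matrix of left singular vectors of the mode-n unfolding of \mathcal{A}.

```latex
% Full decomposition of an N-th order tensor
\mathcal{A} = \mathcal{S} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)}

% Dimension reduction: keep only the leading r_n columns of each U^{(n)}
\hat{\mathcal{S}} = \mathcal{A} \times_1 U_{r_1}^{(1)\top} \times_2 U_{r_2}^{(2)\top} \cdots \times_N U_{r_N}^{(N)\top}
```

Here \times_n denotes the mode-n product of a tensor with a matrix, and the truncation ranks r_n are chosen by the practitioner. The reduced tensor \hat{\mathcal{S}} plays the role of the dimensionally reduced representation that is compared against category tensors.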
Abstract:
Extraction of semantic information and the generation of semantic attributes allow for improved organization and management of data. Semantic attributes are generated automatically, which eliminates the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are derived from the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing of file systems as well as improve the accuracy and speed of queries.