Abstract:
A topic identification system identifies topics of online discussions by iteratively identifying topic words, or keywords, of the online discussions and identifying language patterns associated with those keywords. The topic identification system starts with an initial set of keywords and identifies language patterns that each include a keyword. The topic identification system then uses the identified language patterns to identify additional keywords of the online discussion that match the patterns. The topic identification system then identifies language patterns again, using all of the keywords, including the newly identified ones. The topic identification system may repeat the process of identifying language patterns and keywords until a termination criterion is satisfied.
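The iterative loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a "language pattern" is taken to be the pair of words immediately to the left and right of a keyword, a pattern is kept only if it occurs at least `min_count` times, and the termination criterion is "no new keywords or `max_rounds` reached". None of these choices, nor the function names, come from the abstract.

```python
from collections import Counter


def find_patterns(sentences, keywords, min_count=2):
    """Collect (left word, right word) contexts that surround a known keyword.

    Assumption: a pattern is just the two neighbouring words; the abstract
    does not say what a language pattern looks like.
    """
    counts = Counter()
    for words in sentences:
        for i in range(1, len(words) - 1):
            if words[i] in keywords:
                counts[(words[i - 1], words[i + 1])] += 1
    return {pattern for pattern, c in counts.items() if c >= min_count}


def find_keywords(sentences, patterns):
    """Return every word that appears between the two words of some pattern."""
    found = set()
    for words in sentences:
        for i in range(1, len(words) - 1):
            if (words[i - 1], words[i + 1]) in patterns:
                found.add(words[i])
    return found


def identify_topics(posts, seed_keywords, max_rounds=5):
    """Alternate between pattern discovery and keyword discovery."""
    sentences = [post.lower().split() for post in posts]
    keywords = set(seed_keywords)
    for _ in range(max_rounds):
        patterns = find_patterns(sentences, keywords)
        new_keywords = find_keywords(sentences, patterns) - keywords
        if not new_keywords:  # hypothetical termination criterion
            break
        keywords |= new_keywords
    return keywords
```

With the seed keyword "camera" and posts such as "my camera is great", "my camera is broken", and "my phone is great", the context ("my", "is") becomes a pattern and pulls in "phone" as a further keyword.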
Abstract:
A method and system for assessing keyword usage based on the frequency of usage of the keywords during various periods are provided. A keyword usage measurement system is provided with the frequency of keywords during various periods. The measurement system then calculates a recent usage score for a keyword by combining a frequency impulse score for the keyword with a frequency weight for the keyword. The frequency impulse score for a keyword indicates whether a recent change in the frequency of the keyword has occurred. The frequency weight for a keyword indicates a recent measure of the frequency of the keyword.
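One possible way to combine the two quantities is sketched below. The abstract does not give formulas, so everything here is an assumption made for illustration: the impulse is the relative jump of the latest period over the average of the earlier periods, the weight is a log-damped count of the latest period, and the two are combined by multiplication.

```python
import math


def frequency_impulse(freqs):
    """Relative change of the latest period versus the earlier average.

    Assumption: `freqs` lists per-period counts, oldest first, with at
    least two periods. The +1.0 avoids division by zero for new keywords.
    """
    *history, latest = freqs
    baseline = sum(history) / len(history)
    return (latest - baseline) / (baseline + 1.0)


def frequency_weight(freqs):
    """Log-damped frequency of the most recent period (hypothetical choice)."""
    return math.log1p(freqs[-1])


def recent_usage_score(freqs):
    """Combine impulse and weight; a product is one of many possible choices."""
    return frequency_impulse(freqs) * frequency_weight(freqs)
```

Under these assumptions a keyword with a flat history scores zero, while a keyword that jumps from 10 to 40 uses per period scores higher than one that jumps from 0 to 3, because the weight favors keywords that are actually frequent.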
Abstract:
A system for determining whether to approve a target document (e.g., advertisement) is provided. The system trains a classifier using tuples of words from appropriate documents and tuples of words from inappropriate documents. To approve a target document, the system identifies tuples of words of the target document. The system then applies the classifier to the identified tuples to classify the document as being appropriate or inappropriate. If the document is classified as appropriate, the system automatically approves the document.
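The approval flow can be illustrated with a small sketch. It assumes that a "tuple of words" is a pair of adjacent words and that the classifier is a naive Bayes model with add-one smoothing; the abstract names neither, and the class and function names are hypothetical. Both labels must have at least one training document before `classify` is called.

```python
import math
from collections import Counter


def word_tuples(text, n=2):
    """Split text into tuples of n adjacent words (assumed tuple definition)."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


class TupleClassifier:
    """Naive Bayes over word tuples; a stand-in for the unspecified classifier."""

    def __init__(self):
        self.counts = {"appropriate": Counter(), "inappropriate": Counter()}
        self.docs = Counter()

    def train(self, text, label):
        self.counts[label].update(word_tuples(text))
        self.docs[label] += 1

    def classify(self, text):
        vocab = set(self.counts["appropriate"]) | set(self.counts["inappropriate"])
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.counts.items():
            total = sum(counts.values())
            score = math.log(self.docs[label] / total_docs)  # class prior
            for t in word_tuples(text):
                # Add-one smoothing so unseen tuples do not zero the score.
                score += math.log((counts[t] + 1) / (total + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def approve(classifier, text):
    """Automatically approve only documents classified as appropriate."""
    return classifier.classify(text) == "appropriate"
```

Using tuples rather than single words lets the classifier react to word combinations, which is presumably why the abstract trains on tuples; the bigram choice here is only the simplest instance of that idea.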
Abstract:
A method and system for generating and using a combined model to identify whether a bid term is relevant to an advertisement is provided. A relevance system trains a combined model that includes an initial model and a decision tree model that are trained using features that represent relationships between bid terms and advertisements. The relevance system trains the initial model to map initial model features to a modeled relevance. The relevance system trains the decision tree model to map the decision tree features and the modeled relevance to a final relevance. The trained initial model and decision tree model represent the combined model. The relevance system then uses the combined model to determine the relevance of bid terms to advertisements.
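The two-stage structure can be sketched in plain Python. The model types are assumptions: the initial model is a tiny logistic regression, and the decision tree model is reduced to a single-split tree (a stump) that sees the decision tree features plus the modeled relevance. The abstract specifies neither, and all names below are hypothetical. The stump training assumes at least one feature takes two distinct values.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_initial_model(features, labels, epochs=500, lr=0.5):
    """Fit logistic regression by stochastic gradient descent."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def modeled_relevance(initial, x):
    w, b = initial
    return _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_stump(rows, labels):
    """Depth-1 tree: choose the (feature, threshold) with least squared error."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]


def train_combined(initial_feats, tree_feats, labels):
    """Train the initial model, then the tree on tree features + modeled relevance."""
    initial = train_initial_model(initial_feats, labels)
    rows = [list(tf) + [modeled_relevance(initial, x)]
            for x, tf in zip(initial_feats, tree_feats)]
    return initial, train_stump(rows, labels)


def final_relevance(model, x, tf):
    initial, (j, t, left_value, right_value) = model
    row = list(tf) + [modeled_relevance(initial, x)]
    return left_value if row[j] <= t else right_value
```

The point of the sketch is the data flow: the output of the first model becomes one more input feature of the second, so the tree can either rely on the modeled relevance or override it using its own features.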
Abstract:
Click-through log mining is described. Raw search click-through log data is processed to generate ordered query keywords. The processing uses an algorithm to expand user-submitted keywords to include high-frequency user queries, manages the keywords for a keyword expansion file, analyzes the algorithm's performance against bidding criteria, and identifies related phrases with similar page-click behaviors for advertisements.
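The last step, finding phrases with similar page-click behavior, might look like the sketch below. The log format (a list of `(query, clicked_url)` pairs), the use of cosine similarity, and the threshold are all assumptions for illustration; the abstract does not describe the actual algorithm.

```python
import math
from collections import Counter, defaultdict


def click_vectors(log):
    """Map each query to a Counter of the URLs clicked for it."""
    vectors = defaultdict(Counter)
    for query, url in log:
        vectors[query][url] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(a[u] * b[u] for u in a if u in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def related_phrases(log, query, threshold=0.5):
    """Return (phrase, similarity) pairs whose clicks resemble those of `query`."""
    vectors = click_vectors(log)
    target = vectors.get(query, Counter())
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != query]
    return sorted(((q, s) for q, s in scored if s >= threshold),
                  key=lambda item: -item[1])
```

Two queries that never share a word, such as "cheap flights" and "low cost airfare", still come out as related here if users click the same pages for both.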
Abstract:
Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, and displaying one or more highly semantically relevant keyword sets after filtering by a threshold.
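The segmentation and threshold filtering can be sketched as follows. Several simplifications are assumed: log records are `(timestamp_seconds, query)` pairs, a session ends after a 30-minute gap, each query stands for one keyword set, and the co-occurrence count of two queries within sessions stands in for the frequency measure mentioned in the abstract. None of these details come from the source.

```python
from collections import Counter
from itertools import combinations


def segment_sessions(log, max_gap=1800):
    """Split (timestamp, query) records into sessions at gaps above max_gap."""
    sessions, current, last = [], [], None
    for ts, query in sorted(log):
        if last is not None and ts - last > max_gap:
            sessions.append(current)
            current = []
        current.append(query)
        last = ts
    if current:
        sessions.append(current)
    return sessions


def relevant_pairs(sessions, threshold=2):
    """Count how often two queries share a session; keep pairs at or above threshold."""
    counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return {pair: c for pair, c in counts.items() if c >= threshold}
```

A pair that appears together in only one session is dropped by the threshold, which is the filtering step the abstract describes before the results are displayed.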
Abstract:
A summary system for evaluating summaries of documents and for generating summaries of documents based on normalized probabilities of portions of the document is provided. A summarization system generates a summary by selecting sentences for the summary based on their normalized probabilities as derived from a document model. An evaluation system evaluates the effectiveness of a summary based on a normalized probability for the summary that is derived from a document model.
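Both halves of this abstract can be illustrated with one small sketch. It assumes the document model is a unigram word distribution and that "normalized" means the geometric mean of the word probabilities, so that long sentences are not penalized for their length. The abstract specifies neither, and the function names and the `floor` probability for unseen words are hypothetical.

```python
import math
from collections import Counter


def document_model(sentences):
    """Unigram model: relative frequency of each word in the document."""
    words = [w for s in sentences for w in s.lower().split()]
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


def normalized_probability(text, model, floor=1e-6):
    """Geometric mean of word probabilities (length-normalized likelihood)."""
    words = text.lower().split()
    if not words:
        return 0.0
    log_p = sum(math.log(model.get(w, floor)) for w in words)
    return math.exp(log_p / len(words))


def summarize(sentences, k=1):
    """Pick the k sentences with the highest normalized probability."""
    model = document_model(sentences)
    ranked = sorted(sentences,
                    key=lambda s: normalized_probability(s, model),
                    reverse=True)
    chosen = set(ranked[:k])
    return [s for s in sentences if s in chosen]  # keep document order


def evaluate(summary_sentences, sentences):
    """Score a summary by its normalized probability under the document model."""
    return normalized_probability(" ".join(summary_sentences),
                                  document_model(sentences))
```

The same scoring function serves both purposes: `summarize` uses it to rank sentences, and `evaluate` uses it to compare candidate summaries of the same document.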