Abstract:
A method for facilitating development of a document classification function comprises selecting a feature of a document, the feature being less than an entirety of the document; presenting the feature to a human subject; asking the human subject for a feature relevance value of the feature; and generating a classification function using the feature relevance value. The method may also include presenting the document to the human subject at the same time as the feature and asking the human subject for a document relevance value that measures the relevance of the document to a category, in which case generating the classification function also uses the document relevance value.
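As an illustration only, the following minimal Python sketch shows one way human-supplied feature relevance values could drive a classification function. The abstract does not say how the function is generated; the averaging rule, the threshold, and the names `build_classifier` and `human_ratings` are assumptions made for this example.

```python
from typing import Callable, Dict


def build_classifier(feature_relevance: Dict[str, float],
                     threshold: float = 0.5) -> Callable[[str], bool]:
    """Build a classifier from human-rated features (hypothetical scheme).

    feature_relevance maps a single-word feature to the relevance value
    a human subject assigned to it, assumed here to lie in [0, 1].
    """
    def classify(document: str) -> bool:
        tokens = set(document.lower().split())
        # Collect the ratings of every rated feature found in the document.
        matched = [value for feature, value in feature_relevance.items()
                   if feature.lower() in tokens]
        if not matched:
            return False  # no rated feature present: treat as not relevant
        # Assumed decision rule: mean rating must reach the threshold.
        return sum(matched) / len(matched) >= threshold

    return classify


# Hypothetical ratings a human subject might give for a "sports" category.
human_ratings = {"goalkeeper": 0.9, "penalty": 0.7, "invoice": 0.1}
is_sports = build_classifier(human_ratings)
```

Calling `is_sports("The goalkeeper saved a penalty")` averages the two matched ratings and compares the result with the threshold. A document relevance value, where collected, could be used to tune that threshold, though this sketch does not attempt it.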
Abstract:
A method for automatically extracting and organizing information by a processing device from a plurality of data sources is provided. A natural language processing information extraction pipeline that includes an automatic detection of entities is applied to the data sources. Information about detected entities is identified by analyzing products of the natural language processing pipeline. Identified information is grouped into equivalence classes containing equivalent information. At least one displayable representation of the equivalence classes is created. An order in which the at least one displayable representation is displayed is computed. A combined representation of the equivalence classes that respects the order in which the displayable representation is displayed is produced.
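The grouping and ordering steps can be sketched roughly as follows. This is a simplified illustration, not the claimed method: it assumes the extraction pipeline has already produced (entity, attribute, value) triples, treats two triples as equivalent when they match after case and whitespace normalization, and orders classes by how many mentions support them. All names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed shape of one piece of extracted information.
Fact = Tuple[str, str, str]  # (entity, attribute, value)


def normalize(fact: Fact) -> Fact:
    """Lowercase each part and collapse runs of whitespace."""
    entity, attribute, value = (" ".join(part.lower().split()) for part in fact)
    return (entity, attribute, value)


def group_equivalent(facts: List[Fact]) -> Dict[Fact, List[Fact]]:
    """Group facts into equivalence classes keyed by their normalized form."""
    classes: Dict[Fact, List[Fact]] = defaultdict(list)
    for fact in facts:
        classes[normalize(fact)].append(fact)
    return dict(classes)


def combined_representation(facts: List[Fact]) -> str:
    """Render one line per class, most frequently supported class first."""
    classes = group_equivalent(facts)
    # Assumed ordering: larger classes first, ties broken alphabetically.
    ordered = sorted(classes.items(), key=lambda item: (-len(item[1]), item[0]))
    return "\n".join(
        f"{entity}: {attribute} = {value} (x{len(members)})"
        for (entity, attribute, value), members in ordered
    )
```

A real system would need a much richer notion of equivalence (coreference, aliases, unit conversion); exact matching after normalization is used here only to keep the example small.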
Abstract:
Techniques are provided for improving advertisement relevance in sponsored search advertising. The method includes steps for processing a click history data structure containing at least a plurality of query-advertisement pairs, populating a first translation table containing a co-occurrence count field, populating a second translation table containing an expected clicks field, and calculating a click propensity score for an advertisement using the click history data structure, the first translation table (for determining overall click likelihood across all historical traffic), and the second translation table (for removing biases present in the first translation table). Other method steps calculate a second click propensity score for a second advertisement and rank the first advertisement relative to the second advertisement, compare a click propensity score to a threshold to filter low-quality ad candidates from a plurality of ad candidates, and rank advertisements to optimize placement of ads on a sponsored search display page.
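The abstract gives no formulas, so the following Python sketch is only one plausible reading of the two-table idea. It assumes a click history of (query, ad text, position, clicked) rows, counts clicked co-occurrences of query and ad terms in the first table, accumulates position-based expected clicks in the second, and scores an ad by the smoothed ratio of the two. The position prior values, the smoothing constant, and all function names are assumptions.

```python
from collections import defaultdict
from itertools import product

# Assumed prior: chance that a slot is clicked regardless of ad relevance.
POSITION_PRIOR = {1: 0.30, 2: 0.15, 3: 0.05}


def build_tables(click_history):
    """Return (co-occurrence counts, expected clicks) per (query term, ad term)."""
    cooccurrence = defaultdict(float)  # first table: observed clicks
    expected = defaultdict(float)      # second table: clicks expected from position
    for query, ad, position, clicked in click_history:
        for pair in product(query.split(), ad.split()):
            if clicked:
                cooccurrence[pair] += 1.0
            expected[pair] += POSITION_PRIOR.get(position, 0.01)
    return cooccurrence, expected


def click_propensity(query, ad, cooccurrence, expected, smoothing=1.0):
    """Mean smoothed ratio of observed to expected clicks over all term pairs."""
    pairs = list(product(query.split(), ad.split()))
    if not pairs:
        return 0.0
    ratios = [
        (cooccurrence.get(pair, 0.0) + smoothing) / (expected.get(pair, 0.0) + smoothing)
        for pair in pairs
    ]
    return sum(ratios) / len(ratios)


def rank_ads(query, ads, cooccurrence, expected, threshold=0.0):
    """Drop ads scoring below the threshold, then sort best first."""
    scored = [(click_propensity(query, ad, cooccurrence, expected), ad) for ad in ads]
    kept = [item for item in scored if item[0] >= threshold]
    return [ad for score, ad in sorted(kept, key=lambda item: -item[0])]
```

Dividing by expected clicks is what removes the position bias in this sketch: an ad that collected clicks only because it sat in the top slot is not rewarded for it. Unseen term pairs score 1.0 under this smoothing, which is a neutral value by construction.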
Abstract:
An improved system and method for identifying context-dependent term importance of queries is provided. A query term importance model is learned using supervised learning of context-dependent term importance for queries and is then applied for advertisement prediction using term importance weights of query terms as query features. For instance, a query term importance model for query rewriting may predict rewritten queries that match a query with term importance weights assigned as query features. Alternatively, a query term importance model for advertisement prediction may predict relevant advertisements for a query with term importance weights assigned as query features. In an embodiment, a sponsored advertisement selection engine selects sponsored advertisements scored by a query term importance engine that applies a query term importance model using term importance weights as query features and inverse document frequency weights as advertisement features to assign a relevance score.
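To make the scoring in the last embodiment concrete, here is a small Python sketch. It assumes the term importance weights have already been produced by a learned model (the values below are made up), computes a smoothed inverse document frequency over the ad texts, and scores an ad as the sum of weight times IDF over the terms it shares with the query. The abstract does not state this formula; it is an assumption for illustration, as are the function names.

```python
import math
from typing import Dict, List


def inverse_document_frequency(ads: List[str]) -> Dict[str, float]:
    """Smoothed IDF of each term across the ad texts: ln(N / df) + 1."""
    document_frequency: Dict[str, int] = {}
    for ad in ads:
        for term in set(ad.lower().split()):
            document_frequency[term] = document_frequency.get(term, 0) + 1
    total = len(ads)
    return {term: math.log(total / count) + 1.0
            for term, count in document_frequency.items()}


def relevance_score(term_weights: Dict[str, float], ad: str,
                    idf: Dict[str, float]) -> float:
    """Sum of query term weight times ad term IDF over shared terms."""
    ad_terms = set(ad.lower().split())
    return sum(weight * idf.get(term, 0.0)
               for term, weight in term_weights.items() if term in ad_terms)


def select_ads(term_weights: Dict[str, float], ads: List[str]) -> List[str]:
    """Return the ads ordered from most to least relevant."""
    idf = inverse_document_frequency(ads)
    return sorted(ads, key=lambda ad: -relevance_score(term_weights, ad, idf))


# Hypothetical model output for the query "new york hotels": in this
# context "new" carries little importance on its own.
weights = {"new": 0.1, "york": 0.5, "hotels": 0.4}
candidate_ads = ["new car deals", "york university courses", "new york hotels"]
ranked = select_ads(weights, candidate_ads)
```

With these made-up weights, an ad that matches only "new" ranks below one that matches only "york", which is the kind of context-dependent behavior the abstract describes.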