Abstract:
Information about users, items, and the interactions between them is collected. Each user is associated with a set of user features, and each item with a set of item features. An expected score function is defined for each user-item pair and represents the score a user is expected to assign to an item. An objective function represents the difference between the expected score and the actual score a user assigns to an item. The expected score function and the objective function share at least one common variable. The objective function is minimized to find the best fit for one or more of the common variables. Subsequently, the expected score function is used to calculate expected scores, for individual users or clusters of users, for a set of items that have not received actual scores from those users. The items in the set are ranked by their expected scores.
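The fit-then-rank flow described above can be illustrated with a minimal sketch. It assumes a bilinear expected score function, a squared-error objective, and plain stochastic gradient descent; none of these choices is stated in the abstract, and the names `expected_score`, `objective`, `fit`, and `rank_unscored` are hypothetical. The weight matrix `W` plays the role of the common variable shared by the two functions.

```python
# Assumed form (not from the abstract): score(u, i) = sum_k sum_l x_u[k] * W[k][l] * z_i[l]

def expected_score(user_feats, item_feats, W):
    """Expected score a user assigns an item under the assumed bilinear form."""
    return sum(user_feats[k] * W[k][l] * item_feats[l]
               for k in range(len(user_feats))
               for l in range(len(item_feats)))

def objective(ratings, users, items, W):
    """Sum of squared differences between expected and actual scores."""
    return sum((expected_score(users[u], items[i], W) - r) ** 2
               for u, i, r in ratings)

def fit(ratings, users, items, n_user_feats, n_item_feats, lr=0.05, epochs=500):
    """Minimize the objective over W with stochastic gradient descent."""
    W = [[0.0] * n_item_feats for _ in range(n_user_feats)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = expected_score(users[u], items[i], W) - r
            for k in range(n_user_feats):
                for l in range(n_item_feats):
                    # Gradient of err**2 with respect to W[k][l].
                    W[k][l] -= lr * 2 * err * users[u][k] * items[i][l]
    return W

def rank_unscored(user, candidates, users, items, W):
    """Rank items the user has not scored, highest expected score first."""
    return sorted(candidates,
                  key=lambda i: expected_score(users[user], items[i], W),
                  reverse=True)

# Toy data: one-hot features, four observed scores, two unscored items.
users = {"u1": [1, 0], "u2": [0, 1]}
items = {"a": [1, 0], "b": [0, 1], "c": [1, 0], "d": [0, 1]}
ratings = [("u1", "a", 5), ("u1", "b", 1), ("u2", "a", 1), ("u2", "b", 5)]

W = fit(ratings, users, items, n_user_feats=2, n_item_feats=2)
ranking_u1 = rank_unscored("u1", ["d", "c"], users, items, W)
ranking_u2 = rank_unscored("u2", ["c", "d"], users, items, W)
```

Items `c` and `d` never appear in `ratings`, so their ranking depends entirely on the fitted `W` and their item features, which is the situation the abstract describes.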
Abstract:
Content display policies are evaluated using two kinds of methods. In the first kind, truth models are generated from information about user characteristics and content characteristics that is collected in a “controlled” manner. A simulator replays users' visits to the portal web page and, based on the truth models, simulates their interactions with content items on the page. Various metrics are used to compare different content-item selection algorithms. In the second kind, no explicit truth models are built. Events from the controlled serving scheme are replayed in part or in whole, and the content-item selection algorithms learn from the observed user activities. Metrics that measure overall predictive error are used to compare the algorithms. The data collected in a controlled fashion plays a key role in both kinds of methods.
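The second kind of method, replay without a truth model, can be sketched roughly as follows. The event format `(item, clicked)`, the two toy models, and the mean squared error metric are all assumptions made for illustration; the abstract does not specify them. Each model predicts before it sees the outcome of an event and learns from it afterwards.

```python
from collections import defaultdict

class PerItemCTR:
    """Hypothetical learner: predicts each item's observed click-through rate."""
    def __init__(self, prior=0.5):
        self.prior = prior
        self.clicks = defaultdict(int)
        self.views = defaultdict(int)

    def predict(self, item):
        views = self.views[item]
        return self.clicks[item] / views if views else self.prior

    def update(self, item, clicked):
        self.views[item] += 1
        self.clicks[item] += int(clicked)

class ConstantCTR:
    """Hypothetical baseline: always predicts the same click probability."""
    def __init__(self, p=0.5):
        self.p = p

    def predict(self, item):
        return self.p

    def update(self, item, clicked):
        pass  # the baseline never learns

def replay_error(model, events):
    """Replay logged events in order; return the mean squared predictive error."""
    total = 0.0
    for item, clicked in events:
        p = model.predict(item)        # predict before the outcome is revealed
        total += (p - clicked) ** 2
        model.update(item, clicked)    # then learn from the observed activity
    return total / len(events)

# Toy log standing in for events from a controlled serving scheme.
events = [("a", 1), ("b", 0)] * 10

per_item_error = replay_error(PerItemCTR(), events)
constant_error = replay_error(ConstantCTR(), events)
```

On this toy log the per-item learner reaches a lower error than the constant baseline, which is the kind of comparison the predictive-error metrics are meant to support.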
Abstract:
A unified database/text retrieval system uses pseudo keywords to convert exact, database-type queries into text-inclusion-type queries suitable for text retrieval systems. Boolean combinations of the text-inclusion-type query elements may be readily manipulated for optimization and applied to a unified index for rapid search results. Absolute relevance values and relevance multiplier values may be added to the query elements to provide relevance-based sorting not only of text results but also of exact-match search results. Relevance values may be deduced automatically from a variety of sources.
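One way to picture the pseudo-keyword idea is a single inverted index that holds both ordinary words and synthetic tokens for exact field values. The sketch below assumes a token format of `__field=value`, a record layout with `text` and `fields` keys, and AND-only queries; these are illustrative choices, not details from the abstract, and relevance values are left out.

```python
from collections import defaultdict

def pseudo_keyword(field, value):
    """Encode an exact field match as a token that can live in a text index."""
    return f"__{field}={value}"

def build_index(records):
    """Build one inverted index over free-text words and pseudo keywords."""
    index = defaultdict(set)
    for doc_id, rec in records.items():
        for word in rec["text"].lower().split():
            index[word].add(doc_id)
        for field, value in rec["fields"].items():
            index[pseudo_keyword(field, value)].add(doc_id)
    return index

def search(index, text_terms, exact):
    """AND together text terms and exact-match conditions via set intersection."""
    terms = [t.lower() for t in text_terms]
    terms += [pseudo_keyword(f, v) for f, v in exact.items()]
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

# Hypothetical catalog records.
records = {
    1: {"text": "red running shoes", "fields": {"color": "red", "size": 42}},
    2: {"text": "blue running shoes", "fields": {"color": "blue", "size": 42}},
    3: {"text": "red rain jacket", "fields": {"color": "red", "size": "M"}},
}
index = build_index(records)

running_size_42 = search(index, ["running"], {"size": 42})
red_shoes = search(index, ["shoes"], {"color": "red"})
all_red = search(index, [], {"color": "red"})
```

Because an exact condition such as `size = 42` becomes just another term, the same intersection logic serves both database-style and text-style query elements.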
Abstract:
A method is provided for creating and updating a binary decision tree from a training database that cannot fit in high-speed solid-state memory. A subset of the training database that does fit in high-speed memory is used to create a statistically good estimate of the desired binary decision tree. This estimate is then used to review the entire training database in as little as one sequential scan, collecting the statistics necessary to verify the accuracy of the binary decision tree and to refine it until it is identical to the tree that a full analysis of the training database would produce.
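The sample-then-verify idea can be sketched for the simplest possible tree, a single split on one numeric attribute. This is a heavily simplified illustration under stated assumptions: binary labels, Gini impurity as the split criterion, and candidate thresholds taken from the sample. It does not reproduce the refinement procedure of the method itself, and all function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a list of class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)

def split_impurity(left, right):
    """Size-weighted Gini impurity of a two-way split."""
    n = sum(left) + sum(right)
    return (sum(left) * gini(left) + sum(right) * gini(right)) / n

def candidate_thresholds(sample):
    """Midpoints between consecutive distinct attribute values in the sample."""
    xs = sorted({x for x, _ in sample})
    return [(a + b) / 2 for a, b in zip(xs, xs[1:])]

def scan_statistics(rows, thresholds):
    """One sequential pass: class counts on each side of every candidate."""
    stats = {t: ([0, 0], [0, 0]) for t in thresholds}
    for x, label in rows:
        for t, (left, right) in stats.items():
            (left if x <= t else right)[label] += 1
    return stats

def best_threshold(stats):
    """Candidate threshold with the lowest weighted impurity."""
    return min(stats, key=lambda t: split_impurity(*stats[t]))

def fit_stump(sample, full_rows):
    """Estimate the split from the in-memory sample, then check it in one scan."""
    thresholds = candidate_thresholds(sample)
    estimate = best_threshold(scan_statistics(sample, thresholds))
    verified = best_threshold(scan_statistics(full_rows, thresholds))
    return estimate, verified

# Toy database: label flips at x = 50; the sample keeps every tenth row.
full = [(x, 0 if x < 50 else 1) for x in range(100)]
sample = full[::10]

# iter(full) stands in for a database that can only be read sequentially.
estimate, verified = fit_stump(sample, iter(full))
```

When `estimate` and `verified` agree, the full-data statistics confirm the sample's choice among the candidates; a disagreement would signal that the estimated split needs refinement.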