Abstract:
A method of ranking search results includes producing a relevance score for a document in view of a query. A similarity score is calculated for the query utilizing a feature vector that characterizes attributes and query words associated with the document. A rank value is assigned to the document based upon the relevance score and the similarity score.
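The combination of the two scores can be sketched as follows. The abstract does not say how the similarity score or the final rank value is computed, so this minimal example assumes cosine similarity between the query words and the document's feature vector, and a weighted sum controlled by a hypothetical `alpha` parameter. The names `cosine`, `rank_value`, and `rank_documents` are illustrative only.

```python
import math

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_value(relevance: float, similarity: float, alpha: float = 0.5) -> float:
    """Weighted sum of the two scores; the weighting scheme is an assumption."""
    return alpha * relevance + (1.0 - alpha) * similarity

def rank_documents(query_terms: list[str], docs: list[dict], alpha: float = 0.5) -> list[str]:
    """Return document ids ordered by descending rank value (ties broken by id)."""
    query_vec = {term: 1.0 for term in query_terms}
    scored = [
        (rank_value(d["relevance"], cosine(query_vec, d["features"]), alpha), d["id"])
        for d in docs
    ]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

In this sketch a document with a slightly lower relevance score can still outrank another if its feature vector matches the query words more closely.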
Abstract:
This invention includes the step of transmitting a query to a set of search engines. Any result lists returned from these search engines are received, and a subset of entries in each result list is selected. Each entry in this subset is assigned a scoring value according to a scoring function, and each result list is then assigned a representative value according to the scoring values assigned to its entries. A merged list of entries is produced based upon the representative value assigned to each result list.
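A minimal sketch of this merging flow is shown below. The abstract leaves the scoring function, the subset selection, and the merge policy open, so the example assumes: the subset is the top `k` entries, the representative value is the mean of the subset's scores, and lists are concatenated in descending order of representative value with duplicates dropped. `keyword_overlap` and `merge_result_lists` are hypothetical names.

```python
from typing import Callable

def keyword_overlap(query: str) -> Callable[[str], float]:
    """Build an assumed scoring function: number of query words found in an entry."""
    words = set(query.lower().split())
    return lambda entry: float(len(words & set(entry.lower().split())))

def merge_result_lists(
    result_lists: dict[str, list[str]],
    score: Callable[[str], float],
    k: int = 3,
) -> list[str]:
    """Merge per-engine result lists using one representative value per list."""
    ranked = []
    for engine, entries in result_lists.items():
        if not entries:
            continue  # engine returned nothing
        sample = entries[:k]  # assumed subset: the top-k entries
        representative = sum(score(e) for e in sample) / len(sample)
        ranked.append((representative, engine))
    ranked.sort(key=lambda p: (-p[0], p[1]))

    merged, seen = [], set()
    for _, engine in ranked:
        for entry in result_lists[engine]:
            if entry not in seen:  # drop duplicates across engines
                seen.add(entry)
                merged.append(entry)
    return merged
```

Scoring only a subset of each list keeps the cost of merging low, since the representative value, not a per-entry score, decides how the lists are ordered.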
Abstract:
An application caching system and method are provided wherein one or more applications may be cached throughout a distributed computer network. The system may include a central cache directory server, one or more distributed master application servers and one or more distributed application cache servers. The system may permit a service, such as a search, to be provided to the user more quickly.
Abstract:
Techniques are provided which improve deal and advertisement targeting of users. Methods and systems may detect if an email contains deal information related to one or more deals. If an email contains deal information, the deal information may be extracted. If the user clicks on a link in the email, one or more additional deals which may be similar or related to the one or more deals received in the email may be selected based at least in part on the extracted deal information. The additional deals and/or advertisements related to the additional deals may be targeted to the user via email or via the user's browser application.
Abstract:
Techniques are provided which improve deal and advertisement targeting of users, and which may include facilitating user comparison of deals. Methods and systems may detect if an email contains deal information related to one or more deals. If an email contains deal information, the deal information may be extracted. When the email is opened by the user, a link may be displayed on top of (e.g., overlaid on) the email. The link may be configured such that clicking on the link transmits a search query comprising the extracted deal information to a deal service. The deal service may retrieve one or more additional deals which may be similar or related to the one or more deals received in the email. The additional deals may be selected by the deal service based at least in part on the extracted deal information.
Abstract:
Techniques are provided for advertiser bid forecasting in online advertising, including display advertising. Methods are provided in which key targeting-related user segments are determined from bidding statistics. A feature set is extracted from an impression opportunity, based at least in part on the bidding statistics. A gradient boosting descent tree technique is utilized in determining an initial bid forecasting result. A linear regression-based model is used in post-tuning to arrive at a post-tuned result. For short-term forecasting, this may be the final result. For long-term forecasting, a hybrid approach may be utilized with further processing including utilization of a publisher-specific model.
Abstract:
A method for sharing content with a user includes receiving from a user a first set of keywords for annotating an annotated user; receiving from the user a second set of keywords that designate whether annotated content annotated by at least one keyword included in the second set of keywords may be shared with the annotated user; storing in a data store a first association of the first set of keywords with the annotated user, and a second association of the second set of keywords with the annotated user; receiving a keyword selection for a select keyword and an identifier for the annotated user; and displaying on the client system content annotated by the select keyword if the annotated user is annotated by at least one keyword in the first set of keywords, and if the select keyword is included in the second set of keywords.
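The display condition at the end of this method can be sketched as a simple lookup. The data layout here is an assumption: an in-memory `store` maps each annotated user to the two keyword sets, and `content_index` maps a keyword to the content annotated with it. `Annotation` and `visible_content` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    # First set: keywords the owner used to annotate this user.
    user_keywords: set[str] = field(default_factory=set)
    # Second set: keywords whose annotated content may be shared with this user.
    shareable_keywords: set[str] = field(default_factory=set)

def visible_content(
    store: dict[str, Annotation],
    content_index: dict[str, list[str]],
    annotated_user: str,
    select_keyword: str,
) -> list[str]:
    """Return content to display for the select keyword, or an empty list."""
    annotation = store.get(annotated_user)
    # The user must be annotated by at least one keyword in the first set.
    if annotation is None or not annotation.user_keywords:
        return []
    # The select keyword must be included in the second set.
    if select_keyword not in annotation.shareable_keywords:
        return []
    return sorted(content_index.get(select_keyword, []))
```

Both conditions must hold: content tagged with a keyword outside the second set is never returned, even for a user who carries annotations.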
Abstract:
The present invention provides a method, system and computer program for naming a cluster, or a hierarchy of clusters, of words and phrases that have been extracted from a set of documents. The invention takes these clusters as the input and generates appropriate labels for the clusters using a lexical database. Naming involves first finding out all possible word senses for all the words in the cluster, using the lexical database; and then augmenting each word sense with words that are semantically similar to that word sense to form respective definition vectors. Thereafter, word sense disambiguation is done to find out the most relevant sense for each word. Definition vectors are clustered into groups. Each group represents a concept. These concepts are thereafter ranked based on their support. Finally, a pre-specified number of words and phrases from the definition vectors of the dominant concepts are selected as labels, based on their generality in the lexical database. Therefore, the labels may not necessarily consist of the original words in the cluster. A hierarchy of clusters is named in a recursive fashion starting from leaf clusters. Dominant concepts in child clusters are propagated into their parent to reduce the labeling complexity of parent clusters.
Abstract:
A method, apparatus, and article of manufacture employ lexicon reduction, using key characters and a neural network, to recognize a line of cursive text. Unambiguous parts of a cursive image, referred to as “key characters,” are identified. If the level of confidence that a segment of a line of cursive text is a particular character is higher than a threshold, and is also sufficiently higher than the level of confidence of neighboring segments, then the character is designated as a key character candidate. Key character candidates are then screened using geometric information. The key character candidates that pass the screening are designated key characters. Two stages of lexicon reduction are employed. The first stage of lexicon reduction uses a neural network to estimate a lower bound and an upper bound of the number of characters in a line of cursive text. Lexicon entries having a total number of characters outside of the bounds are eliminated. For the second stage of lexicon reduction, the lexicon is further reduced by comparing character strings, using the key characters, with lexicon entries. For each of the key characters in the character strings, it is determined whether there is a mismatch between the key character and characters in a corresponding search range in the lexicon entry. If the number of mismatches for all of the key characters in a search string is greater than (1+(the number of key characters in the search string/4)), then the lexicon entry is eliminated. Accordingly, the invention advantageously accomplishes lexicon reduction, thereby decreasing the time required to recognize a line of cursive text, without reducing accuracy.
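The two reduction stages can be sketched as two filters. Several details are assumptions, not taken from the abstract: the length bounds are passed in directly instead of being estimated by a neural network, each key character carries an estimated position, the search range is that position plus or minus a hypothetical `window`, and the division by 4 in the threshold is treated as real-valued.

```python
def reduce_by_length(lexicon: list[str], lower: int, upper: int) -> list[str]:
    """Stage 1: keep entries whose length lies within the estimated bounds."""
    return [entry for entry in lexicon if lower <= len(entry) <= upper]

def count_mismatches(entry: str, key_chars: list[tuple[str, int]], window: int = 1) -> int:
    """Count key characters absent from their search range in the entry."""
    mismatches = 0
    for char, position in key_chars:
        start = max(0, position - window)
        search_range = entry[start : position + window + 1]
        if char not in search_range:
            mismatches += 1
    return mismatches

def reduce_by_key_chars(
    lexicon: list[str], key_chars: list[tuple[str, int]], window: int = 1
) -> list[str]:
    """Stage 2: eliminate entries with more than 1 + n/4 mismatches."""
    limit = 1 + len(key_chars) / 4  # real-valued division is an assumption
    return [
        entry for entry in lexicon
        if count_mismatches(entry, key_chars, window) <= limit
    ]
```

With four key characters the limit is 2, so an entry survives up to two mismatches. That tolerance lets the filter absorb a few misidentified key characters without discarding the correct entry.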
Abstract:
A multi-stage multi-network character recognition system decomposes the estimation of a posteriori probabilities into coarse-to-fine stages. Classification is then based on the estimated a posteriori probabilities. This classification process is especially suitable for tasks that involve a large number of categories. The multi-network system is implemented in two stages: a soft pre-classifier and a bank of multiple specialized networks. The pre-classifier performs coarse evaluation of the input character, developing different probabilities that the input character falls into different predefined character groups. The bank of specialized networks, each corresponding to a single group of characters, performs fine evaluation of the input character, where each develops different probabilities that the input character represents each character in that specialized network's respective predefined character group. A network selector is employed to increase the system's efficiency by selectively invoking certain specialized networks, selected using a combination of prior external information and outputs of the pre-classifier. Relative to known single-network or one-stage multiple-network recognition systems, the invention provides improved recognition accuracy, confidence measure, speed, and flexibility.
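The coarse-to-fine decomposition can be sketched as a product of a group probability and a within-group probability. In this minimal example the networks are stand-in callables returning fixed probabilities, and the selector is an assumption: it skips groups excluded by prior information (`allowed_groups`) or scored below a hypothetical `min_group_prob` by the pre-classifier.

```python
from typing import Callable, Optional

Specialist = Callable[[object], dict[str, float]]

def classify(
    group_probs: dict[str, float],
    specialists: dict[str, Specialist],
    x: object,
    allowed_groups: Optional[set[str]] = None,
    min_group_prob: float = 0.1,
) -> tuple[Optional[str], dict[str, float]]:
    """Estimate P(char | x) as P(group | x) * P(char | group, x)."""
    posterior: dict[str, float] = {}
    for group, p_group in group_probs.items():
        # Network selector: use prior information and pre-classifier output.
        if allowed_groups is not None and group not in allowed_groups:
            continue
        if p_group < min_group_prob:
            continue
        # Invoke only the selected specialized network.
        for char, p_char in specialists[group](x).items():
            posterior[char] = posterior.get(char, 0.0) + p_group * p_char
    if not posterior:
        return None, posterior
    best = max(posterior, key=lambda c: posterior[c])
    return best, posterior
```

Skipping unlikely groups means most specialized networks are never evaluated for a given input, which is where the efficiency gain in this sketch comes from.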