Abstract:
Systems and methods for enhanced document retrieval are described. In one aspect, a search query from an end-user is received. Responsive to receiving the search query, search results are retrieved. The search results include an enhanced document and a set of non-enhanced documents. The enhanced document and the non-enhanced documents include term(s) of the search query. The enhanced document is derived from a base document. The base document was modified with metadata mined from one or more different documents. The metadata is associated with one or more respective references to the base document. The one or more different documents are independent of the base document.
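The enhancement step can be illustrated with a minimal sketch. It assumes that the "metadata" is simply any sentence in another document that mentions the base document's identifier, and that matching is plain substring search; the `Document`, `enhance`, and `search` names are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Snippets mined from other, independent documents (assumed representation).
    metadata: list = field(default_factory=list)

    def searchable_text(self):
        # Index the original text together with any mined metadata.
        return " ".join([self.text, *self.metadata]).lower()


def enhance(base, others):
    """Return a copy of `base` carrying sentences from `others` that reference it."""
    mined = []
    for other in others:
        for sentence in other.text.split("."):
            if base.doc_id.lower() in sentence.lower():
                mined.append(sentence.strip())
    return Document(base.doc_id, base.text, base.metadata + mined)


def search(query, index):
    """Return ids of documents whose searchable text contains every query term."""
    terms = query.lower().split()
    return [d.doc_id for d in index if all(t in d.searchable_text() for t in terms)]
```

In this sketch, a base document that never uses a query term can still be retrieved once a referring document that does use the term has contributed metadata to it.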
Abstract:
Systems and methods for verifying relevance between terms and Web site contents are described. In one aspect, site contents from a bid URL are retrieved. Expanded term(s) semantically and/or contextually related to bid term(s) are calculated. Content similarity and expanded similarity measurements are calculated from respective combinations of the bid term(s), the site contents, and the expanded terms. Category similarity measurements between the expanded terms and the site contents are determined in view of a trained similarity classifier. The similarity classifier has been trained from mined Web site content associated with directory data. A confidence value providing an objective measure of relevance between the bid term(s) and the site contents is determined by evaluating the content, expanded, and category similarity measurements in view of a trained relevance classifier model.
Abstract:
A method and system for adapting search results of a query to the information needs of the user submitting the query is provided. A search system analyzes click-through triplets indicating that a user submitted a query and that the user selected a document from the results of the query. To overcome the large size and sparseness of the click-through data, the search system, when presented with an input triplet comprising a user, a query, and a document, determines a probability that the user will find the input document important by smoothing the click-through triplets. The search system then orders documents of the result based on the probability of their importance to the input user.
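One way to picture the smoothing step is linear interpolation between a user-specific estimate and coarser estimates that ignore the user or the query. The abstract does not say which smoothing technique is used, so the interpolation scheme, the weights, and the `ClickModel` class below are all assumptions made for illustration.

```python
from collections import Counter


class ClickModel:
    """Interpolated estimate of P(doc | user, query) from click-through triplets."""

    def __init__(self, triplets, weights=(0.6, 0.3, 0.1)):
        triplets = list(triplets)
        self.weights = weights  # assumed interpolation weights, summing to 1
        self.uqd = Counter(triplets)
        self.uq = Counter((u, q) for u, q, _ in triplets)
        self.qd = Counter((q, d) for _, q, d in triplets)
        self.q = Counter(q for _, q, _ in triplets)
        self.d = Counter(d for _, _, d in triplets)
        self.total = len(triplets)

    @staticmethod
    def _ratio(num, den):
        return num / den if den else 0.0

    def probability(self, user, query, doc):
        w_uq, w_q, w_0 = self.weights
        return (
            # Most specific: this user's clicks for this query.
            w_uq * self._ratio(self.uqd[(user, query, doc)], self.uq[(user, query)])
            # Back off to all users' clicks for this query.
            + w_q * self._ratio(self.qd[(query, doc)], self.q[query])
            # Back off to overall document popularity.
            + w_0 * self._ratio(self.d[doc], self.total)
        )

    def rank(self, user, query, docs):
        return sorted(docs, key=lambda d: self.probability(user, query, d), reverse=True)
```

Because the coarser terms are never zero for a document that anyone has clicked, a user with no history for a query still receives a usable ordering, which is the sparseness problem the abstract refers to.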
Abstract:
A method and system for determining similarity between items is provided. To calculate similarity scores for pairs of items, the similarity system initializes a similarity score for each pair of objects and each pair of features. The similarity system then iteratively calculates the similarity scores for each pair of objects based on the similarity scores of the pairs of features calculated during a previous iteration and calculates the similarity scores for each pair of features based on the similarity scores of the pairs of objects calculated during a previous iteration. The similarity system implements an algorithm that is based on a recursive definition of the similarities between objects and between features. The similarity system continues the iterations of recalculating the similarity scores until the similarity scores converge on a solution.
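The mutual recursion between object similarity and feature similarity can be sketched as follows. The abstract does not give the update formula, so this example substitutes a SimRank-style averaging rule with a decay factor; the `mutual_similarity` name, the `decay` and `tol` parameters, and the identity initialization are assumptions.

```python
def mutual_similarity(has_feature, decay=0.8, tol=1e-6, max_iter=100):
    """has_feature maps each object to the set of features it has."""
    objects = sorted(has_feature)
    features = sorted({f for fs in has_feature.values() for f in fs})
    owners = {f: {o for o in objects if f in has_feature[o]} for f in features}

    # Initialize: an item is fully similar to itself, all other pairs start at 0.
    sim_o = {(a, b): float(a == b) for a in objects for b in objects}
    sim_f = {(f, g): float(f == g) for f in features for g in features}

    def update(items, neighbours, other_sim):
        # Similarity of two items = decayed average similarity of their neighbours.
        new = {}
        for a in items:
            for b in items:
                na, nb = neighbours[a], neighbours[b]
                if a == b:
                    new[(a, b)] = 1.0
                elif not na or not nb:
                    new[(a, b)] = 0.0
                else:
                    total = sum(other_sim[(x, y)] for x in na for y in nb)
                    new[(a, b)] = decay * total / (len(na) * len(nb))
        return new

    for _ in range(max_iter):
        # Both updates read only the scores from the previous iteration.
        new_o = update(objects, has_feature, sim_f)
        new_f = update(features, owners, sim_o)
        delta = max(
            max((abs(new_o[k] - sim_o[k]) for k in sim_o), default=0.0),
            max((abs(new_f[k] - sim_f[k]) for k in sim_f), default=0.0),
        )
        sim_o, sim_f = new_o, new_f
        if delta < tol:  # scores have converged
            break
    return sim_o, sim_f
```

With a decay factor below 1 the updates contract, so the loop terminates on the convergence test rather than on `max_iter` for small inputs.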
Abstract:
The described systems, methods and data structures are directed to ranking Web pages with hierarchical considerations. The hierarchical structures and the linking relationships of the World Wide Web are used to provide a page importance ranking for Web searches. The linking relationships are aggregated to a high-level node at each of the hierarchical structures. A link graph analysis is performed on the aggregated linking relationships to determine the importance of each node. The importance of each node may be propagated to pages associated with that node. For each page, the importance of that page and the importance of the node associated with the page are used to calculate the page importance ranking.
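A rough sketch of the aggregate, analyze, and propagate steps follows. It assumes the high-level node is the URL's host, uses ordinary power-iteration PageRank as the link graph analysis, and combines page and node importance with a fixed `mix` weight; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def pagerank(links, nodes, damping=0.85, iterations=50):
    """Power-iteration PageRank over a weighted directed graph."""
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            out = links.get(src, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
            else:
                for dst, weight in out.items():
                    new[dst] += damping * rank[src] * weight / total
        rank = new
    return rank


def hierarchical_rank(page_links, mix=0.5):
    """page_links is a list of (source_url, target_url) pairs with absolute URLs."""
    pages = sorted({p for edge in page_links for p in edge})
    host = {p: urlparse(p).netloc for p in pages}
    hosts = sorted(set(host.values()))

    page_graph = defaultdict(lambda: defaultdict(float))
    host_graph = defaultdict(lambda: defaultdict(float))
    for src, dst in page_links:
        page_graph[src][dst] += 1.0
        if host[src] != host[dst]:
            # Aggregate cross-host page links up to the host-level node.
            host_graph[host[src]][host[dst]] += 1.0

    page_rank = pagerank(page_graph, pages)
    host_rank = pagerank(host_graph, hosts)

    members = defaultdict(list)
    for p in pages:
        members[host[p]].append(p)

    # Propagate each host's importance evenly to its pages, then blend.
    return {
        p: mix * page_rank[p] + (1.0 - mix) * host_rank[host[p]] / len(members[host[p]])
        for p in pages
    }
```

Splitting a host's importance evenly among its pages keeps the final scores summing to 1; a real system might instead weight pages by their depth in the site hierarchy.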
Abstract:
A method and system for detecting whether an outgoing communication contains confidential information or other target information is provided. The detection system is provided with a collection of documents that contain confidential information, referred to as “confidential documents.” When the detection system is provided with an outgoing communication, it compares the content of the outgoing communication to the content of the confidential documents. If the outgoing communication contains confidential information, then the detection system may prevent the outgoing communication from being sent outside the organization. The detection system detects confidential information based on the similarity between the content of an outgoing communication and the content of confidential documents that are known to contain confidential information.
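The comparison step might look like the sketch below, which measures how much of an outgoing message overlaps with any known confidential document. The abstract does not name a similarity measure; word-shingle containment, the shingle size `k`, the `threshold`, and the function names are assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Set of overlapping k-word sequences in the text, case-insensitive."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(message, document, k=3):
    """Fraction of the message's shingles that also occur in the document."""
    message_shingles = shingles(message, k)
    if not message_shingles:
        return 0.0
    return len(message_shingles & shingles(document, k)) / len(message_shingles)


def is_confidential(message, confidential_docs, threshold=0.3, k=3):
    """Flag the message if it is similar enough to any confidential document."""
    return any(containment(message, doc, k) >= threshold for doc in confidential_docs)
```

A caller would run `is_confidential` on each outgoing message and hold back any message it flags, which corresponds to the prevention step in the abstract.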
Abstract:
A method and system for ranking objects based on relationships with objects of a different object type is provided. The ranking system defines an equation for each attribute of each type of object. The equations define the attribute values and are based on relationships between the attribute and the attributes associated with the same type of object and different types of objects. The ranking system iteratively calculates the attribute values for the objects using the equations until the attribute values converge on a solution. The ranking system then ranks objects based on attribute values.
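As a hypothetical instance of such equations, the sketch below ranks papers and authors together. A paper's score depends on the papers citing it (same type) and on its authors (different type), while an author's score depends on the papers they wrote. The specific equations, the `damping` factor, and the equal 0.5 split between the two terms are assumptions, not the abstract's definitions.

```python
def rank_mixed(cites, wrote, damping=0.85, tol=1e-10, max_iter=1000):
    """cites: paper -> list of papers it cites; wrote: author -> list of papers."""
    papers = sorted(
        set(cites)
        | {p for targets in cites.values() for p in targets}
        | {p for written in wrote.values() for p in written}
    )
    authors = sorted(wrote)
    authors_of = {p: [a for a in authors if p in wrote[a]] for p in papers}

    paper = {p: 1.0 / len(papers) for p in papers}
    author = {a: 1.0 / len(authors) for a in authors}

    for _ in range(max_iter):
        # Paper equation: a base score, plus shares from citing papers
        # (same object type) and from the paper's authors (different type).
        new_paper = {p: (1.0 - damping) / len(papers) for p in papers}
        for src, targets in cites.items():
            for dst in targets:
                new_paper[dst] += damping * 0.5 * paper[src] / len(targets)
        for a in authors:
            for p in wrote[a]:
                new_paper[p] += damping * 0.5 * author[a] / len(wrote[a])

        # Author equation: each paper's score is split among its authors.
        new_author = {a: 0.0 for a in authors}
        for p in papers:
            for a in authors_of[p]:
                new_author[a] += paper[p] / len(authors_of[p])

        delta = max(
            max((abs(new_paper[p] - paper[p]) for p in papers), default=0.0),
            max((abs(new_author[a] - author[a]) for a in authors), default=0.0),
        )
        paper, author = new_paper, new_author
        if delta < tol:  # attribute values have converged
            break
    return paper, author
```

Sorting either returned dictionary by value gives the ranking for that object type.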
Abstract:
A peer-to-peer advertisement platform is provided to ubiquitously promote products or services supplied by advertisers across content-based applications executing on nodes in a peer-to-peer network. The peer-to-peer advertisement platform may include a registration component to register nodes in the platform, an advertisement submission component to receive advertisement data from the advertisers, and a distribution component to distribute the advertisement data to the registered nodes. The peer-to-peer advertisement platform also includes a money sharing component to reward each node based on the contribution level assigned to it. Accordingly, the peer-to-peer advertisement platform stores the advertisement data locally at the plurality of registered nodes and shares a portion of the revenue generated from the advertisement data with those nodes.
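The money sharing component could, for example, split a fixed fraction of revenue among nodes in proportion to their contribution levels. The abstract does not describe the formula, so the proportional split, the `node_share` fraction, and the `share_revenue` function are assumptions.

```python
def share_revenue(revenue, contribution_levels, node_share=0.5):
    """Split `node_share` of the revenue among nodes by contribution level."""
    pool = revenue * node_share  # portion of revenue reserved for nodes
    total = sum(contribution_levels.values())
    if total == 0:
        return {node: 0.0 for node in contribution_levels}
    return {
        node: pool * level / total
        for node, level in contribution_levels.items()
    }
```

For instance, with a revenue of 100.0 and contribution levels of 3 and 1, the two nodes would receive 37.5 and 12.5, and the platform would keep the remaining 50.0.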