Abstract:
Disclosed herein is parallel processing of a query, which uses intra-query parallelism in posting list intersections. A plurality of tasks, e.g., posting list intersection tasks, are identified for processing in parallel by a plurality of processing units, e.g., a plurality of processing cores of a multi-core system.
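As an illustration only, the following Python sketch shows one way a single posting list intersection could be split into independent tasks and handed to a pool of workers. The abstract does not say how tasks are identified; the chunking scheme, the function names (`intersect_sorted`, `make_tasks`, `parallel_intersect`), and the assumption that posting lists are sorted lists of unique document IDs are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


def intersect_sorted(a, b):
    """Merge-style intersection of two sorted lists of document IDs."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def make_tasks(short, long_, num_tasks):
    """Assumed task identification: cut the shorter list into chunks and
    pair each chunk with the slice of the longer list covering its ID range."""
    if not short:
        return []
    size = max(1, -(-len(short) // num_tasks))  # ceiling division
    tasks = []
    for start in range(0, len(short), size):
        chunk = short[start:start + size]
        lo = bisect_left(long_, chunk[0])
        hi = bisect_right(long_, chunk[-1])
        tasks.append((chunk, long_[lo:hi]))
    return tasks


def parallel_intersect(list_a, list_b, num_workers=4):
    """Intersect two posting lists by running the tasks on a worker pool."""
    short, long_ = sorted((list_a, list_b), key=len)
    tasks = make_tasks(short, long_, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = pool.map(lambda task: intersect_sorted(*task), tasks)
        # Chunks cover increasing ID ranges, so concatenation stays sorted.
        return [doc_id for part in parts for doc_id in part]
```

A thread pool is used here only to keep the sketch short. In CPython, threads do not run CPU-bound code on several cores at once, so a process pool or a language with native threads would be needed to use the cores of a multi-core system.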
Abstract:
A search engine for finding objects that correspond to a search request includes an input module for receiving a keyword query from a user and a search module configured to map the keyword query to the identifiers of objects that semantically match the keyword or keywords contained in the query, and to generate a search result that lists the matching object identifiers. The search module is further configured to take network layer information about the user into account when mapping the keyword query to identifiers of matching objects, wherein the network layer information includes sophisticated information that the search module receives from a dedicated entity.
Abstract:
A method and system for quantifying the quality of search results from a search engine based on cohesion. The method and system include modeling a set of search engine search results as a cluster and measuring the cohesion of the cluster. In an embodiment, the cohesion of the cluster is the average similarity between the cluster elements and a centroid vector. The centroid vector is the average of the weights of the vectors of the cluster. The similarity between the centroid vector and each of the cluster's elements is measured by cosine similarity. Each document in the set of search results is represented by a vector in which each cell represents a stemmed word. Each cell value is the frequency of the corresponding stemmed word in the document multiplied by a weight that takes into account the location of the stemmed word within the document.
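The cohesion measure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method: the location weights in `LOCATION_WEIGHTS` are made-up values, words are assumed to be stemmed already, and applying the location weight per occurrence is only one reading of "frequency multiplied by a weight".

```python
import math
from collections import defaultdict

# Hypothetical location weights; the abstract gives no concrete values.
LOCATION_WEIGHTS = {"title": 2.0, "body": 1.0}


def document_vector(stems_by_location):
    """Sparse vector: stem -> sum over occurrences of the location weight."""
    vec = defaultdict(float)
    for location, stems in stems_by_location.items():
        weight = LOCATION_WEIGHTS.get(location, 1.0)
        for stem in stems:
            vec[stem] += weight
    return dict(vec)


def centroid(vectors):
    """Cell-wise average of the document vectors in the cluster."""
    total = defaultdict(float)
    for vec in vectors:
        for stem, value in vec.items():
            total[stem] += value
    return {stem: value / len(vectors) for stem, value in total.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(value * v.get(stem, 0.0) for stem, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cohesion(vectors):
    """Average cosine similarity between each document and the centroid."""
    if not vectors:
        return 0.0
    c = centroid(vectors)
    return sum(cosine(vec, c) for vec in vectors) / len(vectors)
```

Under this sketch, a result set whose documents share most of their vocabulary scores close to 1, while a result set of unrelated documents scores lower.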
Abstract:
A method of caching posting lists to a search engine cache calculates the ratios between the frequencies of the query terms in a past query log and the sizes of the posting lists for each term, and uses these ratios to determine which posting lists should be cached by sorting the ratios in decreasing order and storing to the cache those posting lists corresponding to the highest ratio values. Further, a method of finding an optimal allocation between two parts of a search engine cache evaluates a past query stream based on a relationship between various properties of the stream and the total size of the cache, and uses this information to determine the respective sizes of both parts of the cache.
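The ratio-based selection in the first part of this abstract could look roughly like the sketch below. The function name `select_posting_lists`, the whitespace tokenization of the query log, and the choice to skip a posting list that does not fit in the remaining space (instead of stopping) are assumptions made for illustration.

```python
from collections import Counter


def select_posting_lists(query_log, posting_sizes, cache_capacity):
    """Return the terms whose posting lists would be cached.

    query_log      -- iterable of past query strings
    posting_sizes  -- dict mapping term -> posting list size
    cache_capacity -- total cache size, in the same unit as posting_sizes
    """
    # Frequency of each term in the past query log.
    freq = Counter(term for query in query_log for term in query.split())

    # Sort candidate terms by frequency/size ratio, highest first.
    candidates = [t for t in freq if posting_sizes.get(t, 0) > 0]
    candidates.sort(key=lambda t: freq[t] / posting_sizes[t], reverse=True)

    cached, used = [], 0
    for term in candidates:
        size = posting_sizes[term]
        if used + size <= cache_capacity:  # assumption: skip lists that do not fit
            cached.append(term)
            used += size
    return cached
```

For example, with the log `["a b", "a c", "a b"]`, sizes `{"a": 100, "b": 10, "c": 20}`, and a capacity of 40, the ratios are 0.03, 0.2, and 0.05, so the lists for `b` and `c` are selected and the list for `a` is left out.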
Abstract:
A method of caching the results of a search engine query divides a search engine cache into two parts, controlled and uncontrolled, and determines, through an admission policy, to which part the query results should be cached. In one implementation, the admission policy estimates whether a query is likely to be frequent or infrequent in the future by analyzing various features of the query.
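A minimal sketch of such a two-part result cache is given below. Several details are assumptions not stated in the abstract: each part is managed with LRU eviction, queries predicted to be frequent go to the controlled part, and the admission policy `likely_frequent` uses only two hypothetical features (number of terms and character length).

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache used for both parts (an assumed eviction policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def likely_frequent(query, max_terms=3, max_chars=20):
    """Hypothetical admission policy: guess that short queries will recur."""
    return len(query.split()) <= max_terms and len(query) <= max_chars


class TwoPartCache:
    """Result cache split into a controlled and an uncontrolled part."""

    def __init__(self, controlled_size, uncontrolled_size, policy=likely_frequent):
        self.controlled = LRUCache(controlled_size)
        self.uncontrolled = LRUCache(uncontrolled_size)
        self.policy = policy

    def get(self, query):
        hit = self.controlled.get(query)
        return hit if hit is not None else self.uncontrolled.get(query)

    def put(self, query, results):
        # The admission policy decides which part stores the results.
        part = self.controlled if self.policy(query) else self.uncontrolled
        part.put(query, results)
```

Keeping the policy as a plain function makes it easy to swap in a different predictor, for example one based on past query frequency, without touching the cache itself.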
Abstract:
In a system for storing and retrieving a plurality of records, the plurality of records associated with a ledger, a client issues read and write requests associated with one of the plurality of records, a plurality of record servers responds to the requests received from the client, and a management server maintains and coordinates, between the client and the record servers, information associated with the ledger, records, and record servers.
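The three roles named in this abstract can be pictured with the in-memory sketch below. It is purely illustrative: the class and method names, the rule that every record is written to all record servers assigned to a ledger, and the sequential record numbering kept by the management server are assumptions that the abstract does not specify.

```python
class RecordServer:
    """Stores records and answers read and write requests."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # (ledger_id, record_id) -> data

    def write(self, ledger_id, record_id, data):
        self.records[(ledger_id, record_id)] = data

    def read(self, ledger_id, record_id):
        return self.records.get((ledger_id, record_id))


class ManagementServer:
    """Keeps ledger metadata: assigned record servers and last record ID."""

    def __init__(self):
        self.ledgers = {}

    def create_ledger(self, ledger_id, servers):
        self.ledgers[ledger_id] = {"servers": list(servers), "last_record": -1}

    def servers_for(self, ledger_id):
        return self.ledgers[ledger_id]["servers"]

    def next_record_id(self, ledger_id):
        self.ledgers[ledger_id]["last_record"] += 1
        return self.ledgers[ledger_id]["last_record"]


class Client:
    """Issues read and write requests for records of a ledger."""

    def __init__(self, manager):
        self.manager = manager

    def write(self, ledger_id, data):
        record_id = self.manager.next_record_id(ledger_id)
        # Assumption: replicate the record to every assigned record server.
        for server in self.manager.servers_for(ledger_id):
            server.write(ledger_id, record_id, data)
        return record_id

    def read(self, ledger_id, record_id):
        # Return the first copy found on any assigned record server.
        for server in self.manager.servers_for(ledger_id):
            data = server.read(ledger_id, record_id)
            if data is not None:
                return data
        return None
```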