Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for associating resources with entities. One of the methods includes clustering a plurality of first documents into one or more first document groups, wherein each of the one or more first document groups is associated with a proper name of an author; receiving a query that specifies a particular proper name of a particular author; generating a result list of one or more documents that satisfy the query, the documents being listed in order of rank; ranking the one or more first document groups based on the one or more documents that satisfy the query; and providing the one or more first document groups, wherein the one or more first document groups are presented in an order based on the ranking.
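The group-ranking step can be illustrated with a minimal sketch. The abstract does not say how a group's rank is derived from the ranked result list, so the reciprocal-rank scoring here is an assumption, as are the names `rank_groups`, `groups`, and `ranked_results`.

```python
from collections import defaultdict


def rank_groups(groups, ranked_results):
    """Order author document groups using a ranked result list.

    groups: dict mapping a group id to the set of document ids in that group.
    ranked_results: list of document ids that satisfy the query, best first.
    """
    # Reverse index: which group does each document belong to?
    doc_to_group = {doc: gid for gid, docs in groups.items() for doc in docs}

    scores = defaultdict(float)
    for position, doc in enumerate(ranked_results, start=1):
        gid = doc_to_group.get(doc)
        if gid is not None:
            # Assumed scoring rule (not from the abstract): reciprocal rank,
            # so documents near the top of the result list contribute more.
            scores[gid] += 1.0 / position

    # Groups with no matching documents keep a score of 0 and sort last.
    return sorted(groups, key=lambda gid: scores[gid], reverse=True)
```

A group whose documents appear near the top of the result list is presented first; any other monotone aggregation of the documents' ranks would fit the abstract equally well.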
Abstract:
Provided are a method and system for indexing documents in a collection of linked documents. A link log, including one or more pairings of source documents and target documents, is accessed. A sorted anchor map, containing one or more target document to source document pairings, is generated. The pairings in the sorted anchor map are ordered based on target document identifiers.
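As a minimal sketch of the inversion described above, assume the link log is a list of `(source_id, target_id)` pairs with integer identifiers. The function name `build_sorted_anchor_map` and the tie-breaking by source identifier are assumptions; the abstract only requires ordering by target document identifier.

```python
def build_sorted_anchor_map(link_log):
    """Invert a link log into target-to-source pairings sorted by target id.

    link_log: iterable of (source_id, target_id) pairs.
    Returns a list of (target_id, source_id) pairs.
    """
    # Swap each pairing so the target document comes first.
    anchor_map = [(target, source) for source, target in link_log]
    # Tuples sort by target id first; sorting ties by source id is an
    # assumption made here only to keep the output deterministic.
    anchor_map.sort()
    return anchor_map
```

Ordering by target keeps all pairings that point at one document adjacent, so they can be read in a single sequential pass.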
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for associating resources based on resource associations. One of the methods includes receiving a first profile, wherein the first profile is for a first author, wherein the first profile links to one or more first documents, wherein the first author is an author of each of the one or more first documents; identifying one or more second authors, wherein each of the one or more second authors is a co-author of one or more of the first documents; calculating respective co-author scores for each of the one or more second authors; ranking the one or more second authors based on their respective co-author scores; and associating the one or more second authors with the first profile, wherein the first profile includes a listing of the one or more second authors in an order according to the ranking.
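The co-author ranking can be sketched as follows. The abstract does not define the co-author score, so this example assumes it is simply the number of the profile's documents that a co-author shares; `rank_coauthors` and its parameters are hypothetical names.

```python
from collections import Counter


def rank_coauthors(first_author, documents):
    """Rank the co-authors of a profile's documents.

    first_author: name of the profile's author.
    documents: list of author-name lists, one per document linked from
        the profile.
    """
    counts = Counter()
    for authors in documents:
        # set() guards against a name being listed twice on one document.
        for name in set(authors):
            if name != first_author:
                # Assumed score (not from the abstract): shared-document count.
                counts[name] += 1
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(counts, key=lambda name: (-counts[name], name))
```

The returned list is the ordered co-author listing that would be attached to the first profile.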
Abstract:
A system and method identify a primary version out of different versions of the same document. The system selects a priority of authority for each document version based on a priority rule and information associated with the document version, and selects a primary version based on the priority of authority and information associated with the document version.
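A minimal sketch of the two-stage selection follows. The priority rule, the `source_type` and `length` fields, and the use of length as the tie-breaking information are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical priority rule: higher numbers mean more authoritative sources.
PRIORITY_RULE = {"publisher": 3, "repository": 2, "personal_page": 1}


def select_primary(versions):
    """Pick the primary version among versions of the same document.

    versions: non-empty list of dicts with "source_type" and "length" keys.
    """
    def authority_key(version):
        # Stage 1: priority of authority from the rule (0 if unknown source).
        priority = PRIORITY_RULE.get(version["source_type"], 0)
        # Stage 2: assumed tie-breaker, preferring the longer version.
        return (priority, version["length"])

    return max(versions, key=authority_key)
```

Keeping the rule in a separate table means the priority of authority can change without touching the selection logic.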
Abstract:
Systems and methods are provided for setting a respective reuse flag for a corresponding document in a plurality of documents based on a query-independent score associated with the corresponding document. A document crawling operation is performed on the plurality of documents in accordance with the reuse flag for respective documents in the plurality of documents. This document crawling operation includes reusing a previously downloaded version of a respective document in the plurality of documents instead of downloading a current version of the respective document from a host computer in accordance with a determination that the reuse flag associated with the respective document meets a predefined criterion.
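The reuse decision can be sketched as two small functions. The abstract does not say how the query-independent score maps to the flag, so the threshold rule (reuse documents scoring below the threshold) is an assumption, as are the names `set_reuse_flags`, `crawl`, `cache`, and `fetch`.

```python
def set_reuse_flags(scores, threshold):
    """Derive a reuse flag per document from its query-independent score.

    Assumed rule (not from the abstract): documents scoring below the
    threshold are flagged for reuse of their cached copy.
    """
    return {url: score < threshold for url, score in scores.items()}


def crawl(urls, reuse_flags, cache, fetch):
    """Crawl documents, reusing cached copies where the flag allows it.

    cache: dict of previously downloaded versions, keyed by URL.
    fetch: callable that downloads the current version of a URL.
    """
    pages = {}
    for url in urls:
        if reuse_flags.get(url, False) and url in cache:
            # Flag meets the criterion: reuse the earlier download.
            pages[url] = cache[url]
        else:
            # Otherwise download the current version from the host.
            pages[url] = fetch(url)
    return pages
```

Passing `fetch` in as a parameter keeps the sketch free of network code and makes the reuse path easy to check in isolation.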
Abstract:
The disclosed implementations provide a method of searching for a known item. The method includes receiving a lookup request for the known item. The lookup request includes information identifying estimated values for a plurality of attributes of the known item. In accordance with the received lookup request, two or more estimated attribute-value pairs for the known item are determined. A plurality of queries corresponding to the estimated attribute-value pairs are then formulated in accordance with a plurality of predefined query types, each query having a corresponding position in a query type hierarchy. One or more candidate items are identified by executing one or more of the plurality of queries in accordance with the query type hierarchy. At least one of the candidate items is returned in response to the lookup request for the known item.
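One way to picture the query type hierarchy is as a ladder from strict to relaxed matching. In this minimal sketch the "query type" is assumed to be the number of estimated attributes a query requires to match, and queries run from most to least specific until one returns candidates. The names `formulate_queries`, `lookup`, and `catalog` are hypothetical.

```python
from itertools import combinations


def formulate_queries(pairs):
    """Build queries from estimated attribute-value pairs.

    Returns a list of dicts ordered from most specific (all attributes)
    to least specific (a single attribute).
    """
    items = sorted(pairs.items(), key=lambda kv: kv[0])
    queries = []
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            queries.append(dict(combo))
    return queries


def lookup(pairs, catalog):
    """Return candidates from the first query in the hierarchy that matches."""
    for query in formulate_queries(pairs):
        matches = [
            item for item in catalog
            if all(item.get(attr) == value for attr, value in query.items())
        ]
        if matches:
            return matches
    return []
```

Because the estimated values may be wrong, falling back to less specific queries lets a lookup with one bad attribute still find the item.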
Abstract:
A system customizes a news document associated with a user of a news aggregation service. The system includes multiple news source servers that store news content and a remote news aggregation server. The news aggregation server creates a customized news document based on one or more personalized search queries received from a user. The news aggregation server fetches the news content from the multiple news source servers, aggregates the news content, and searches the aggregated news content based on the one or more personalized search queries. The news aggregation server provides selected news content to the customized news document based on results of the search.
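The fetch, aggregate, and search steps can be sketched in a few lines. This example assumes the news content has already been fetched into an in-memory dict and that a personalized query matches an article by case-insensitive substring; both are simplifications, and `build_custom_document` is a hypothetical name.

```python
def build_custom_document(sources, queries):
    """Assemble a customized news document from several sources.

    sources: dict mapping a source name to a list of article dicts, each
        with "title" and "text" keys (stands in for fetched content).
    queries: list of the user's personalized search query strings.
    Returns a dict mapping each query to the matching article titles.
    """
    # Aggregate the content from all sources into one collection.
    aggregated = [
        article for articles in sources.values() for article in articles
    ]

    document = {}
    for query in queries:
        needle = query.lower()
        # Assumed matching rule: case-insensitive substring search.
        document[query] = [
            article["title"] for article in aggregated
            if needle in article["text"].lower()
        ]
    return document
```

Each query becomes one section of the customized document, populated with whatever the search over the aggregated content returns.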