摘要:
A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score.
摘要:
A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score.
摘要:
A system may determine a measure of how a content of a document changes over time, generate a score for the document based, at least in part, on the measure of how the content of the document changes over time, and rank the document with regard to at least one other document based, at least in part, on the score.
摘要:
Provided is a method and system for indexing documents in a collection of linked documents. A link log, including one or more pairings of source documents and target documents is accessed. A sorted anchor map, containing one or more target document to source document pairings, is generated. The pairings in the sorted anchor map are ordered based on target document identifiers.
摘要:
Provided is a method and system for indexing documents in a collection of linked documents. A link log, including one or more pairings of source documents and target documents is accessed. A sorted anchor map, containing one or more target document to source document pairings, is generated. The pairings in the sorted anchor map are ordered based on target document identifiers.
摘要:
Provided is a method and system for indexing documents in a collection of linked documents. A link log, including one or more pairings of source documents and target documents is accessed. A sorted anchor map, containing one or more target document to source document pairings, is generated. The pairings in the sorted anchor map are ordered based on target document identifiers.
摘要:
Provided is a method and system for indexing documents in a collection of linked documents. A link log, including one or more pairings of source documents and target documents is accessed. A sorted anchor map, containing one or more target document to source document pairings, is generated. The pairings in the sorted anchor map are ordered based on target document identifiers.
摘要:
A system provides client access to customized news content. The system includes a custom news source server and a news search server. The custom news source server periodically sends one or more customized search queries to a news search server. The news search server fetches news content from multiple news source servers and aggregates the news content. The news search server also periodically receives the one or more search queries from the custom news source server, searches the aggregated news content based on the one or more search queries, and periodically provides selected news content to the custom news server based on results of the searches. The custom news source server permits access to clients, from across a network, to the selected news content provided by the news search server.
摘要:
A system of reducing the possibility of crawling duplicate document identifiers partitions a plurality of document identifiers into multiple clusters, each cluster having a cluster name and a set of document parameters. The system generates an equivalence rule for each cluster of document identifiers, the rule specifying which document parameters associated with the cluster are content-relevant. Next, the system groups each cluster of document identifiers into one or more equivalence classes in accordance with its associated equivalence rule, each equivalence class including one or more document identifiers that correspond to a document content and having a representative document identifier identifying the document content.
摘要:
A method of distributing files operates in a system having a master and a plurality of slaves, interconnected by a communications network. Each slave determines a current file length for each of a plurality of files and sends slave status information to the master, the slave status information including the current file length for each file. The master schedules copy operations based on the slave status information. The master stores bandwidth capability information indicating data transmission bandwidth capabilities for the resources required to transmit data between the slaves, and also stores bandwidth usage information indicating a total allocated bandwidth for each resource. For each schedule copy operation, an amount of data transmission bandwidth is allocated and the stored bandwidth usage information is updated accordingly. The master only schedules copy operations that do not cause the total allocated bandwidth of any resource to exceed the bandwidth capability of that resource.