Abstract:
The present invention provides a system and method for inferring information need in a collection of hypermedia documents, based on the observation that a user's hypertext link traversal decisions are typically driven by the nature of that user's information need. The system identifies the hypermedia linkage structure among the plurality of documents in the collection. The documents include content items that may be relevant to a user information need. The system then accepts a user path item that represents a user's hypermedia link traversal history and applies a network flow model to the user path item and the hypermedia link information in order to create a document vector. The system also determines the distribution of the content items in the document collection, and then compares the document vector to the content item distribution in order to determine an inferred information need.
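The abstract above does not specify the network flow model or the comparison step, so the following is only a minimal sketch under stated assumptions: the flow model is taken to be a simple spreading activation with a hypothetical `decay` parameter, the user path is a list of visited document indices, and the comparison is a weighted sum of per-document term weights. All names (`spread`, `infer_need`, `term_weights`) are illustrative and do not come from the source.

```python
def spread(links, source, steps=2, decay=0.5):
    """Assumed flow model: repeatedly push activation along links.

    links[i][j] is the strength of the link from document i to document j.
    source[i] is the activation injected at document i on every step.
    """
    n = len(source)
    activation = list(source)
    for _ in range(steps):
        flowed = [0.0] * n
        for i in range(n):
            for j in range(n):
                flowed[j] += decay * links[i][j] * activation[i]
        activation = [source[j] + flowed[j] for j in range(n)]
    return activation


def infer_need(links, path, term_weights, steps=2, decay=0.5):
    """Score each content item against the document vector built from a path."""
    source = [0.0] * len(links)
    for doc in path:                      # visited documents act as sources
        source[doc] += 1.0
    doc_vector = spread(links, source, steps, decay)
    return {
        term: sum(w * a for w, a in zip(weights, doc_vector))
        for term, weights in term_weights.items()
    }


# Hypothetical three-document collection linked 0 -> 1 -> 2.
links = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
term_weights = {"copier": [0.0, 1.0, 1.0], "careers": [1.0, 0.0, 0.0]}
scores = infer_need(links, path=[0, 1], term_weights=term_weights)
best = max(scores, key=scores.get)
```

In this toy collection the user's path leads toward the documents that mention "copier", so that content item receives the highest score.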
Abstract:
A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
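As an illustration of the vector representation, similarity, and clustering steps named above, here is a minimal sketch. It assumes sparse feature vectors stored as dictionaries, cosine similarity as the similarity measure, and a simple greedy threshold clustering; the abstract does not state which measure or clustering method is used, and the feature names and `threshold` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.5):
    """Greedy clustering: join the first cluster whose representative
    (its first member) is similar enough, otherwise start a new cluster."""
    clusters = []
    for name, vec in vectors.items():
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical multi-modal features: text terms and URL tokens share one space.
docs = {
    "a": {"text:printer": 1.0, "url:products": 1.0},
    "b": {"text:printer": 1.0, "url:products": 1.0, "text:toner": 1.0},
    "c": {"text:jobs": 1.0, "url:careers": 1.0},
}
groups = cluster(docs)
```

Because users are also represented as vectors, the same two functions could be applied unchanged to a dictionary of user vectors.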
Abstract:
The present invention also provides a system and method for predicting user traffic flow in a collection of hypermedia documents by determining the association strength of the hypermedia links. Hypermedia links are identified among a plurality of documents, where the documents include content items such as keywords that may or may not be relevant to a user information need. The distribution of the content items in the document collection is then determined. An information item is received as input, and is compared to the content items. In response to the comparison, association strengths are assigned to the hypermedia links. A network flow model uses the association strengths of the hypermedia links to predict user traffic flow in response to an initial condition.
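The pipeline described above (compare an information item to content items, assign association strengths to links, then run a flow model from an initial condition) might be sketched as follows. This is an assumption-laden illustration: the strength of a link is taken to be the destination document's total weight for the query terms, normalized over the source page's outgoing links, and the flow model simply moves all traffic forward one link per step. The function names and the sample site are hypothetical.

```python
def association_strengths(links, doc_terms, query):
    """Assign each link a strength in [0, 1] from query/content overlap.

    links: dict mapping a source document to its list of link targets.
    doc_terms: dict mapping a document to {term: weight}.
    query: set of terms standing in for the information item.
    """
    strengths = {}
    for src, targets in links.items():
        raw = {
            dst: sum(doc_terms[dst].get(term, 0.0) for term in query)
            for dst in targets
        }
        total = sum(raw.values())
        strengths[src] = {
            dst: (value / total if total else 0.0) for dst, value in raw.items()
        }
    return strengths


def predict_flow(strengths, initial, steps=2):
    """Propagate traffic along links and accumulate predicted visits."""
    flow = dict(initial)
    visits = dict(initial)
    for _ in range(steps):
        moved = {}
        for src, amount in flow.items():
            for dst, strength in strengths.get(src, {}).items():
                moved[dst] = moved.get(dst, 0.0) + amount * strength
        for dst, amount in moved.items():
            visits[dst] = visits.get(dst, 0.0) + amount
        flow = moved
    return visits


links = {"home": ["products", "jobs"], "products": ["copiers"]}
doc_terms = {
    "home": {},
    "products": {"copier": 3.0},
    "jobs": {"copier": 1.0},
    "copiers": {"copier": 5.0},
}
strengths = association_strengths(links, doc_terms, {"copier"})
visits = predict_flow(strengths, initial={"home": 100.0}, steps=2)
```

With 100 users starting at the hypothetical home page, three quarters of the traffic is predicted to follow the link whose destination best matches the information item.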
Abstract:
A method is presented for determining whether to prefetch and cache documents on a computer. In one embodiment, documents are prefetched and cached on a client computer from servers located on the Internet in accordance with their computed need probability. Those documents with a higher need probability are prefetched and cached before documents with lower need probabilities. The need probability for a document is computed using both a document context factor and a document history factor. The context factor of the need probability of a document is determined by computing the correlation between words in the document and a context Q of the operating environment. The history factor of the need probability of a document is determined by integrating both the recency of document use and the frequency of document use.
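The abstract does not give formulas for the two factors or say how they are combined, so the sketch below fills those gaps with labeled assumptions: the history factor is a sum of power-law-decayed past accesses (which rewards both recent and frequent use), the context factor is the fraction of context words that appear in the document, and the two are simply added. The `decay` parameter and all function names are hypothetical.

```python
def history_factor(access_times, now, decay=0.5):
    """Assumed form: each past access contributes (age ** -decay), so
    recent accesses count more and frequent accesses add up."""
    return sum((now - t) ** -decay for t in access_times if t < now)


def context_factor(doc_words, context_q):
    """Assumed form: share of the context Q's words found in the document."""
    context = set(context_q)
    if not context:
        return 0.0
    return len(set(doc_words) & context) / len(context)


def prefetch_order(docs, context_q, now):
    """Rank documents by an assumed additive need score, highest first."""
    score = {
        name: context_factor(d["words"], context_q)
        + history_factor(d["accesses"], now)
        for name, d in docs.items()
    }
    return sorted(score, key=score.get, reverse=True)


docs = {
    "recipes": {"words": ["soup"], "accesses": [1]},
    "news": {"words": ["stocks", "market"], "accesses": [8, 9]},
}
order = prefetch_order(docs, context_q=["market", "stocks"], now=10)
```

Documents earlier in `order` would be prefetched and cached first; here the recently and repeatedly used document that also matches the context outranks the stale one.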
Abstract:
A method and apparatus for identifying related documents in a collection of linked documents. In the method, the link structure of documents to other documents is analyzed. By analyzing only the link structure, a process-intensive content analysis of the documents is avoided. A citation analysis technique, such as bibliographic coupling analysis, is performed on the set of documents to extract link information. For bibliographic coupling analysis, that information would include the number of other documents that a given pair of documents both link to. By using the link information, related documents are identified using a suitable analysis technique, such as clustering or spreading activation.
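Bibliographic coupling, as described above, counts the link targets that two documents have in common. A minimal sketch, assuming the link structure is available as a dictionary from each document to the list of documents it links to (the function name and sample data are illustrative):

```python
from itertools import combinations


def bibliographic_coupling(outlinks):
    """Return {(doc_a, doc_b): number of link targets the pair shares}.

    Only the link structure is inspected; document content is never read.
    Pairs with no shared targets are omitted.
    """
    coupling = {}
    for a, b in combinations(sorted(outlinks), 2):
        shared = len(set(outlinks[a]) & set(outlinks[b]))
        if shared:
            coupling[(a, b)] = shared
    return coupling


outlinks = {
    "a": ["x", "y", "z"],
    "b": ["y", "z"],
    "c": ["w"],
}
coupling = bibliographic_coupling(outlinks)
```

The resulting pair counts could then serve as similarity values for a clustering or spreading activation step.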
Abstract:
A system for extracting and analyzing information from a collection of linked documents at a locality to enable categorization of documents and prediction of documents relevant to a focus document. The system obtains and analyzes topology, usage, and path information for a collection at a locality, e.g. a web locality on the World Wide Web. For categorization, document meta information is represented as document vectors. Predefined criteria are applied to the document vectors to create lists of "similar" types of documents. For relevance prediction, networks representing topology, usage path, and text similarity amongst the documents in the collection are created. A spreading activation technique is applied to the networks starting at a focus document to predict the documents relevant to the focus document. Using category and relevance prediction information, tools can be built to enable a user to more efficiently traverse through the collection of linked documents.
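The categorization step above (apply predefined criteria to document vectors of meta information) could look like the sketch below. The features (`outlinks`, `size`), the category names, and the thresholds are all hypothetical placeholders; the abstract does not list the actual meta information or criteria.

```python
def categorize(doc_vectors, criteria):
    """Build one list of documents per category.

    doc_vectors: dict mapping a document to its meta-information features.
    criteria: dict mapping a category name to a predicate over the features.
    A document may appear in several lists or in none.
    """
    lists = {category: [] for category in criteria}
    for doc, features in doc_vectors.items():
        for category, predicate in criteria.items():
            if predicate(features):
                lists[category].append(doc)
    return lists


# Hypothetical meta information: number of outgoing links and size in bytes.
doc_vectors = {
    "index.html": {"outlinks": 40, "size": 2000},
    "report.html": {"outlinks": 2, "size": 50000},
}
# Hypothetical criteria: link-heavy small pages versus large leaf pages.
criteria = {
    "index": lambda f: f["outlinks"] >= 20 and f["size"] < 10000,
    "content": lambda f: f["outlinks"] < 5 and f["size"] >= 10000,
}
lists = categorize(doc_vectors, criteria)
```

Keeping the criteria as plain predicates means new categories can be added without changing the categorization loop.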