Abstract:
A logical directory ranking system ranks documents or web pages utilizing logical directories. The present system groups together compound documents as a single information node with one or more leaves, constructing a logical directory graph. URLs can be grouped at a level of granularity below an individual directory. For example, the URLs may be grouped together on the basis of hostname, domain, or any level of the hierarchy of the URLs. Edges in the logical directory graph are formed by links between the logical directories. Edges have weights corresponding to the number of links between logical directories. Nodes have weights corresponding to the number of web pages or leaves represented by a node. A ranking level is determined for each node as a function of the node weight and the edge weight. The ranking level is then applied to each URL that the node represents.
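The graph construction described above can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the `depth` parameter, and the rule that a final path segment without a trailing slash is a file are all assumptions, and the ranking function itself is left out because the abstract does not specify it.

```python
from collections import Counter
from urllib.parse import urlsplit


def logical_directory(url: str, depth: int = 0) -> str:
    """Map a URL to a grouping key: hostname plus the first `depth` path segments.

    depth=0 groups by hostname only (an assumed convention for this sketch).
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: a last segment without a trailing slash is a file, not a directory.
    if segments and not parts.path.endswith("/"):
        segments = segments[:-1]
    return "/".join([parts.netloc] + segments[:depth])


def build_directory_graph(pages, links, depth=0):
    """Return (node_weight, edge_weight) for the logical directory graph.

    node_weight[d]      = number of pages (leaves) grouped under directory d
    edge_weight[(a, b)] = number of page-level links from directory a to b
    """
    node_weight = Counter(logical_directory(u, depth) for u in pages)
    edge_weight = Counter()
    for src, dst in links:
        a, b = logical_directory(src, depth), logical_directory(dst, depth)
        if a != b:  # links inside one logical directory do not form an edge
            edge_weight[(a, b)] += 1
    return node_weight, edge_weight


pages = [
    "http://a.com/x/1.html",
    "http://a.com/x/2.html",
    "http://b.org/index.html",
]
links = [(pages[0], pages[2]), (pages[1], pages[2]), (pages[0], pages[1])]
nodes, edges = build_directory_graph(pages, links)
```

A rank computed per node from these two weight tables would then be copied to every URL the node represents.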
Abstract:
A dangling web page processing system ranks dangling web pages on the web. The system ranks dangling web pages of high quality that cannot be crawled by a crawler. In addition, the system adjusts ranks to penalize dangling web pages that return errors when links on the dangling web pages are crawled. By providing a rank for dangling web pages, the present system allows the concentration of crawling resources on those dangling web pages that have the highest rank in the uncrawled region. The system operates locally to the dangling web pages, providing efficient determination of ranks for the dangling web pages. The system explicitly discriminates against web pages on the basis of whether they point to penalty pages, i.e., pages that return an error when a link is followed. By incorporating more fine-grained information such as this into ranking, the system can improve the quality of individual search results and better manage resources for crawling.
Abstract:
A modular scoring system using rank aggregation merges search results into an ordered list of results using many different features of documents. The ranking functions of the present system can easily be customized to the needs of a particular corpus or collection of users such as an intranet. Rank aggregation is independent of the underlying score distributions between the different factors, and can be applied to merge any set of ranking functions. Rank aggregation holds the advantage of combining the influence of many different heuristic factors in a robust way to produce high-quality results for queries. The modular scoring system combines factors such as indegree, page ranking, URL length, proximity to the root server of an intranet, etc., to form a single ordering on web pages that closely obeys the individual orderings, but also mediates between the collective wisdom of individual heuristics.
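As an illustration of merging per-factor orderings, the sketch below uses a Borda count. The abstract does not name its aggregation method, so Borda is an assumption chosen because it depends only on positions, not on the underlying scores. The function name and the sample rankings are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several best-first rankings into one list using Borda counts.

    Each ranking awards n points to its top item, n-1 to the next, and so on,
    where n is the number of distinct items overall. Items absent from a
    ranking receive no points from it. Ties are broken alphabetically.
    """
    items = set().union(*rankings)
    n = len(items)
    score = defaultdict(int)
    for ranking in rankings:
        for position, item in enumerate(ranking):
            score[item] += n - position
    return sorted(items, key=lambda item: (-score[item], item))


# Hypothetical orderings produced by three independent heuristics.
by_indegree = ["a", "b", "c"]
by_page_rank = ["b", "a", "c"]
by_url_length = ["b", "c", "a"]
merged = borda_aggregate([by_indegree, by_page_rank, by_url_length])
```

Because only positions are used, a heuristic can be added or removed without rescaling the others.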
Abstract:
A Web server stores a table of Web page inlinks. When a Web page is accessed and a user wants to access other pages related to the accessed page, the user requests the table of inlinks, and from it generates a list of sibling links to the accessed page, the sibling links being outlinks of one or more of the inlinks in the table.
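The sibling computation can be sketched as below. The two dictionaries stand in for the server's inlink table and for the outlinks of each inlinking page; both structures and the function name are assumptions made for illustration.

```python
def sibling_links(page, inlinks, outlinks):
    """Return pages that share at least one inlinking page with `page`.

    inlinks[p]  = pages that link to p (the stored inlink table)
    outlinks[q] = pages that q links to
    Order follows first discovery; `page` itself is excluded.
    """
    siblings = []
    seen = {page}
    for parent in inlinks.get(page, []):
        for child in outlinks.get(parent, []):
            if child not in seen:
                seen.add(child)
                siblings.append(child)
    return siblings


outlinks = {"hub1": ["p", "q", "r"], "hub2": ["p", "r", "s"]}
inlinks = {"p": ["hub1", "hub2"]}
related = sibling_links("p", inlinks, outlinks)
```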
Abstract:
An optimal path selection system extracts, in real time, a connection subgraph from an undirected, edge-weighted graph such as a social network; the subgraph is chosen to best capture the connections between two nodes of the graph. The system models the undirected, edge-weighted graph as an electrical circuit and solves for a relationship between two nodes in the graph based on electrical analogues in the electric graph model. The system optionally accelerates the computations to produce approximate, high-quality connection subgraphs in real time on very large (disk-resident) graphs. The connection subgraph is constrained to an integer budget and comprises a first node, a second node, and a collection of paths from the first node to the second node that maximizes a “goodness” function g(H). The goodness function g(H) is tailored to capture salient aspects of a relationship between the first node and the second node.
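The electrical analogy can be illustrated with a small solver. The sketch assumes edge weights act as conductances, fixes the first node at 1 volt and the second at 0 volts, and relaxes every other node to the weighted average of its neighbours. The function names and the simple iterative scheme are assumptions; the budget constraint and the goodness function g(H) are not modelled here.

```python
def solve_voltages(edges, source, sink, iterations=1000):
    """Approximate node voltages with source fixed at 1.0 and sink at 0.0.

    edges: iterable of (u, v, conductance) for an undirected graph with
    positive weights. Each free node is repeatedly set to the
    conductance-weighted mean of its neighbours (Gauss-Seidel relaxation).
    """
    adj = {}
    for u, v, c in edges:
        adj.setdefault(u, {})[v] = c
        adj.setdefault(v, {})[u] = c
    volt = {node: 0.0 for node in adj}
    volt[source] = 1.0
    for _ in range(iterations):
        for node, nbrs in adj.items():
            if node in (source, sink):
                continue  # boundary voltages stay fixed
            total = sum(nbrs.values())
            volt[node] = sum(c * volt[n] for n, c in nbrs.items()) / total
    return volt


def edge_currents(edges, volt):
    """Current on each edge by Ohm's law: I(u, v) = C(u, v) * (V(u) - V(v))."""
    return {(u, v): c * (volt[u] - volt[v]) for u, v, c in edges}


edges = [("s", "a", 1.0), ("a", "t", 1.0), ("s", "b", 1.0), ("b", "t", 3.0)]
volt = solve_voltages(edges, "s", "t")
current = edge_currents(edges, volt)
```

In this toy graph the path through `b` carries more current than the path through `a`, which is the kind of signal a path-selection step could use to prefer one path over another.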
Abstract:
A digital broadcast system provides secure transmission of digital programs to in-home digital devices even when some of the devices are unauthorized. A matrix of device keys S_{j,i} is provided, wherein "i" is a key index variable indicating a position in a key dimension of the matrix and "j" is a sets index variable indicating a position in a sets dimension of the matrix. Each in-home device is assigned plural device keys from the matrix, with one and only one device key for each key index variable "i" being assigned to a device. To generate a session key for a broadcast program, session numbers x_i are encrypted with all device keys S_{j,i} to generate a session key block which is decrypted by the in-home devices and used to generate a session key for decrypting the program. If one of the devices is a compromised device, at least one of the session numbers is a dummy number that is encrypted and decrypted by the corresponding compromised device key, with the resulting session key being useless in decrypting the program.
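The key-matrix scheme can be sketched with toy values. Everything concrete here is an assumption made for illustration: XOR stands in for a real cipher and is not secure, the session key is taken to be a SHA-256 hash over the session numbers, and revocation is modelled as a set of (j, i) matrix positions. The function names are hypothetical.

```python
import hashlib


def xor_cipher(key: int, value: int) -> int:
    """Toy stand-in for a real cipher. XOR is its own inverse. NOT secure."""
    return key ^ value


def make_session_key_block(keys, session_numbers, revoked, dummy):
    """Build block[j][i] = E(S_{j,i}, x_i), using `dummy` at revoked positions.

    keys[j][i] is the device key S_{j,i}; revoked is a set of (j, i) pairs.
    """
    return [
        [
            xor_cipher(key, dummy if (j, i) in revoked else session_numbers[i])
            for i, key in enumerate(row)
        ]
        for j, row in enumerate(keys)
    ]


def derive_session_key(numbers):
    """Assumed derivation: hash the concatenated session numbers."""
    data = b"".join(n.to_bytes(8, "big") for n in numbers)
    return hashlib.sha256(data).hexdigest()


def device_session_key(block, keys, assignment):
    """assignment[i] = j means the device holds S_{j,i} for key index i.

    A real device would hold only its own keys; the full matrix is passed
    here purely to keep the sketch short.
    """
    numbers = [xor_cipher(keys[j][i], block[j][i]) for i, j in enumerate(assignment)]
    return derive_session_key(numbers)


keys = [[11, 12, 13], [21, 22, 23]]       # 2 sets (j) x 3 key indices (i)
session_numbers = [101, 202, 303]
good_device = [0, 1, 0]                   # one key per key index i
compromised_device = [1, 1, 1]
block = make_session_key_block(keys, session_numbers, revoked={(1, 0)}, dummy=999)
```

The revoked position (1, 0) belongs to the compromised device but not to the good one, so only the good device recovers all real session numbers.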
Abstract:
In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from expert rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
Abstract:
A digital broadcast system provides secure transmission of digital programs to in-home digital devices even when some of the devices are unauthorized. A matrix of device keys S_{j,i} is provided, wherein “i” is a key index variable indicating a position in a key dimension of the matrix and “j” is a sets index variable indicating a position in a sets dimension of the matrix. Each in-home device is assigned plural device keys from the matrix, with one and only one device key for each key index variable “i” being assigned to a device. To generate a session key for a broadcast program, session numbers x_i are encrypted with all device keys S_{j,i} to generate a session key block which is decrypted by the in-home devices and used to generate a session key for decrypting the program. If one of the devices is a compromised device, at least one of the session numbers is a dummy number that is encrypted and decrypted by the corresponding compromised device key, with the resulting session key being useless in decrypting the program.
Abstract:
A method for assessing sharing of items within a social network is provided. The method includes identifying a first sharing of a social item by a first user of a social network and determining one or more second sharings of the social item by one or more second users, the one or more second sharings being based on the first sharing. The method also includes determining a sharing score associated with the first user based on a number of the one or more second sharings, and updating a data structure based on the determined sharing score associated with the first user. The data structure stores respective sharing scores associated with a plurality of users of the social network. Systems and machine-readable media are also provided.
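A minimal sketch of the scoring step follows. It assumes each sharing event records the user it was reshared from (or `None` for an original share) and that the score is simply the count of direct reshares. The record layout and function name are hypothetical.

```python
from collections import Counter


def sharing_scores(shares):
    """Count, for each user, how many later sharings were based on theirs.

    shares: iterable of (user, item, reshared_from) tuples, where
    reshared_from is the user whose sharing prompted this one, or None.
    """
    scores = Counter()
    for user, _item, reshared_from in shares:
        scores[user] += 0  # make sure every sharer appears in the table
        if reshared_from is not None:
            scores[reshared_from] += 1
    return dict(scores)


shares = [
    ("alice", "photo", None),
    ("bob", "photo", "alice"),
    ("carol", "photo", "alice"),
    ("dave", "photo", "bob"),
]
table = sharing_scores(shares)
```

Here the returned dictionary plays the role of the data structure that stores a score per user.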