Abstract:
The claimed subject matter provides a system and/or a method that facilitates reducing spam in search results. An interface can obtain web graph information that represents a web of pages. A spam detection component can determine one or more features based at least in part on the web graph information. The one or more features can provide indications that a particular page of the web graph is spam. In addition, a robust rank component is provided that limits the amount of contribution a single page can provide to a target page.
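The capping idea in the last sentence can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how per-page contributions are computed or how the limit is chosen, so the `contributions` mapping, the `cap` parameter, and the function name `robust_rank` are all hypothetical.

```python
def robust_rank(contributions, cap):
    """Sum per-source contributions to one target page, clipping each at `cap`.

    `contributions` maps a source page to its (assumed, precomputed)
    contribution to the target page's rank. `cap` is a hypothetical limit.
    """
    return sum(min(value, cap) for value in contributions.values())


# Made-up example: page "a" supplies most of the target's rank.
contributions = {"a": 0.50, "b": 0.05, "c": 0.03}
plain_rank = sum(contributions.values())          # no limit applied
capped_rank = robust_rank(contributions, cap=0.10)  # "a" is clipped to 0.10
```

With these made-up numbers, the single heavy contributor `a` is clipped to the cap, so a page propped up by one dominant source scores much lower under the capped sum than under the plain sum.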
Abstract:
A system and/or methodology is provided that exploits user interaction within a social network in order to derive profits. The invention provides for increased flow of money through a social network, and simultaneously allows advertisers and merchants to focus their advertising spending within the social network. Additionally, the invention provides for quantitative measurement of the effects of relational proximity marketing/advertising (RPM), and creates incentives for users to purchase goods through the social network.
Abstract:
The claimed subject matter relates to an architecture that can identify, store, and/or output local contributions to a rank of a vertex in a directed graph. The architecture can receive a directed graph and a parameter, and examine a local subset of vertices (e.g., local to a given vertex) in order to determine a local supporting set. The local supporting set can include a local set of vertices that each contributes a minimum fraction of the parameter to a rank of the vertex. The local supporting set can be the basis for an estimate of the supporting set and/or rank of the vertex for the entire graph and can be employed as a means for detecting link or web spam as well as other influence-based social network applications.
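One way to picture a local computation of contributions is a backward "push" procedure that starts at the target vertex and only touches vertices that can reach it. The sketch below is an assumption-laden illustration, not the claimed architecture: it assumes a PageRank with teleport probability `alpha`, a graph given as a dict of out-neighbor lists in which every vertex that appears as a source has at least one out-edge, and it reads "minimum fraction" as a share `delta` of the total estimated contribution. The names `approximate_contributions`, `local_supporting_set`, `alpha`, `eps`, and `delta` are hypothetical.

```python
from collections import defaultdict, deque


def approximate_contributions(out_edges, target, alpha=0.15, eps=1e-4):
    """Estimate, for each vertex u, its personalized-PageRank mass at `target`.

    Only vertices with a path into `target` are ever visited. Each estimate
    is a lower bound that is off by at most `eps`.
    """
    in_edges = defaultdict(list)
    for u, nbrs in out_edges.items():
        for w in nbrs:
            in_edges[w].append(u)

    estimate = defaultdict(float)   # settled contribution per vertex
    residual = defaultdict(float)   # mass still waiting to be pushed
    residual[target] = 1.0
    queue = deque([target])
    while queue:
        x = queue.popleft()
        mass = residual[x]
        if mass <= eps:
            continue  # stale queue entry or too little mass to push
        residual[x] = 0.0
        estimate[x] += alpha * mass
        # Pass the remaining mass backwards along in-edges.
        for w in in_edges[x]:
            residual[w] += (1 - alpha) * mass / len(out_edges[w])
            if residual[w] > eps:
                queue.append(w)
    return dict(estimate)


def local_supporting_set(out_edges, target, delta, alpha=0.15, eps=1e-4):
    """Vertices whose estimated contribution is at least `delta` of the total."""
    estimate = approximate_contributions(out_edges, target, alpha, eps)
    total = sum(estimate.values())
    return {u for u, c in estimate.items() if c >= delta * total}


# Two-vertex cycle: a -> b and b -> a.
cycle = {"a": ["b"], "b": ["a"]}
contrib = approximate_contributions(cycle, "b", alpha=0.5, eps=1e-9)
support = local_supporting_set(cycle, "b", delta=0.5, alpha=0.5, eps=1e-9)
```

On the two-vertex cycle with `alpha=0.5`, the exact values are 2/3 for `b` and 1/3 for `a`, so only `b` clears a `delta` of 0.5. A page whose supporting set is small and dominated by a few vertices is the kind of pattern a spam detector could then inspect.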
Abstract:
An improved system and method is provided for identifying web communities from seed sets of web pages. A seed set of web pages may be represented as a set of seed vertices of a graph representing a collection of web pages. An initial probability distribution may be constructed on vertices of the graph by assigning a nonzero value to the vertices belonging to the seed set. Then a sequence of probability distributions may be produced on the vertices of the graph by modifying the probability distribution over a series of one-step walks of the probability distribution over the vertices of the graph. For each probability distribution produced in the sequence, level sets of vertices may be generated, and a level set with minimal conductance may be selected for each probability distribution. The level set with the least conductance may then be output representing a community of web pages.
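The steps in this abstract (seed distribution, repeated one-step walks, level sets, minimum conductance) can be sketched as below. Several details are assumptions because the abstract leaves them open: the graph is undirected with no isolated vertices and is given as a dict of neighbor lists, the walk is a lazy random walk that keeps half its mass in place, the initial distribution is uniform on the seeds, and level sets are taken in order of degree-normalized probability. All function names are hypothetical.

```python
def lazy_walk_step(adj, p):
    """One step of a lazy random walk: keep half the mass, spread the rest."""
    q = {v: 0.5 * mass for v, mass in p.items()}
    for v, mass in p.items():
        share = 0.5 * mass / len(adj[v])
        for w in adj[v]:
            q[w] = q.get(w, 0.0) + share
    return q


def best_level_set(adj, p):
    """Sweep over level sets of p and return (conductance, set) of the best."""
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    order = sorted(p, key=lambda v: p[v] / len(adj[v]), reverse=True)
    inside, vol, cut = set(), 0, 0
    best = (float("inf"), frozenset())
    for v in order:
        # Edges to vertices already inside stop being cut edges.
        cut += len(adj[v]) - 2 * sum(1 for w in adj[v] if w in inside)
        vol += len(adj[v])
        inside.add(v)
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break  # the level set covers the whole graph
        phi = cut / denom
        if phi < best[0]:
            best = (phi, frozenset(inside))
    return best


def find_community(adj, seeds, steps=10):
    """Return (conductance, community) found from a seed set."""
    p = {s: 1.0 / len(seeds) for s in seeds}
    best = (float("inf"), frozenset())
    for _ in range(steps):
        p = lazy_walk_step(adj, p)
        candidate = best_level_set(adj, p)
        if candidate[0] < best[0]:
            best = candidate
    return best


# Two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
phi, community = find_community(adj, seeds={0}, steps=5)
```

On this toy graph the sweep settles on the triangle containing the seed, since cutting the single bridge edge gives the lowest conductance available (1/7).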
Abstract:
Methods and apparatus for locating a dense and isolated sub-graph from a weighted graph having multiple nodes and multiple weighted edges are described. Each node in the weighted graph represents an object. Each weighted edge in the weighted graph connects two nodes and represents the relationship between the two objects represented by the two corresponding nodes. To locate the sub-graph, an auxiliary weighted graph is first constructed from the weighted graph by adding a source node s and a sink node t and by using three coefficients, α, β, and γ, each greater than 0. α influences the number of nodes inside the sub-graph; β influences the sum of the weights of the edges connecting a node inside the sub-graph to a node outside the sub-graph; and γ influences the sum of the weights of the edges connecting two nodes that are both inside the sub-graph. Next, the auxiliary weighted graph is partitioned into two parts using the s-t minimum cut algorithm. The sub-graph is the part associated with the sink node t, taken in its original form: with the original undirected edges and unmodified edge weights, and excluding the sink node t and all the new edges added during the construction of the auxiliary weighted graph.
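The partitioning step can be illustrated with a small s-t minimum cut routine. This sketch covers only that step: the abstract does not state how α, β, and γ set the edge weights of the auxiliary graph, so that construction is deliberately not reproduced here. The input format (a dict of directed capacities, with an undirected edge entered in both directions), the use of the Edmonds-Karp augmenting-path method, and the name `min_cut_sink_side` are assumptions made for illustration.

```python
from collections import defaultdict, deque


def min_cut_sink_side(capacity, s, t):
    """Return the nodes on the sink side of a minimum s-t cut, excluding t.

    `capacity` maps (u, v) to a non-negative capacity. When the minimum cut
    is not unique, this returns the complement of the set reachable from s
    in the final residual graph.
    """
    residual = defaultdict(float)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse direction is needed in the residual graph

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if t not in parent:
            break  # no augmenting path left: the flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[edge] for edge in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck

    return set(adj) - set(parent) - {t}


# Chain s -> a -> b -> t where the middle edge is the bottleneck.
capacity = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 3}
sink_side = min_cut_sink_side(capacity, "s", "t")
```

In the chain example the cheapest cut is the middle edge, so `b` ends up on the sink side. In the setting of the abstract, the returned node set would then be mapped back to the original weighted graph, dropping the added source, sink, and auxiliary edges.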
Abstract:
Providing for local graph partitioning using an evolving set process is disclosed herein. By way of example, a computer processor can be configured to execute local partitioning based on evolving set instructions. The instructions can be employed to transition a set of analyzed vertices of a graph until a segment of the graph with small conductance is identified. A transitioning algorithm can expand or contract the analyzed set of vertices based on characteristics of vertices at a boundary of the analyzed set. Accordingly, as the set of analyzed vertices becomes large, significant processing efficiency is gained by employing the characteristics of boundary vertices to transition the set or determine conductance, rather than all vertices of the analyzed set.
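A single transition of an evolving set process can be sketched as follows. This is an illustration under stated assumptions, not the disclosed instructions: the graph is undirected with no isolated vertices, the underlying walk is a lazy random walk, and the plain (unbiased) form of the process is used. The names `evolving_set_step`, `conductance`, and `evolve_until_sparse`, and the parameters `target_phi`, `max_steps`, and `seed`, are hypothetical. For simplicity the step scans the current set and its neighbors; only vertices on the boundary can actually change membership, which is the property an efficient implementation would exploit by tracking the boundary incrementally.

```python
import random


def evolving_set_step(adj, current, threshold):
    """Return the next set: vertices whose lazy-walk probability of landing
    in `current` after one step is at least `threshold` (drawn from [0, 1))."""
    candidates = set(current)
    for v in current:
        candidates.update(adj[v])
    nxt = set()
    for y in candidates:
        inside_nbrs = sum(1 for w in adj[y] if w in current)
        prob = 0.5 * (y in current) + 0.5 * inside_nbrs / len(adj[y])
        if prob >= threshold:
            nxt.add(y)
    return nxt


def conductance(adj, subset):
    """Cut size divided by the smaller of the two side volumes."""
    vol = sum(len(adj[v]) for v in subset)
    total = sum(len(nbrs) for nbrs in adj.values())
    cut = sum(1 for v in subset for w in adj[v] if w not in subset)
    denom = min(vol, total - vol)
    return cut / denom if denom else float("inf")


def evolve_until_sparse(adj, start, target_phi, max_steps=100, seed=0):
    """Run the process from {start} until a set with conductance at most
    `target_phi` appears; return None if it dies out or runs out of steps."""
    rng = random.Random(seed)
    current = {start}
    for _ in range(max_steps):
        if conductance(adj, current) <= target_phi:
            return current
        current = evolving_set_step(adj, current, rng.random())
        if not current:
            return None
    return None


# Two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
shrunk = evolving_set_step(adj, {0, 1, 2}, threshold=0.9)
grown = evolving_set_step(adj, {0, 1, 2}, threshold=0.1)
```

Starting from one triangle, a high threshold drops the boundary vertex 2 and a low threshold pulls in the outside neighbor 3, while the interior vertices 0 and 1 stay put in both cases.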
Abstract:
Arrangements are provided for efficient erasure coding of files to be distributed to and later retrieved from a peer-to-peer network, where such files are broken up into many fragments and stored at peer systems. The arrangements further provide a routine to determine the probability that the file can be reconstructed. The arrangements further provide a method of performing the erasure coding in an optimized fashion, allowing fewer occurrences of disk seeks.
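A reconstruction-probability routine could look like the sketch below under a simple model. The model is an assumption, not something the abstract specifies: the file is encoded into `n` fragments of which any `k` suffice to rebuild it, each fragment sits on a different peer, and each peer is reachable independently with probability `p`. The function name and parameters are hypothetical.

```python
from math import comb


def reconstruction_probability(n, k, p):
    """Probability that at least k of n fragments are retrievable when each
    is available independently with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Example: 3 fragments, any 2 suffice, each peer online half the time.
prob = reconstruction_probability(n=3, k=2, p=0.5)
```

In this model the probability is a binomial tail, so adding redundant fragments (raising `n` for a fixed `k`) never lowers the chance of recovery.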