-
Notifications
You must be signed in to change notification settings - Fork 331
pagerank
PageRank is an algorithm used to estimate how important a vertex is in the graph. It was named after Larry Page, the inventor of the algorithm, who is also the one of the founders of Google. First, it was invented to rank web pages which are linked to others on world wide web. The basic idea of PageRank is that it estimates the importance of the page by counting the number and quality of hyperlinks to this page. The motivation of this idea is that more important web page are likely to receive more links from other pages. The algorithm intrinsically abstract a giant directed graph from the world wide web. While the edges of the graph are converted by a defined Markov process to a transition probability matrix. The result of algorithm is the solution of stationary distribution of the Markov process. More details in latest version of algorithm includes the random surfer and smallest value setting on Wikipedia page. (https://en.wikipedia.org/wiki/PageRank ) We can find the important node in the graph by this algorithm. The algorithm works by estimating probability on each graph node and approximating to the stationary distribution of Markov process in an iteration procedure. The estimated presicion of result are strongly determinated by the round of algorithm iteration. The stopping time of algorithm can also be controlled by the convergence threshold parameter. Undirected graph edge will be treated as two-way directed edge in algorithm implementation. (Paper: The PageRank Citation Ranking: Bringing Order to the Web - http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf )
use --help
param to view detailed help information.
Input files should be formatted as follows:
<src>,<dst>
where <src>
and <dst>
are integers of type uint32_t
, representing the end nodes of an edge.
Note that Plato treats every input graph as undirected by default. For a directed graph, please ensure both <A, B> and <B, A> appear in the input file if they exist. Edges that appear more than once will be considered as multiple edges between the same pair of nodes.
Input example (Following numbers are synthetic and are for demonstration purpose only.):
4564,823192
823192,973033
1713,823192
Output files are formatted as follows:
<vertex_id>,<pagerank_value>
where <vertex_id>
represents a node and <pagerank_value>
gives the PageRank value of node.
Output example (Following numbers are synthetic and are for demonstration purpose only.):
4564,0.2552393450530435
823192,0.5215760529282608
https://github.com/Tencent/plato/blob/master/plato/algo/pagerank
- Graph Attributes
- Tree Depth/Width
- Graph Attributes All-in-One: Number of Nodes/Edges, Density, Degree Distribution
- N-step Degrees
- HyperANF
- Node Centrality Metrics
- Connectivity & Community Discovery
- Graph Representation Learning
- Clustering/Unfolding Algorithms
- Other Graph Algorithms
Algorithms to open source:
- Network Embedding
- LINE
- Word2Vec
- GraphVite
- GNN
- GCN
- GraphSage