-
Notifications
You must be signed in to change notification settings - Fork 0
shayantabrizi/PPC
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
----------------------------------------------------------------------------- PPC: A scalable graph clustering algorithm Based on the article "Personalized PageRank Clustering: A Graph Clustering Algorithm Based on Random Walks" Authors: S.A. Tabrizi, A. Shakery, M. Asadpour, M. Abbasi, M.A. Tavallaie Version: 1.3 ----------------------------------------------------------------------------- Code Author : M. Abbasi Email : ma.abbasi@ut.ac.ir University : University of Tehran, Iran ----------------------------------------------------------------------------- Please cite: Shayan A. Tabrizi, Azadeh Shakery, Masoud Asadpour, Maziar Abbasi, Mohammad Ali Tavallaie, Personalized PageRank Clustering: A graph clustering algorithm based on random walks, Physica A: Statistical Mechanics and its Applications, vol. 392, no. 22, pp. 5772-5785, http://dx.doi.org/10.1016/j.physa.2013.07.021 Disclaimer: We really appreciate bug reports and any suggestions for improving the algorithm. Please contact us via email at s.tabrizi@ut.ac.ir. Usage: The second argument is optional and changes the random number generator seed. 1) For simple undirected graphs: ./clusterer test 0 clusterer reads Datasets/test/test.txt test.txt consists of the network edges in the format of "src dst" 2) For weighted undirected graphs: ./weclusterer test 0 ./weintclusterer test 0 weintclusterer is optimized for integer edge weights. weclusterer (weintclusterer) reads Datasets/test/test.txt test.txt consists of network edges in the format of "src dst weight" In case of conflicts, e.g. "2 5 2.5", "2 5 3.5" and "5 2 3", they consider "2 5 3.5", in which the weight is the maximum of the input weights. Program automatically generates Datasets/test/test.smg for future use and it might take more time in the first run. In the future releases, this part should be more optimized. if you have a .smg file, you don't need the .txt file anymore. if you like to compile the program manually, here is what you need: -DcompLimit=40 you can set a limit for the number of clusters -DexactClustersCount using it in conjunction with -DcompLimit produces exactly compLimit clusters (Useful for when the number of clusters is known) (compLimit must be not higher than the number of vertices) -DpageRankAlpha=0.7 to change the PageRank alpha value -DTRIAL_DEFAULT=50 minimum number of random jumps starting from a vertex -DTRIAL_DEFUALT_MULT=10 number of random jumps starting from a vertex = max{TRIAL_DEFUALT_MULT * node's degree, DTRIAL_DEFAULT} -DHA creates .tree output file -Dhier creates .tree output file: with hierarchy -> Datasets/football/hier_football.tree -Dcreategraphml creates .graphml output file -DWE for weighted graphs with float weights -DWE -DintWeight for weighted graphs with integer weights ----------------------------------------------------------------------------- Change Log: Release 1.3: Bug Fixes: Fixed yet another bug in reading big graph files. Improvements: Moving a node from non-seeds to seeds only if it is connected to at least one seed node. Release 1.2: Bug Fixes: Fixed a bug in reading big graph files (more than about 60000 vertices). Release 1.1: New Features: Added support for determining the exact number of clusters. Bug Fixes: Fixed some issues. Release 1.0
About
Personalized PageRank Clustering: A Graph Clustering Algorithm Based on Random Walks
Resources
Stars
Watchers
Forks
Packages 0
No packages published