A simple cluster-finding and document-tagging algorithm written in Python.
Works well for clustering similar lines in a document.
Uses https://github.com/flylo/g-means (G-means) to pick the number of clusters automatically.
Pass specific options to tailor the results to your data. Trying different cluster sizes usually improves accuracy considerably.
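
To illustrate why the cluster size matters, here is a minimal sketch of clustering the lines of a document with a fixed number of clusters. It is not this project's API and it is not G-means: it is plain k-means over normalized bag-of-words vectors, written with the standard library only. The names `vectorize`, `kmeans`, and the parameter `k` are assumptions made for the example.

```python
import math
import random
from collections import Counter


def vectorize(lines):
    """Turn each line into a unit-length bag-of-words vector (hypothetical helper)."""
    tokenized = [line.lower().split() for line in lines]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = float(count)
        # Normalize so long and short lines are comparable; empty lines stay all-zero.
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means with a fixed k (an assumption; G-means would choose k itself)."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


lines = [
    "error disk full",
    "error disk full",
    "user login ok",
    "user logout ok",
]
vectors = vectorize(lines)
for k in (2, 3):
    # Comparing the labels for different k shows how the grouping changes.
    print(k, kmeans(vectors, k))
```

Running the loop with several values of `k` and inspecting which lines share a label is a quick way to see which cluster size fits your data before settling on an option.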