Jupyter notebook for my research in Document Similarity.
This notebook covers my research on document similarity. I use a 2-layer Earth Mover's Distance over latent topics and word2vec embeddings to find similar documents, and I compare this approach with doc2vec and Jensen-Shannon divergence.
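For reference, the Jensen-Shannon baseline can be sketched in a few lines. This is a minimal illustration and not the notebook's code: the function names and the example topic distributions are hypothetical, and it assumes each document is already represented as a probability distribution over the same set of latent topics.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    With base-2 logarithms the value lies in [0, 1]; 0 means identical.
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical topic distributions for two documents (illustration only).
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.1, 0.2, 0.7]
print(jensen_shannon_divergence(doc_a, doc_b))
```

Unlike an Earth Mover's Distance, this measure compares topic weights position by position and does not take into account how close two different topics are to each other.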
The paper has been submitted to ACM's Transactions on Data Science.
The results are semantically better than those of the other approaches, but computing the similarity matrix with this approach takes a long time.
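Part of that cost is inherent to any pairwise comparison: n documents need n(n-1)/2 distance evaluations even when the distance is symmetric. The sketch below shows that loop structure with a cheap placeholder distance. The names `pairwise_distance_matrix` and `l1_distance` are hypothetical, and the placeholder only stands in for the much more expensive Earth Mover's Distance; it is not the method used in the notebook.

```python
def pairwise_distance_matrix(items, distance):
    """Build a symmetric distance matrix, evaluating each pair once."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the result is mirrored.
        for j in range(i + 1, n):
            d = distance(items[i], items[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def l1_distance(p, q):
    """Placeholder distance (sum of absolute differences), for illustration."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical topic distributions for three documents.
docs = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]
for row in pairwise_distance_matrix(docs, l1_distance):
    print(row)
```

Because every pair is independent, the inner distance calls could also be spread across several processes if the per-pair cost dominates.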
I'll add a deeper explanation once the paper has been published.