ankushbhatia2/Document_similarity_research_notebook

Document_similarity_research_notebook

Jupyter notebook for my research in Document Similarity.

This notebook covers my research in document similarity. I use a two-layer Earth Mover's Distance (EMD) over latent topics and word2vec to find similar documents, and I compare this approach with doc2vec and Jensen-Shannon divergence.
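To give a rough idea of what a two-layer Earth Mover's Distance can look like, here is a minimal sketch. It is not the notebook's code, and the layering is an assumption: layer one measures the distance between two topics as the EMD between their word distributions, with Euclidean distance between word vectors as the ground cost; layer two measures the distance between two documents as the EMD between their topic mixtures, with the layer-one topic distances as the ground cost. The names `emd`, `euclidean_cost`, and `two_layer_emd`, the toy word vectors, and the toy topics are all hypothetical. The transport problem is solved with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def emd(p, q, cost):
    """Earth Mover's Distance between histograms p and q for a ground-cost matrix."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    cost = np.asarray(cost, dtype=float)
    p = p / p.sum()  # normalise so both histograms carry the same total mass
    q = q / q.sum()
    n, m = cost.shape
    # Flow variables f[i, j] are flattened row-major into a vector of length n * m.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0  # mass leaving source bin i equals p[i]
    for j in range(m):
        a_eq[n + j, j::m] = 1.0           # mass arriving at target bin j equals q[j]
    b_eq = np.concatenate([p, q])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return float(res.fun)


def euclidean_cost(a, b):
    """Pairwise Euclidean distances between the rows of a and the rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def two_layer_emd(doc_a, doc_b, topic_word, word_vecs):
    """Assumed layering: topic distances from word-level EMD, then document-level EMD."""
    word_cost = euclidean_cost(word_vecs, word_vecs)
    k = topic_word.shape[0]
    topic_cost = np.zeros((k, k))
    for s in range(k):
        for t in range(s + 1, k):
            d = emd(topic_word[s], topic_word[t], word_cost)  # layer 1
            topic_cost[s, t] = topic_cost[t, s] = d
    return emd(doc_a, doc_b, topic_cost)                      # layer 2


# Toy data (hypothetical): four words in a 2-D embedding space, two topics.
word_vecs = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
topic_word = np.array([[0.5, 0.5, 0.0, 0.0],   # topic 0 uses words 0 and 1
                       [0.0, 0.0, 0.5, 0.5]])  # topic 1 uses words 2 and 3
doc_a = np.array([1.0, 0.0])  # document made entirely of topic 0
doc_b = np.array([0.0, 1.0])  # document made entirely of topic 1

distance_ab = two_layer_emd(doc_a, doc_b, topic_word, word_vecs)
```

In a real pipeline the topic-word matrix would come from a topic model and the word vectors from a trained word2vec model; both are replaced by hand-written arrays here so the sketch stays self-contained.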

The paper has been submitted to ACM's Transactions on Data Science.

The results are semantically better than those of the other approaches, but computing the similarity matrix with this approach takes a long time.
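One plausible reason for the cost is that a full similarity matrix needs one distance evaluation per document pair, so the number of evaluations grows quadratically with the corpus size, and each EMD evaluation is itself an optimisation problem. The sketch below only illustrates the pair count. It uses Jensen-Shannon divergence, one of the baselines mentioned above, as a cheap stand-in distance; the function names and the toy topic mixtures are hypothetical and do not come from the notebook.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def pairwise_matrix(docs, distance):
    """Fill a symmetric distance matrix and count how often `distance` is called."""
    n = len(docs)
    matrix = [[0.0] * n for _ in range(n)]
    calls = 0
    for i, j in combinations(range(n), 2):  # each unordered pair exactly once
        matrix[i][j] = matrix[j][i] = distance(docs[i], docs[j])
        calls += 1
    return matrix, calls


# Toy topic mixtures for four documents (hypothetical values).
docs = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.9, 0.1],
]
matrix, calls = pairwise_matrix(docs, jensen_shannon)
```

For `n` documents the loop runs `n * (n - 1) / 2` times, so any slower distance function passed as `distance` multiplies its per-pair cost by that count.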

I'll add a deeper explanation once my paper has been published.