The paper is available here.
This project explores document categorization using a graphical approach. The data used in this project can be downloaded here.
The dataset consists of 2225 documents from the BBC News website, corresponding to stories in five topical areas from 2004-2005. Class labels: 5 (business, entertainment, politics, sport, tech).
There are five text files, including basic_tech.txt, basic_business.txt, basic_poltics.txt, and basic_sports.txt.
Each file contains the five most important words for its class.
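
As a rough illustration, the keyword files could be read into a dictionary keyed by class name. This is only a sketch: the helper name load_keywords is hypothetical and not part of this project, and it assumes each file holds its words separated by whitespace and/or commas and follows the basic_<class>.txt naming seen above.

```python
from pathlib import Path


def load_keywords(directory, filenames):
    """Return {class_name: [words]} read from the given keyword files.

    Hypothetical helper, not part of the project code.
    """
    keywords = {}
    for name in filenames:
        path = Path(directory) / name
        stem = path.stem
        # Assumed naming scheme: "basic_<class>.txt" maps to "<class>".
        prefix = "basic_"
        class_name = stem[len(prefix):] if stem.startswith(prefix) else stem
        # Assumed format: words separated by whitespace and/or commas.
        text = path.read_text(encoding="utf-8")
        keywords[class_name] = text.replace(",", " ").split()
    return keywords
```

Calling load_keywords(".", ["basic_tech.txt", "basic_business.txt"]) would then return one list of words per class, provided the files are in the current directory and match the assumed format.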
To run the code on the test set:
- Load the files in load_all manually in Spyder. PS: We are working on a better method 😛