Question-Tagging

Classifying questions into five categories: What, Who, When, Affirmation (Yes/No), and Unknown.

Approach

Text Exploration
Text Cleaning
Obtaining POS tags, named entities, lemmas, syntactic dependency relations, and orthographic features.
Using the obtained properties as features.
Using a Linear SVM model on the engineered features.
Predicting categories of unseen data.
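
The feature-engineering step above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the `token_features` helper, the `Token` stand-in, and the hand-written tags are all assumptions made for the example. The attribute names (`lemma_`, `pos_`, `ent_type_`, `dep_`, `shape_`) match those exposed by spaCy tokens, so a real parsed question could be passed in instead.

```python
from collections import namedtuple

# Stand-in for a spaCy Token so the sketch runs without a model download.
# Real spaCy tokens expose the same attribute names.
Token = namedtuple("Token", "lemma_ pos_ ent_type_ dep_ shape_")


def token_features(tokens, use_dep=True, use_shape=True):
    """Turn a tagged question into a flat list of string features (hypothetical helper)."""
    feats = []
    for tok in tokens:
        feats.append("lemma=" + tok.lemma_)
        feats.append("pos=" + tok.pos_)
        if tok.ent_type_:  # empty string when the token is not part of an entity
            feats.append("ent=" + tok.ent_type_)
        if use_dep:
            feats.append("dep=" + tok.dep_)
        if use_shape:
            feats.append("shape=" + tok.shape_)
    return feats


# Hand-written tags for "Who founded Google?". With spaCy installed, tokens
# would instead come from: nlp = spacy.load("en_core_web_sm"); nlp("Who founded Google?")
question = [
    Token("who", "PRON", "", "nsubj", "Xxx"),
    Token("found", "VERB", "", "ROOT", "xxxx"),
    Token("Google", "PROPN", "ORG", "dobj", "Xxxxx"),
    Token("?", "PUNCT", "", "punct", "?"),
]

# Named entities, lemmas, and POS tags only (no dependency or shape features).
features = token_features(question, use_dep=False, use_shape=False)
print(features)
```

The two boolean flags make it easy to switch the dependency and orthographic features on or off when comparing feature sets.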

Results

Used an 80:20 train/test split and obtained 96.296% accuracy on the test set.
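
A rough sketch of how such a split and accuracy figure can be produced with scikit-learn is shown below. The toy questions and labels are invented for illustration, and a plain bag-of-words `CountVectorizer` stands in for the engineered features, so the printed accuracy has no relation to the number reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data (hypothetical, not the repository's labelled set).
questions = [
    "what is your name", "what time is it", "what does this mean",
    "what is the capital of france", "what color is the sky",
    "who wrote hamlet", "who is the president", "who invented the telephone",
    "who won the match", "who are you",
    "when does the train leave", "when was she born", "when is the meeting",
    "when did the war end", "when will it rain",
    "is this your book", "can you swim", "do you like tea",
    "are they coming today", "does it work",
]
labels = ["what"] * 5 + ["who"] * 5 + ["when"] * 5 + ["affirmation"] * 5

# 80% of the questions for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    questions, labels, test_size=0.2, random_state=0, stratify=labels
)

# Bag-of-words features feeding a linear SVM.
model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.3%}")
```

Passing `stratify=labels` keeps the class proportions the same in both splits, which matters when some categories are rare.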

Tagged 3000 unseen questions from here

The Tagged.csv file contains the tagged results.

The tagging_model.pkl file contains the trained LinearSVM model.
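
A pickled model of this kind is typically loaded with Python's `pickle` module. The sketch below shows the pattern with a hypothetical `load_model` helper and a stand-in object, since the input format the saved model expects is not documented here. Unpickling can execute arbitrary code, so only load files from a trusted source; a pickled scikit-learn model also generally needs a compatible scikit-learn version installed.

```python
import pickle
import tempfile
from pathlib import Path


def load_model(path):
    """Load a pickled object (hypothetical helper); only use on trusted files."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Round trip with a stand-in object so the example runs without the real file.
stand_in = {"labels": ["what", "who", "when", "affirmation", "unknown"]}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_model.pkl"
    demo_path.write_bytes(pickle.dumps(stand_in))
    restored = load_model(demo_path)

print(restored == stand_in)

# With the repository checked out, the same helper would be called as:
# model = load_model("tagging_model.pkl")
```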

We tried two combinations of features; the results are below.

Features Used	Test Set Accuracy (%)
Named Entities, Lemmas, POS Tags, Syntactic Dependency, Orthography	95.96
Named Entities, Lemmas, POS Tags	96.296

References

Classifying What-type Questions by Head Noun Tagging (http://www.aclweb.org/anthology/C08-1061)
