striped-bass

The Striped Bass Clickbait Detector

Used Libraries

  • Python v. 3.7.2

Additional Python libraries

  • xgboost v. 0.82
  • sklearn v. 0.20.3
  • pandas v. 0.24.1
  • nltk v. 3.4
  • vaderSentiment v. 3.2.1
  • requests v. 2.21.0

Training

python train.py trainDir

trainDir is the path to the training data directory, which must contain both instances.jsonl and truth.jsonl.
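
To illustrate the expected directory layout, the two files can be read line by line and paired on a shared key. This is a minimal sketch, not the code in train.py: the key name `id` and the helper names `read_jsonl` and `load_training_data` are assumptions made for the example.

```python
import json
import os


def read_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def load_training_data(train_dir, key="id"):
    """Pair each instance with its truth record via a shared key (assumed: "id")."""
    instances = read_jsonl(os.path.join(train_dir, "instances.jsonl"))
    truths = {
        record[key]: record
        for record in read_jsonl(os.path.join(train_dir, "truth.jsonl"))
    }
    # Keep only instances that have a matching truth record.
    return [(inst, truths[inst[key]]) for inst in instances if inst[key] in truths]
```

Calling `load_training_data("trainDir")` would then return a list of (instance, truth) pairs, silently skipping instances without a label.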

Test

python runClassifier.py -i inputDir -o outputDir -c xgboost:randomforrest

To run the classifier, specify the inputDir containing instances.jsonl, the outputDir where results.jsonl will be created, and one of the two classifiers, xgboost or randomforrest.
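
One way a command line with these three options could be parsed is sketched below. Whether runClassifier.py uses argparse is not stated here, so treat this as an assumption; the destination names `input_dir`, `output_dir`, and `classifier` are hypothetical as well.

```python
import argparse


def build_parser():
    """Build a parser for the -i, -o and -c options described above."""
    parser = argparse.ArgumentParser(description="Run a trained classifier.")
    parser.add_argument("-i", dest="input_dir", required=True,
                        help="directory containing instances.jsonl")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="directory where results.jsonl is written")
    # Restrict -c to the two classifier names used in this README.
    parser.add_argument("-c", dest="classifier", required=True,
                        choices=["xgboost", "randomforrest"],
                        help="classifier to run")
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["-i", "inputDir", "-o", "outputDir", "-c", "xgboost"])
print(args.input_dir, args.output_dir, args.classifier)
```

Using `choices` makes the parser reject any classifier name other than the two listed.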

Local evaluation

python evaluate.py dir top_features

Evaluates both classifiers. dir is an optional input directory from which the features are computed; if it is omitted, the currently available features file is used. top_features specifies how many of the most important features are used.
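
Selecting the most important features can be pictured as ranking features by an importance score and keeping the first few. The sketch below assumes the scores are available as a name-to-score mapping; the function name `select_top_features` and the feature names are hypothetical and not taken from evaluate.py.

```python
def select_top_features(importances, k):
    """Return the names of the k features with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores, for illustration only.
example_importances = {
    "num_words": 0.30,
    "sentiment": 0.45,
    "has_question_mark": 0.10,
    "num_caps": 0.15,
}
print(select_top_features(example_importances, 2))  # ['sentiment', 'num_words']
```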