Name		Name	Last commit message	Last commit date
parent directory ..
model		model
saved_results		saved_results
README.md		README.md
requirements.txt		requirements.txt
svm.py		svm.py

README.md

svm

Implementation of support vector machine algorithm to classify gender, based on comments from instagram media(s).

Setup

Assuming you have virtualenv installed:

virtualenv venv && source venv/bin/activate
pip install -r requirements.txt
python3 svm.py --help

$ ./svm.py -h
usage: svm.py [-h] [-l LIMIT] [-e LIMIT_PER_GENDER] [-c] [-g GAMMA]
              [-k KERNEL]

optional arguments:
  -h, --help            show this help message and exit
  -l LIMIT, --limit LIMIT
                        Limit processed data, equal male and female comments
  -e LIMIT_PER_GENDER, --limit-per-gender LIMIT_PER_GENDER
                        Limit per gender
  -c, --cache           Cache processed raw data
  -g GAMMA, --gamma GAMMA
                        Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’
  -k KERNEL, --kernel KERNEL
                        Specifies the kernel type to be used in the algorithm

deactivate

Words Blacklist

Put blacklisted words in data/blacklist.txt (one word per line)

They will be read at the beginning of the program.

Example

$ ./svm.py --limit 1000 --kernel rbf
Running SVM Classifier
Reading blacklist words file
Reading raw gender-comment data
Loaded 23499 male and 51030 female comments
Limiting male and female comments to 500 each (1000 total)
Total of 3050 words found
Processing 1000 raw gender-comment data: [########################################]

Running tests
=============
Kernel: rbf
Gamma: 1
=============

Test-1
> Training model...
> Predicting test data...
> Accuracy: 52.00%

Test-2
> Training model...
> Predicting test data...
> Accuracy: 52.00%

Test-3
> Training model...
> Predicting test data...
> Accuracy: 52.80%

Test-4
> Training model...
> Predicting test data...
> Accuracy: 45.60%

Test-5
> Training model...
> Predicting test data...
> Accuracy: 58.40%

Test-6
> Training model...
> Predicting test data...
> Accuracy: 57.60%

Test-7
> Training model...
> Predicting test data...
> Accuracy: 50.40%

Test-8
> Training model...
> Predicting test data...
> Accuracy: 49.60%

=====================================
Avg. Accuracy: 52.30%
Elapsed time: 62.53s

Reference

https://scikit-learn.org/stable/modules/svm.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

svm

svm

README.md

svm

Setup

Words Blacklist

Example

Reference

Files

svm

Directory actions

More options

Directory actions

More options

Latest commit

History

svm

Folders and files

parent directory

README.md

svm

Setup

Words Blacklist

Example

Reference