A comparison between Statistical, Machine Learning, PCA, SVD, and RFE methods
KDD'99, an Intrusion Detection Dataset
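from sklearn.ensemble import VotingClassifier

# DTC, RFC and ETC are assumed to be already-constructed DecisionTreeClassifier,
# RandomForestClassifier and ExtraTreesClassifier instances.
eclf = VotingClassifier(estimators=[('DecisionTreeClassifier', DTC), ('RandomForestClassifier', RFC), ('ExtraTreesClassifier', ETC)], voting='hard')
eclf.fit(X_train, y_train)
acc = eclf.score(X_test, y_test)  # score() returns the mean accuracy on the test split
print("Acc: %0.10f" % acc)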
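To make the effect of voting='hard' concrete, here is a minimal, self-contained sketch. It does not use KDD'99: the data comes from make_classification as a synthetic stand-in, and the hyperparameters (n_estimators=50, random_state=0, the 70/30 split) are assumptions chosen only so the example runs quickly. It fits the same three-model ensemble and then recomputes the majority vote by hand from the fitted sub-models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for the preprocessed KDD'99 features (assumption).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Hyperparameters are illustrative, not the ones used for the reported results.
DTC = DecisionTreeClassifier(random_state=0)
RFC = RandomForestClassifier(n_estimators=50, random_state=0)
ETC = ExtraTreesClassifier(n_estimators=50, random_state=0)

eclf = VotingClassifier(
    estimators=[('DecisionTreeClassifier', DTC),
                ('RandomForestClassifier', RFC),
                ('ExtraTreesClassifier', ETC)],
    voting='hard',
)
eclf.fit(X_train, y_train)
acc = eclf.score(X_test, y_test)
print("Acc: %0.10f" % acc)

# Hard voting by hand: each fitted sub-model casts one vote per sample,
# and the most frequent class label wins.
votes = np.stack([est.predict(X_test) for est in eclf.estimators_], axis=1)
manual = np.array([np.bincount(row).argmax() for row in votes])
print("Manual vote matches ensemble:", bool((manual == eclf.predict(X_test)).all()))
```

With three voters and two classes there are no ties, so the hand-computed vote and eclf.predict should agree on every sample.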
- Removing the features with the lowest correlation and the lowest standard deviation
- .feature_importances_ from sklearn.ensemble.{x,y,z}
- Recursive Feature Elimination
- Principal Component Analysis
- Singular Value Decomposition
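The five approaches listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original pipeline: the data is synthetic (make_classification), correlation is taken against the label, and k and n_drop are hypothetical values for how many features or components to keep or drop.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed KDD'99 features (assumption).
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
k = 5       # hypothetical number of features/components to keep
n_drop = 5  # hypothetical number of features to drop per statistical criterion

# 1. Statistical filter: drop the columns with the lowest standard deviation
#    and the lowest absolute correlation with the label.
std = X.std(axis=0)
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
drop = np.union1d(np.argsort(std)[:n_drop], np.argsort(corr)[:n_drop])
X_stat = np.delete(X, drop, axis=1)

# 2. Tree-based importances: keep the k columns ranked highest by .feature_importances_.
etc = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)
top = np.argsort(etc.feature_importances_)[-k:]
X_imp = X[:, top]

# 3. Recursive Feature Elimination: refit and discard the weakest feature until k remain.
rfe = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=k)
X_rfe = rfe.fit_transform(X, y)

# 4. Principal Component Analysis: project onto the k leading principal components.
X_pca = PCA(n_components=k).fit_transform(X)

# 5. Singular Value Decomposition: truncated SVD, which does not center the data.
X_svd = TruncatedSVD(n_components=k, random_state=0).fit_transform(X)

for name, arr in [("statistical", X_stat), ("importances", X_imp),
                  ("RFE", X_rfe), ("PCA", X_pca), ("SVD", X_svd)]:
    print(name, arr.shape)
```

The first three approaches keep a subset of the original columns, while PCA and SVD build new columns as linear combinations of all of them. Each reduced matrix can then be passed to the same classifier to compare accuracies.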
- Acc: 0.9798537512
- Acc: 0.9737667829
- Acc: 0.9765543989
- if dims = 2, Acc: 0.9441938134
- Acc: 0.9619160483
- Acc: 0.9832743041
- Acc: 0.9865467229
- Acc: 0.9890784707
- Acc: 0.9935763632