A set of algorithms for preprocessing high-dimensional datasets, covering feature selection and dataset balancing. They are implemented as part of the Java-based Weka API for machine learning. This package contains the algorithms published in the following journal articles and conference papers:
Speeding up incremental wrapper feature subset selection with Naive Bayes classifier. P Bermejo, JA Gámez, JM Puerta. Knowledge-Based Systems. Volume 55, pp. 140-147. 2014. https://doi.org/10.1016/j.knosys.2013.10.016
Fast wrapper feature subset selection in high-dimensional datasets by means of filter re-ranking. P Bermejo, L de la Ossa, JA Gámez, JM Puerta. Knowledge-Based Systems 25 (1), 35-44. 2012. https://doi.org/10.1016/j.knosys.2011.01.015
Improving incremental wrapper-based feature subset selection by using re-ranking. P Bermejo, JA Gámez, JM Puerta. International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. pp. 580-589. 2010.
Improving Incremental Wrapper-Based Subset Selection via Replacement and Early Stopping. P Bermejo, JA Gámez, JM Puerta. International Journal of Pattern Recognition and Artificial Intelligence. Volume 25, pp. 605-625. 2011. https://doi.org/10.1142/S0218001411008804
Incremental wrapper-based subset selection with replacement: An advantageous alternative to sequential forward selection. P Bermejo, JA Gámez, JM Puerta. 2009 IEEE Symposium on Computational Intelligence and Data Mining, 367-374.
Improving the performance of Naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets. P Bermejo, JA Gámez, JM Puerta. Expert Systems with Applications 38 (3), 2072-2080. 2011. https://doi.org/10.1016/j.eswa.2010.07.146
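
For readers new to the topic, the general idea behind incremental wrapper subset selection, which several of the papers above build on, can be sketched as follows. This is a minimal, self-contained illustration and not the code shipped in this package: the class name IncrementalWrapperSketch, the method select, and the subsetScore callback are all hypothetical, and the sketch assumes the features have already been ranked by some filter measure and that a higher score (for example, cross-validated accuracy of a classifier) is better.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch: none of these names belong to Weka or to this package.
final class IncrementalWrapperSketch {

    /**
     * Walks a pre-ranked list of feature indices once, from best-ranked to
     * worst-ranked, and keeps a feature only if adding it to the current
     * subset strictly raises the score returned by subsetScore.
     *
     * @param rankedFeatures feature indices, assumed to be sorted by a filter ranking
     * @param subsetScore    assumed wrapper evaluation of a subset (higher is better)
     * @return the selected feature indices, in the order they were accepted
     */
    static List<Integer> select(List<Integer> rankedFeatures,
                                ToDoubleFunction<List<Integer>> subsetScore) {
        List<Integer> selected = new ArrayList<>();
        // Starting below any finite score means the first ranked feature is
        // accepted whenever its score is finite.
        double bestScore = Double.NEGATIVE_INFINITY;

        for (int feature : rankedFeatures) {
            // Tentatively add the next ranked feature to the current subset.
            List<Integer> candidate = new ArrayList<>(selected);
            candidate.add(feature);

            double score = subsetScore.applyAsDouble(candidate);
            if (score > bestScore) {
                // The feature helped: keep it and remember the new best score.
                selected = candidate;
                bestScore = score;
            }
            // Otherwise the feature is discarded and the subset stays unchanged.
        }
        return selected;
    }
}
```

A caller would supply the ranking and a scoring function, for instance IncrementalWrapperSketch.select(ranking, subset -> evaluate(subset)), where evaluate is a placeholder for whatever classifier evaluation is in use. Each feature is scored once, so the number of wrapper evaluations grows linearly with the number of features, which is the property that makes this family of methods practical for high-dimensional data. The papers listed above describe refinements such as re-ranking, replacement, and early stopping, none of which are shown here.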