Auto-UFSTool - An Automatic MATLAB Toolbox for Unsupervised Feature Selection
Abstract
- Several open-source toolboxes provide feature selection algorithms that reduce redundant features, data dimensionality, and computing costs. These toolboxes demand programming expertise, which limits their adoption, and they do not adequately address unlabeled real-world data. The Automatic MATLAB Toolbox for Unsupervised Feature Selection (Auto-UFSTool) is a library of 23 robust Unsupervised Feature Selection techniques. It aims to be a user-friendly, fully automatic toolbox built on unsupervised feature selection methods from recent research. It is freely available in the MATLAB File Exchange repository, and the script and source code of each technique are included. A clear and systematic comparison of alternative approaches is therefore possible without writing a single line of code.
Introduction
- This toolbox offers 23 Unsupervised Feature Selection methods.
- Almost half of them have been proposed in the last five years.
- This toolbox is user-friendly. After loading the data, users may launch certain procedures and applications without writing a single line of code.
Usage
Given an input matrix X (m × n), with m samples and n features per sample, one of the UFS methods in the toolbox is called as follows:
Result = Auto_UFSTool(X,Selection_Method); (1)
where Result contains the rank indexes of the features in descending order of their relative importance, or a subset of features.
As illustrated in (1), a user can run any UFS method through the interface main.m.
- Result: Rank indexes of features in descending order of their relative importance, or a feature subset.
- Selection_Method: Selected Unsupervised Feature Selection method.
- X (m × n): Input data matrix.
- m: Number of samples.
- n: Number of features per sample.
The usage is demonstrated with an example based on the COIL20 dataset. COIL20 is a library of images from Columbia containing 20 objects. Each object was rotated on a turntable and photographed at 5-degree intervals, giving 72 images per object. Each image is 32 by 32 pixels with 256 grey levels per pixel.
As a result, the input X has m = 1440 samples and n = 1024 features.
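The dimensions above follow from 20 × 72 = 1440 images and 32 × 32 = 1024 pixels. A minimal sketch of how a stack of images could be flattened into such a matrix is shown here. It uses random placeholder data, not the actual COIL20 files, and the column-major pixel order is an assumption; the layout of the distributed dataset may differ.

```matlab
% Dataset dimensions as described in the text.
numObjects     = 20;
viewsPerObject = 72;
imgSize        = [32 32];

m = numObjects * viewsPerObject;   % number of samples
n = prod(imgSize);                 % number of features per sample

% Placeholder image stack (random values, NOT the real COIL20 images).
images = rand([imgSize m]);

% Flatten each 32-by-32 image into one row of X (column-major pixel order).
X = reshape(images, n, m)';

fprintf('X is %d-by-%d\n', size(X, 1), size(X, 2));
```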
After loading the data, the Unsupervised Feature Selection via Adaptive Graph Learning and Constraint (EGCFS) algorithm can be run with the single line of code below.
Result=Auto_UFSTool(X,'EGCFS') (2)
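Once a ranking is available, a typical next step is to keep only the top-ranked columns of X. The sketch below illustrates this under stated assumptions: the data are random placeholders, the ranking is a simple variance-based stand-in (not one of the toolbox methods), and the number of kept features k is arbitrary. With the toolbox on the path, the stand-in line would be replaced by the call in (2).

```matlab
% Placeholder data: 100 samples, 10 features (not COIL20).
X = rand(100, 10);

% Stand-in ranking by decreasing variance; this is NOT a toolbox method.
% With the toolbox on the path, this would instead be, for example:
%   Result = Auto_UFSTool(X, 'EGCFS');
[~, Result] = sort(var(X, 0, 1), 'descend');

% Keep the k top-ranked features (k is an arbitrary illustrative choice).
k = 5;
k = min(k, numel(Result));
X_selected = X(:, Result(1:k));

fprintf('Kept %d of %d features\n', size(X_selected, 2), size(X, 2));
```

If a method returns a feature subset rather than a full ranking, the subset can presumably be used directly as X(:, Result).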
Note
- All options and parameters of a method are requested from the user automatically when the method is run, or their default values may be used. The names of all UFS methods are listed in the UFS_Names.mat file. For more information, please refer to the original publications and algorithm implementations.
- The toolbox is written in MATLAB, a prominent programming language for machine learning and pattern recognition research.
- To run this code, add the functions and UFSs folders to your MATLAB path, and then run main.m.
- The Auto-UFSTool was tested on 64-bit Windows 8/10/11 PCs with MATLAB R2019b/R2022a on a range of publicly available datasets based on original articles.
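The setup step in the notes above can be sketched as follows. This assumes that the current folder is the toolbox root and that the functions and UFSs folders sit directly inside it; that layout is an assumption, so adjust toolboxRoot if the download is organized differently. The guards let the snippet run without error even when the folders are missing.

```matlab
% Assumed layout: functions/ and UFSs/ directly under the current folder.
toolboxRoot = pwd;
folders = {'functions', 'UFSs'};
added = false(size(folders));

for i = 1:numel(folders)
    p = fullfile(toolboxRoot, folders{i});
    if isfolder(p)
        addpath(genpath(p));   % add the folder and its subfolders
        added(i) = true;
    else
        warning('Folder not found: %s', p);
    end
end

% List the contents of UFS_Names.mat, if it is reachable on the path.
if exist('UFS_Names.mat', 'file') == 2
    whos('-file', 'UFS_Names.mat');
end

% Launch the interface only if both folders were found.
if all(added) && exist('main', 'file') == 2
    main;
end
```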
Table 1: UFS method names, their abbreviations, and their types (f = filter, w = wrapper, h = hybrid, e = embedded).
Evaluation
K-Means clustering and 8 evaluation metrics can also be used to compare and evaluate the results of feature selection algorithms.
No. | Metric |
---|---|
1 | Redundancy |
2 | Jaccard score |
3 | Purity |
4 | NMI |
5 | Accuracy |
6 | Precision |
7 | Recall |
8 | F-measure |
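To illustrate how two of these metrics relate cluster assignments to ground-truth classes, here is a minimal sketch that computes Purity and NMI from a contingency table. The label vectors are toy values, and the NMI normalization (geometric mean of the two entropies) is one common convention chosen for illustration; the toolbox's own implementation may use a different one.

```matlab
% Toy ground-truth classes and cluster assignments (illustrative values only).
trueLabels    = [1 1 1 2 2 2 3 3 3]';
clusterLabels = [1 1 2 2 2 2 3 3 1]';

% Contingency table: rows = clusters, columns = classes.
[~, ~, c] = unique(clusterLabels);
[~, ~, t] = unique(trueLabels);
C = accumarray([c t], 1);
n = sum(C(:));

% Purity: each cluster is credited with its majority class.
purity = sum(max(C, [], 2)) / n;

% NMI, normalized here by the geometric mean of the two entropies
% (an assumed convention, not necessarily the toolbox's).
P  = C / n;                 % joint distribution of (cluster, class)
pc = sum(P, 2);             % cluster marginals
pt = sum(P, 1);             % class marginals
E  = pc * pt;               % product of the marginals
nz = P > 0;                 % skip empty cells to avoid log(0)
MI = sum(P(nz) .* log(P(nz) ./ E(nz)));
Hc = -sum(pc(pc > 0) .* log(pc(pc > 0)));
Ht = -sum(pt(pt > 0) .* log(pt(pt > 0)));
nmi = MI / sqrt(Hc * Ht);

fprintf('Purity = %.4f, NMI = %.4f\n', purity, nmi);
```

In this toy case, 7 of the 9 samples fall in the majority class of their cluster, so Purity is 7/9.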
GUI
Documentation
If you have any further questions, please read the source articles or feel free to contact the developers.