
v1.2

farhadabedinzadeh released this 10 Mar 09:22

Auto-UFSTool - An Automatic MATLAB Toolbox for Unsupervised Feature Selection

Abstract

  • Several open-source toolboxes provide feature selection algorithms that reduce redundant features, data dimensionality, and computational cost. These tools demand programming expertise, which limits their adoption, and they have not adequately addressed unlabeled real-world data. The Automatic MATLAB Toolbox for Unsupervised Feature Selection (Auto-UFSTool) is a library of 23 robust Unsupervised Feature Selection (UFS) techniques. We aim to provide a user-friendly and fully automatic toolbox built from unsupervised feature selection methodologies in recent research. It is freely available in the MATLAB File Exchange repository, and the script and source code of each technique are included. As a result, a clear and systematic comparison of alternative approaches is possible without writing a single line of code.

Introduction

  • This toolbox offers more than 20 Unsupervised Feature Selection methods.
  • Almost half of them have been proposed in the last five years.
  • This toolbox is user-friendly: after loading the data, users can run any of the procedures without writing a single line of code.

Usage

Given an input matrix X(m×n) (m samples and n features per sample), one of the UFS methods in the toolbox can be applied as follows:

Result = Auto_UFSTool(X,Selection_Method);    (1)

where Result represents the output rank indexes of the features in descending order of their relative importance, or a subset of features.
As illustrated in (1), a user can apply any UFS method through the interface main.m.

  • Result : Rank indexes of the features in descending order of their relative importance, or a feature subset
  • Selection_Method : The selected Unsupervised Feature Selection method
  • X(m×n) : Input data matrix
    • m : Number of samples
    • n : Number of features per sample

This is demonstrated with an example based on the COIL20 dataset. COIL20 is a Columbia image library containing 20 objects. Each object was rotated on a turntable and imaged every 5 degrees, giving 72 images per object. Each image is 32 by 32 pixels with 256 grey levels per pixel.
As a result, for the input X, m = 1440 and n = 1024.
After loading the data, the single line of code needed to apply the Unsupervised Feature Selection via Adaptive Graph Learning and Constraint (EGCFS) algorithm is shown below.

Result = Auto_UFSTool(X,'EGCFS');             (2)
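
For completeness, a minimal end-to-end sketch of this example is given below. The dataset file name and the variable names X and Y are assumptions based on the commonly distributed COIL20.mat and may differ from the files shipped with the toolbox; keeping the top 100 features is purely an illustrative choice.

% Minimal sketch (assumed file and variable names; adjust to your copy of the data)
load('COIL20.mat');                  % assumed to provide X (1440x1024) and labels Y (1440x1)
Result = Auto_UFSTool(X, 'EGCFS');   % rank the features with the EGCFS method
X_top  = X(:, Result(1:100));        % keep the 100 highest-ranked features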

Note

  • All options and parameters of the methods are either requested from the user automatically or set to their default values when a method is run. The names of all UFS methods are listed in the UFS_Names.mat file (see the sketch after these notes). For further details, please consult the original publications and algorithm implementations.
  • The toolbox is written in MATLAB, a prominent programming language for machine learning and pattern recognition research. Auto-UFSTool was tested on 64-bit Windows 8/10/11 PCs with MATLAB R2019b/R2022a on a range of publicly available datasets used in the original articles.
  • To run the code, add the functions and UFSs folders to your MATLAB path and then run main.m.
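
As a small sketch of the first note above, the available method names can be inspected before choosing one. The variable name stored inside UFS_Names.mat is not documented here, so it is loaded generically and should be checked against the actual file.

% Minimal sketch: inspect the method names shipped with the toolbox.
S = load('UFS_Names.mat');   % load the file into a struct without assuming its variable name
disp(fieldnames(S));         % show which variable(s) the file actually contains
% If the file stores a cell array of names, e.g. S.UFS_Names (hypothetical name),
% it can be listed with: disp(S.UFS_Names)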

Table 1: UFS methods, the abbreviations of their names, and the titles of the original articles

No. Abbreviation Article Name
1 'CFS' Gene Selection for Cancer Classification using Support Vector Machines
2 'LS' Laplacian Score for Feature Selection
3 'SPEC' Spectral Feature Selection for Supervised and Unsupervised Learning
4 'MCFS' Unsupervised feature selection for Multi-Cluster data
5 'UDFS' ℓ2,1-Norm regularized discriminative feature selection for unsupervised learning
6 'LLCFS' Feature Selection and Kernel Learning for Local Learning-Based Clustering
7 'NDFS' Unsupervised Feature Selection Using Nonnegative Spectral Analysis
8 'RUFS' Robust Unsupervised Feature Selection
9 'FSASL' Unsupervised feature selection with adaptive structure learning
10 'SCOFS' Unsupervised Simultaneous Orthogonal Basis Clustering Feature Selection
11 'SOGFS' Unsupervised Feature Selection with Structured Graph Optimization
12 'UFSOL' Unsupervised feature selection with ordinal locality
13 'Inf-FS' Infinite Feature Selection
14 'DGUFS' Dependence guided unsupervised feature selection
15 'SRCFS' Unsupervised feature selection with multi-subspace randomization and collaboration
16 'CNAFS' Convex Non-Negative Matrix Factorization With Adaptive Graph for Unsupervised Feature Selection
17 'EGCFS' Unsupervised Feature Selection via Adaptive Graph Learning and Constraint
18 'RNE' Robust neighborhood embedding for unsupervised feature selection
19 'Inf-FS2020' Infinite Feature Selection: A Graph-based Feature Filtering Approach
20 'UAR-HKCMI' Fuzzy complementary entropy using hybrid-kernel function and its unsupervised attribute reduction
21 'FMIUFS' A Novel Unsupervised Approach to Heterogeneous Feature Selection Based on Fuzzy Mutual Information
22 'FRUAR' Unsupervised attribute reduction for mixed data based on fuzzy rough set
23 'U2FS' Utility metric for unsupervised feature selection

Evaluation

K-means clustering and 8 evaluation metrics can also be used to compare and evaluate the results of the feature selection algorithms; a minimal sketch follows the table below.

No. Metric
1 Redundancy
2 Jaccard score
3 Purity
4 NMI
5 Accuracy
6 Precision
7 Recall
8 F-measure
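
The snippet below is a minimal, illustrative sketch (not toolbox code) of how a selected feature subset could be clustered with K-means and scored with two of the listed metrics, Purity and NMI. It assumes ground-truth labels Y are available for the COIL20 example above and that the Statistics and Machine Learning Toolbox (for kmeans and confusionmat) is installed.

% Minimal sketch: cluster the reduced data and compute Purity and NMI.
k      = 20;                                  % number of classes in COIL20
Xsub   = X(:, Result(1:100));                 % top-100 ranked features (illustrative choice)
idx    = kmeans(Xsub, k, 'Replicates', 10);   % K-means cluster assignments

C      = confusionmat(Y, idx);                % true classes (rows) vs. clusters (columns)
purity = sum(max(C, [], 1)) / numel(Y);       % majority-class fraction over all clusters

Pxy    = C / numel(Y);                        % joint distribution of (class, cluster)
Px     = sum(Pxy, 2);  Py = sum(Pxy, 1);      % marginal distributions
PxPy   = Px * Py;      nz = Pxy > 0;
MI     = sum(Pxy(nz) .* log(Pxy(nz) ./ PxPy(nz)));
Hx     = -sum(Px(Px > 0) .* log(Px(Px > 0)));
Hy     = -sum(Py(Py > 0) .* log(Py(Py > 0)));
NMI    = MI / sqrt(Hx * Hy);                  % normalized mutual information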

GUI

(Screenshot of the Auto-UFSTool graphical user interface)

Documentation

If you have any further questions, please read the source articles or feel free to contact the developers.

Mail: y.modaresnia@yahoo.com