Auto-UFSTool - An Automatic MATLAB Toolbox for Unsupervised Feature Selection

farhadabedinzadeh/AutoUFSTool


If you found this toolbox useful, kindly cite the following article:

Auto-UFSTool: An Automatic Unsupervised Feature Selection Toolbox for MATLAB

Cite it as:

Abedinzadeh Torghabeh, F., Modaresnia, Y., Hosseini, S. A. (2023). 'Auto-UFSTool: An Automatic Unsupervised Feature Selection Toolbox for MATLAB', Journal of AI and Data Mining, (), pp. -. doi: 10.22044/jadm.2023.12820.2434

Abstract

  • Several open-source toolboxes provide feature selection algorithms that reduce redundant features, data dimensionality, and computing costs. However, these approaches demand programming expertise, which limits their popularity, and they do not adequately address unlabeled real-world data. The Automatic MATLAB Toolbox for Unsupervised Feature Selection (Auto-UFSTool) is a library of 23 robust unsupervised feature selection techniques. Our aim is a user-friendly, fully automatic toolbox that draws on unsupervised feature selection methodologies from recent research. It is freely available in the MATLAB File Exchange repository, and the script and source code of each technique are included. A clear and systematic comparison of alternative approaches is therefore possible without writing a single line of code.

Introduction

  • This toolbox offers 23 unsupervised feature selection methods.
  • Almost half of them were proposed in the last five years.
  • This toolbox is user-friendly: after loading the data, users can launch the provided procedures and applications without writing a single line of code.

Usage

Given an input matrix X (m×n), with m samples and n features per sample, one of the UFS methods in the toolbox is applied as follows:

Result = Auto_UFSTool(X,Selection_Method);    (1)

where Result contains either the indexes of the features ranked in descending order of their relative importance or a feature subset. As illustrated in (1), any UFS method can be run through the interface main.m.

  • Result : Rank indexes of the features in descending order of their relative importance, or a feature subset
  • Selection_Method : Abbreviation of the selected unsupervised feature selection method
  • X(m×n) : Input data matrix
    • m : Number of samples
    • n : Number of features per sample
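
Once a ranking is available, reducing the data to its top-ranked features is a column-indexing step. The sketch below assumes that Result is a vector of column indexes into X, as described above. Because the toolbox is not called here, a simple variance ranking stands in for the output of Auto_UFSTool; the names k and X_selected are illustrative and not part of the toolbox.

```matlab
% Stand-in data: 6 samples (rows) by 4 features (columns).
X = [1 10 0 5;
     2 20 0 5;
     3 30 1 5;
     4 40 0 5;
     5 50 1 5;
     6 60 0 5];

% Stand-in for: Result = Auto_UFSTool(X, Selection_Method);
% Features are ranked by variance, highest first (for illustration only).
[~, Result] = sort(var(X, 0, 1), 'descend');

k = 2;                            % number of features to keep (hypothetical choice)
X_selected = X(:, Result(1:k));   % keep the k top-ranked columns

disp(Result(1:k));
disp(size(X_selected));
```

If a method returns a feature subset rather than a full ranking, the same indexing applies without the truncation to k.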

The usage is demonstrated with an example based on the COIL20 dataset. COIL20 is a library of images from Columbia University containing 20 objects. Each object was rotated on a turntable and photographed at 5-degree intervals, giving 72 images per object and 20 × 72 = 1440 images in total. Each image is 32 by 32 pixels (1024 pixels) with 256 grey levels per pixel. As a result, the input X has m = 1440 and n = 1024. After loading the data, the single line of code below runs the Unsupervised Feature Selection via Adaptive Graph Learning and Constraint (EGCFS) algorithm.

Result = Auto_UFSTool(X,'EGCFS')                (2)    
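
The dimensions quoted for COIL20 follow from flattening each image into one row of X. The sketch below reproduces that arithmetic on a synthetic image stack; the random pixel values are a stand-in, not the real dataset, and the variable names are illustrative.

```matlab
% Dimensions taken from the COIL20 description.
num_objects      = 20;
views_per_object = 72;     % one image every 5 degrees: 360 / 5 = 72
img_height       = 32;
img_width        = 32;

m = num_objects * views_per_object;   % number of samples
n = img_height * img_width;           % number of features (pixels)

% Synthetic stand-in for the image stack (grey levels 0..255), NOT the real COIL20 data.
images = floor(256 * rand(img_height, img_width, m));

% Flatten each image (column-major) into one row of X.
X = reshape(images, n, m)';

fprintf('X is %d x %d\n', size(X, 1), size(X, 2));

% With the toolbox on the MATLAB path, the call from (2) would then be:
% Result = Auto_UFSTool(X, 'EGCFS');
```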

Note

  • All options and parameters of a method are either requested from the user or set to their default values when the method is run. The names of all UFS methods are listed in the UFS_Names.mat file. For further information, see the original publications and algorithm implementations.
  • The toolbox is written in MATLAB, a prominent programming language for machine learning and pattern recognition research.
  • To run this code, add the functions and UFSs folders to your MATLAB path and then run main.m.
  • The Auto-UFSTool was tested on 64-bit Windows 8/10/11 PCs with MATLAB R2019b/R2022a on a range of publicly available datasets based on original articles.
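
The path setup described in the notes can be scripted. The following is a minimal sketch that assumes the folders are named functions and UFSs and sit directly under the repository root; toolbox_root is a hypothetical location that should be adjusted to wherever the repository was unpacked. The variable names stored in UFS_Names.mat are not documented here, so the sketch only lists the file's contents.

```matlab
% toolbox_root is hypothetical: point it at the unpacked repository.
toolbox_root = pwd;
folders = {'functions', 'UFSs'};   % folder names taken from the notes

found = false(1, numel(folders));
for i = 1:numel(folders)
    folder_path = fullfile(toolbox_root, folders{i});
    if exist(folder_path, 'dir') == 7
        addpath(genpath(folder_path));   % add the folder and its subfolders
        found(i) = true;
    else
        fprintf('Folder not found: %s\n', folder_path);
    end
end

% List the contents of the method-name file, if it is reachable.
if exist('UFS_Names.mat', 'file') == 2
    whos('-file', 'UFS_Names.mat');
end

% Launch the interface only if both folders were found and main.m is reachable.
if all(found) && exist('main', 'file') == 2
    main;
end
```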

Table 1: UFS methods available in the toolbox, with the abbreviation of each method and the title of its source article. Method types are filter (f), wrapper (w), hybrid (h), and embedded (e).

| No. | Abbreviation | Article Name |
|-----|--------------|--------------|
| 1 | 'CFS' | Gene Selection for Cancer Classification using Support Vector Machines |
| 2 | 'LS' | Laplacian Score for Feature Selection |
| 3 | 'SPEC' | Spectral Feature Selection for Supervised and Unsupervised Learning |
| 4 | 'MCFS' | Unsupervised feature selection for Multi-Cluster data |
| 5 | 'UDFS' | ℓ2,1-Norm regularized discriminative feature selection for unsupervised learning |
| 6 | 'LLCFS' | Feature Selection and Kernel Learning for Local Learning-Based Clustering |
| 7 | 'NDFS' | Unsupervised Feature Selection Using Nonnegative Spectral Analysis |
| 8 | 'RUFS' | Robust Unsupervised Feature Selection |
| 9 | 'FSASL' | Unsupervised feature selection with adaptive structure learning |
| 10 | 'SCOFS' | Unsupervised Simultaneous Orthogonal Basis Clustering Feature Selection |
| 11 | 'SOGFS' | Unsupervised Feature Selection with Structured Graph Optimization |
| 12 | 'UFSOL' | Unsupervised feature selection with ordinal locality |
| 13 | 'Inf-FS' | Infinite Feature Selection |
| 14 | 'DGUFS' | Dependence guided unsupervised feature selection |
| 15 | 'SRCFS' | Unsupervised feature selection with multi-subspace randomization and collaboration |
| 16 | 'CNAFS' | Convex Non-Negative Matrix Factorization With Adaptive Graph for Unsupervised Feature Selection |
| 17 | 'EGCFS' | Unsupervised Feature Selection via Adaptive Graph Learning and Constraint |
| 18 | 'RNE' | Robust neighborhood embedding for unsupervised feature selection |
| 19 | 'Inf-FS2020' | Infinite Feature Selection: A Graph-based Feature Filtering Approach |
| 20 | 'UAR-HKCMI' | Fuzzy complementary entropy using hybrid-kernel function and its unsupervised attribute reduction |
| 21 | 'FMIUFS' | A Novel Unsupervised Approach to Heterogeneous Feature Selection Based on Fuzzy Mutual Information |
| 22 | 'FRUAR' | Unsupervised attribute reduction for mixed data based on fuzzy |
| 23 | 'U2FS' | Utility metric for unsupervised feature selection |

Evaluation

K-Means clustering and 8 evaluation metrics can also be used to compare and evaluate the results of feature selection algorithms.

| No. | Metric |
|-----|--------|
| 1 | Redundancy |
| 2 | Jaccard score |
| 3 | Purity |
| 4 | NMI |
| 5 | Accuracy |
| 6 | Precision |
| 7 | Recall |
| 8 | F-measure |
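
To make two of these metrics concrete, the sketch below computes purity and NMI from a toy pair of label vectors. It uses the textbook definitions, with NMI normalised by the square root of the product of the two entropies, which is one common convention; the toolbox's own implementation may differ. The label vectors and variable names are illustrative.

```matlab
% Toy example: ground-truth classes and cluster assignments for 8 samples.
true_labels    = [1 1 1 1 2 2 2 2];
cluster_labels = [1 1 1 2 2 2 2 2];
N = numel(true_labels);

% Contingency table: C(i, j) = number of samples in cluster i with true class j.
[~, ~, ci] = unique(cluster_labels);
[~, ~, ti] = unique(true_labels);
C = accumarray([ci(:), ti(:)], 1);

% Purity: each cluster is credited with its majority class.
purity = sum(max(C, [], 2)) / N;

% NMI: I(C;T) / sqrt(H(C) * H(T)).
P  = C / N;        % joint distribution of (cluster, class)
pc = sum(P, 2);    % cluster marginal (column vector)
pt = sum(P, 1);    % class marginal (row vector)
E  = pc * pt;      % joint distribution expected under independence
nz = P > 0;        % skip empty cells so that 0*log(0) counts as 0
mutual_info = sum(P(nz) .* log(P(nz) ./ E(nz)));
Hc = -sum(pc(pc > 0) .* log(pc(pc > 0)));
Ht = -sum(pt(pt > 0) .* log(pt(pt > 0)));
nmi = mutual_info / sqrt(Hc * Ht);

fprintf('Purity = %.4f, NMI = %.4f\n', purity, nmi);
```

In a feature selection comparison, cluster_labels would come from running K-Means on the selected columns of X, and the scores would be compared across methods.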

GUI

[Image: autoufss]

Documentation

For further questions, please read the source articles or feel free to contact the developers.

Email: y.modaresnia@yahoo.com