4.1. Motif Search
4.2. Classification
4.3. Clustering
4.5. Applying FWHM scaling and adding other features for RFC
4.6. RFClassifier
Patients with Parkinson’s disease (PD) exhibit abnormal gait patterns, so monitoring and analyzing skeletal movement can aid PD diagnosis. Several machine-learning models have been developed to distinguish abnormal gait from normal gait automatically, which can serve as a tool for diagnosis and for monitoring the effect of PD treatment. This work looks for more complex structures in time-series gait data in order to build more effective predictive models. The input to our algorithm is the Gait in Parkinson’s Disease dataset maintained by PhysioNet. The dataset contains time series of vertical ground reaction force (VGRF) gait measurements from 93 patients with Parkinson’s disease and 73 healthy controls, collected while walking at a normal pace.
Parkinson's disease (PD) is a serious neurodegenerative disorder that affects more than 10 million people worldwide. Symptoms of Parkinson's disease, from the source [What is Parkinson's Disease, Parkinson's Nebraska]:
For the later tasks, the following data set was used. The database contains measures of gait from 93 patients with idiopathic PD, and 73 healthy controls. The dataset contains:
- Vertical ground reaction force records of subjects as they walked for approximately 2 minutes on level ground.
  - Each file contains the measures from 8 sensors under each foot.
  - Each individual walks for 2 minutes, and records are taken at 100 samples per second. Thus, there are 12,000 records for each 2-minute walk.
- A demographics file containing demographic information, measures of disease severity, and other related measures.
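The record count follows directly from the sampling rate and the walk duration. The sketch below checks that arithmetic and splits one recording into per-foot sensor blocks. The 19-column layout (time, 8 left sensors, 8 right sensors, two totals) is an assumption about the file format that should be verified against the PhysioNet documentation, and `split_record` is a hypothetical helper, not part of this repository.

```python
import numpy as np

SAMPLING_RATE_HZ = 100   # from the dataset description
WALK_DURATION_S = 120    # approximately 2 minutes
SENSORS_PER_FOOT = 8

# Expected number of rows in one recording: 100 samples/s * 120 s.
expected_rows = SAMPLING_RATE_HZ * WALK_DURATION_S


def split_record(record):
    """Split one recording into named parts.

    Assumed column layout (verify against the dataset documentation):
    0 = time, 1-8 = left sensors, 9-16 = right sensors,
    17 = total left force, 18 = total right force.
    """
    return {
        "time": record[:, 0],
        "left": record[:, 1:1 + SENSORS_PER_FOOT],
        "right": record[:, 9:9 + SENSORS_PER_FOOT],
        "total_left": record[:, 17],
        "total_right": record[:, 18],
    }


# Synthetic stand-in for an array loaded from a file, e.g. with np.loadtxt(path).
record = np.zeros((expected_rows, 19))
record[:, 0] = np.arange(expected_rows) / SAMPLING_RATE_HZ
parts = split_record(record)
print(expected_rows, parts["left"].shape, parts["right"].shape)
```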
The following image shows the sensors underneath each foot[1]:
The data class reads the data, segments it, scales it, and interpolates it.
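A minimal sketch of the segment, scale, and interpolate steps, assuming a stance is a run of samples where the force stays above a threshold. The function names, the threshold of 20, the min-max scaling, and the target length of 100 points are illustrative choices, not values taken from the repository's data class.

```python
import numpy as np


def segment_stances(force, threshold=20.0, min_len=10):
    """Return (start, end) index pairs where force stays above threshold."""
    on_ground = force > threshold
    # Pad with False so every stance has a rising and a falling edge.
    padded = np.concatenate(([False], on_ground, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # first sample above threshold
    ends = np.flatnonzero(edges == -1)    # first sample back below (exclusive end)
    return [(s, e) for s, e in zip(starts, ends) if e - s >= min_len]


def min_max_scale(segment):
    """Scale a segment to the range [0, 1]."""
    span = segment.max() - segment.min()
    if span == 0:
        return np.zeros_like(segment)
    return (segment - segment.min()) / span


def resample(segment, n_points=100):
    """Linearly interpolate a segment onto n_points evenly spaced samples."""
    old_x = np.linspace(0.0, 1.0, num=len(segment))
    new_x = np.linspace(0.0, 1.0, num=n_points)
    return np.interp(new_x, old_x, segment)


# Synthetic total-force signal: 10 strides of 120 samples each (made-up data).
t = np.arange(1200)
force = 800.0 * np.clip(np.sin(2 * np.pi * t / 120), 0.0, None)

bounds = segment_stances(force)
stances = np.array([resample(min_max_scale(force[s:e])) for s, e in bounds])
print(stances.shape)
```

Resampling every stance to the same length is what allows the stances to be stacked into one matrix and passed to a model that expects fixed-size inputs.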
The dataset is a multidimensional time series with periodic structure. The repeated pattern in the signals differs slightly from one repetition to the next, and these differences are the main subject of the study, which uses them to perform classification with machine learning models.
A random forest classifier is applied to statistics computed from the raw signal data after filtering.
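The feature-extraction half of this base model might look like the sketch below. Which filter and which statistics the repository actually uses is not stated here, so the moving-average filter and the five statistics (mean, standard deviation, minimum, maximum, median) are assumptions chosen for illustration.

```python
import numpy as np


def moving_average(signal, window=5):
    """Smooth a 1-D signal with a simple moving-average filter (assumed filter)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")


def summary_statistics(record):
    """Turn a (n_samples, n_sensors) recording into one flat feature vector."""
    filtered = np.column_stack(
        [moving_average(record[:, j]) for j in range(record.shape[1])]
    )
    stats = [
        filtered.mean(axis=0),
        filtered.std(axis=0),
        filtered.min(axis=0),
        filtered.max(axis=0),
        np.median(filtered, axis=0),
    ]
    # 5 statistics per sensor, concatenated sensor-block by sensor-block.
    return np.concatenate(stats)


# Synthetic recording: 12,000 samples from 16 sensors (made-up data).
rng = np.random.default_rng(0)
record = rng.normal(loc=100.0, scale=10.0, size=(12000, 16))
features = summary_statistics(record)
print(features.shape)
```

One such vector per recording forms a row of the feature matrix that the random forest is trained on.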
In the first file, motif identification was applied to different features of the time series dataset. The resulting motif turned out to be the basic shape of the signal (pressure, no pressure), which is not useful.
In the second file, the data were filtered first, but the same problem remained.
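The behaviour described above can be reproduced on synthetic data. The sketch below is a brute-force motif search with z-normalized Euclidean distance, written only for illustration: it is not the method used in the notebooks, and the signal and window length are made up. On a periodic load/unload signal, the best-matching pair of windows is simply two copies of the stride shape one or more periods apart.

```python
import numpy as np


def znorm(x):
    """Z-normalize a window; constant windows map to zeros."""
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def top_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping windows."""
    n = len(series) - m + 1
    windows = np.array([znorm(series[i:i + m]) for i in range(n)])
    best = (np.inf, -1, -1)
    for i in range(n):
        j0 = i + m  # exclusion zone: only compare with non-overlapping windows
        if j0 >= n:
            break
        d = np.linalg.norm(windows[j0:] - windows[i], axis=1)
        k = int(np.argmin(d))
        if d[k] < best[0]:
            best = (float(d[k]), i, j0 + k)
    return best


# Synthetic gait-like signal with a stride period of 120 samples (made-up data).
t = np.arange(600)
force = 800.0 * np.clip(np.sin(2 * np.pi * t / 120), 0.0, None)

dist, i, j = top_motif(force, m=100)
print(dist, i, j, (j - i) % 120)
```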
Task | Features | Accuracy |
---|---|---|
Severity Detection | Univariate classification (L2 sensor) | 0.39 |
Parkinson’s Classification | Univariate classification (L2 sensor) | 0.71 |
Severity Detection | Multivariate classification | 0.38 |
Parkinson’s Classification | Multivariate classification | 0.82 |
Clustering the gait time series dataset did not produce promising results. The first file clusters the data from both the left and right feet. The second file applies the clustering to data from the left foot only. The third file applies the clustering to data from the right foot only.
Data class + applying FWHM to the accumulated forces from the right foot + applying models to the sequences.
Applying FWHM scaling and adding other features (stride time, max heel strike, max toe strike) for RFCs
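The sketch below shows one possible reading of these steps, with every definition an assumption made for illustration: FWHM (full width at half maximum) is taken as the span between the first and last samples at or above half of the stance peak, the heel-strike and toe maxima are taken as the peak force in the first and second half of the stance, and stride time is the interval between consecutive stance starts. The repository's own definitions may differ.

```python
import numpy as np


def fwhm_bounds(stance):
    """Indices of the first and last sample at or above half of the maximum."""
    half = stance.max() / 2.0
    above = np.flatnonzero(stance >= half)
    return int(above[0]), int(above[-1])


def stance_features(stance, fs=100):
    """Extract illustrative features from one stance sampled at fs Hz."""
    lo, hi = fwhm_bounds(stance)
    mid = len(stance) // 2
    return {
        "fwhm_s": (hi - lo) / fs,                     # width at half maximum, seconds
        "max_heel_strike": float(stance[:mid].max()),  # peak in first half (assumed)
        "max_toe": float(stance[mid:].max()),          # peak in second half (assumed)
    }


def stride_times(stance_starts, fs=100):
    """Time between consecutive stance starts, in seconds."""
    return np.diff(np.asarray(stance_starts)) / fs


# Made-up double-peaked stance and made-up stance start indices.
stance = np.array([0, 300, 600, 800, 600, 500, 600, 700, 500, 200, 0], dtype=float)
feats = stance_features(stance)
strides = stride_times([0, 120, 245])

# One way to "scale" with FWHM: keep only the part between the half-maximum bounds.
lo, hi = fwhm_bounds(stance)
cropped = stance[lo:hi + 1]
print(feats, strides, cropped)
```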
- RFClassifier: class for training, predicting, and scoring results with a random forest classifier.
- RFC models: in this code file, the data class and the RFC class are used to apply the different models described above to the data (summing up):
  - RFC base model using statistics from raw data.
  - RFC models on interpolated scaled stances with the FWHM algorithm and additional features.
  - RFC on scaled stances with 3 extra features and statistics.
  - RFC on scaled stances with 6 extra features and statistics.
  - RFC model on scaled stances from the right and left foot with 6 extra features each.
  - RFC model on scaled stances from the right and left foot with 6 extra features each and the base models.
  - RFC model on scaled stances from 16 sensors from the right and left foot with 6 extra features each.
  - RFC model on scaled stances from 16 sensors from the right and left foot, with the sum of all sensors and 6 extra features each.
  - RFC model on scaled stances from 16 sensors from the right and left foot, with the sum of all sensors, 6 extra features for each foot, and the statistics from the base models.
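A minimal sketch of such a wrapper, assuming scikit-learn is the underlying library. The class name `RFClassifierSketch`, its method names, and the synthetic data are hypothetical and do not reproduce the repository's class. The example also shows how interpolated stances and extra features can be concatenated into one input matrix.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


class RFClassifierSketch:
    """Thin wrapper for training, predicting, and scoring a random forest."""

    def __init__(self, n_estimators=200, random_state=0):
        self.model = RandomForestClassifier(
            n_estimators=n_estimators, random_state=random_state
        )

    def train(self, X, y):
        self.model.fit(X, y)
        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        pred = self.predict(X)
        return {
            "accuracy": accuracy_score(y, pred),
            "precision": precision_score(y, pred, zero_division=0),
            "recall": recall_score(y, pred, zero_division=0),
            "f1": f1_score(y, pred, zero_division=0),
        }


# Synthetic inputs: 200 stances of 100 points plus 6 extra features (made-up data).
rng = np.random.default_rng(0)
stances = rng.random((200, 100))
extra = rng.normal(size=(200, 6))
y = (extra[:, 0] > 0).astype(int)      # label driven by one extra feature
X = np.hstack([stances, extra])        # one row per stance, 106 columns

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
scores = RFClassifierSketch().train(X_train, y_train).score(X_test, y_test)
print(scores)
```

The variants in the list above differ only in which columns are stacked into `X`, so the same wrapper can be reused for each of them.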
Model | Input data | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|
RFC n_est=200 | Statistics on filtered raw data from each sensor | 0.8329 | 0.8716 | 0.9084 | 0.8844 |
RFC n_est=200 | Right foot related: [ Interpolated Scaled stances, 3 features] | 0.7092 | 0.7708 | 0.8176 | 0.7897 |
RFC n_est=200 | Right foot related: [ Interpolated Scaled stances, 6 features] | 0.7341 | 0.7896 | 0.8370 | 0.8815 |
- Trying hybrid models: trying different hybrid models on right-foot stances and 3 features.
- Hybrid model: the hybrid model class, trained on right-foot stances with 3 features and on right-foot stances with 6 features. Two final notebooks train the hybrid model on all data, with and without statistics, for binary classification and severity detection.
The proposed methodology:
The model architecture for Severity Detection Multiclass Classification
First file: finding which features are most important for models trained on statistics and extracted features.
Second file: showing the most important features in different plots.
Model explanation and feature analysis show that the most important features are:
- Maximum force at heel strike.
- Maximum force from the accumulative signal.
- Swing time interval.
- Maximum force at toe off (Most different for people with higher levels of PD).
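One common way to obtain such a ranking from a random forest is the impurity-based `feature_importances_` attribute in scikit-learn. The sketch below assumes that approach, which may differ from the explanation method used in the notebooks. The data are synthetic, with the label deliberately driven by one column, so the output says nothing about the real dataset, and the feature names are illustrative labels only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative feature names matching the list above.
feature_names = ["max_heel_strike", "max_total_force", "swing_time", "max_toe_off"]

# Synthetic data in which only the last column carries signal (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 3] + 0.1 * rng.normal(size=300) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across features.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```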
This research explores Parkinson's disease symptoms and their impact on gait analysis. Using a public dataset, this study proposes a methodology to classify neurological states based on multidimensional time series by dividing the data into its repetitive pattern components. This approach improves the accuracy of machine learning models for Parkinson's disease diagnosis and severity detection, achieving accuracy scores of 99.69% and 99.73%, respectively. The study highlights important spatiotemporal features, such as the maximum force experienced at toe off, for accurately diagnosing and monitoring Parkinson's disease. Overall, this study demonstrates the potential of using segmented signals from time series datasets along with extracted features to improve the accuracy of Parkinson's disease diagnosis and severity detection.
References:
[1] E. Abdulhay et al., “Gait and tremor investigation using machine learning techniques for the diagnosis of Parkinson disease,” Future Generation Computer Systems, vol. 83, pp. 366–373, 2018.
[2] G. Gilmore, A. Gouelle, M. B. Adamson, M. Pieterman, and M. Jog, “Forward and backward walking in Parkinson disease: A factor analysis,” Gait & Posture, vol. 74, pp. 14–19, Oct. 2019, doi: 10.1016/J.GAITPOST.2019.08.005.