EU-TEACHING/teaching-automotive-demo

teaching-automotive-demo

Repository for the Simulator Study #2

Stress Preprocessor

Data preprocessing classes for offline and online pipelines

Run the offline preprocessor

First, make sure stress_preprocessor/config/config.json is up to date with the latest data schema information.
Then run the offline preprocessor:

python offline_preprocess_run.py -sid [subj_id]

Steps

  • load the datasets
  • assign the stress events
  • save the new datasets
  • clean and validate timestamps
  • feature extraction using NeuroKit2
  • save preprocessed datasets

The -sid option is required and specifies the participant's ID.
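
A required option such as -sid can be declared with argparse. The following is only a minimal sketch of that pattern, not the actual content of offline_preprocess_run.py; the function name parse_args and the attribute name subj_id are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical offline runner script."""
    parser = argparse.ArgumentParser(
        description="Run the offline stress preprocessor for one participant."
    )
    # -sid is mandatory: argparse exits with an error message if it is missing.
    parser.add_argument(
        "-sid",
        dest="subj_id",  # hypothetical attribute name
        required=True,
        help="participant ID",
    )
    return parser.parse_args(argv)


# Equivalent to: python offline_preprocess_run.py -sid P07
args = parse_args(["-sid", "P07"])
print(args.subj_id)
```

Passing the argument list explicitly keeps the sketch testable; a real script would call parse_args() with no arguments so that the values come from the command line.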

Stress Events

During offline preprocessing, once the datasets are loaded, stress events are assigned to them and the labelled datasets are saved as CSV files.

The stress events are:

{
    "1": {
        "accelerate_to_motorway": [[65, 75]],
        "cut_in_from_another_vehicle": [[87, 97]],
        "sharp_brake": [[93, 103]]
    },
    "2": {
        "join_platoon": [[38, 48]],
        "platooning": [[70, 80], [88, 147]],
        "platoon_vehicle_cut_out": [[77, 87]]
    },
    "3": {
        "traffic_light_sharp_break": [[33, 53]],
        "phantom_break": [[72, 92]],
        "road_crossing": [[103, 123]]
    },
    "6": {
        "traffic_light": [[39, 49], [104, 114],
                          [133, 143], [203, 213]],
        "phantom_break": [[66, 76], [176, 186]],
        "pedestrian_crossing": [[78, 88]],
        "cut_in_from_a_vehicle": [[308, 318], [533, 543]],
        "join_platoon_at_motorway": [[655, 665]],
        "platoon_vehicle_cutting_out": [[661, 671], [677, 687]]
    },
    "X": {
        "traffic_light_slow_down": [[20, 30]],
        "pedestrian_crossing": [[59, 69]]
    }
}

For time spans in which no stress event occurs, the value is "normal". The baseline dataset contains no stress events, so its Stress_Events column is None.
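
The labelling rule can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: it assumes the windows are elapsed times measured from the start of the scenario, that both window bounds are inclusive, and that the first matching event wins when windows overlap (as "cut_in_from_another_vehicle" and "sharp_brake" do above). The function name label_stress_events is hypothetical.

```python
def label_stress_events(times, events, default="normal"):
    """Return one label per timestamp.

    times  : iterable of elapsed times (assumed to share the windows' unit)
    events : dict mapping event name -> list of [start, end] windows
    """
    labels = []
    for t in times:
        label = default
        for name, windows in events.items():
            # Assumption: window bounds are inclusive on both sides.
            if any(start <= t <= end for start, end in windows):
                label = name
                break  # assumption: the first matching event wins on overlap
        labels.append(label)
    return labels


scenario_1 = {
    "accelerate_to_motorway": [[65, 75]],
    "cut_in_from_another_vehicle": [[87, 97]],
    "sharp_brake": [[93, 103]],
}
print(label_stress_events([10, 70, 95, 100], scenario_1))
```

Because every window is unpacked as a [start, end] pair, a flat entry such as [78, 88] in place of [[78, 88]] would raise an error here instead of being silently mislabelled.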

Extracted features

NeuroKit2 is used for ECG and EDA feature extraction.

ECG

  • "ECG_Raw": the raw signal.
  • "ECG_Clean": the cleaned signal.
  • "ECG_R_Peaks": the R-peaks marked as “1” in a list of zeros.
  • "ECG_Rate": heart rate interpolated between R-peaks.
  • "ECG_Quality": the quality of the cleaned signal.
  • "ECG_P_Peaks": the P-peaks marked as “1” in a list of zeros.
  • "ECG_Q_Peaks": the Q-peaks marked as “1” in a list of zeros.
  • "ECG_S_Peaks": the S-peaks marked as “1” in a list of zeros.
  • "ECG_T_Peaks": the T-peaks marked as “1” in a list of zeros.
  • "ECG_P_Onsets": the P-onsets marked as “1” in a list of zeros.
  • "ECG_P_Offsets": the P-offsets marked as “1” in a list of zeros (only when method in ecg_delineate() is wavelet).
  • "ECG_T_Onsets": the T-onsets marked as “1” in a list of zeros (only when method in ecg_delineate() is wavelet).
  • "ECG_T_Offsets": the T-offsets marked as “1” in a list of zeros.
  • "ECG_R_Onsets": the R-onsets marked as “1” in a list of zeros (only when method in ecg_delineate() is wavelet).
  • "ECG_R_Offsets": the R-offsets marked as “1” in a list of zeros (only when method in ecg_delineate() is wavelet).
  • "ECG_Phase_Atrial": atrial cardiac phase, marked by “1” for systole and “0” for diastole.
  • "ECG_Phase_Ventricular": ventricular cardiac phase, marked by “1” for systole and “0” for diastole.
  • "ECG_Phase_Completion_Atrial": atrial cardiac phase completion, expressed as a fraction from 0 to 1, representing the stage of the current cardiac phase.
  • "ECG_Phase_Completion_Ventricular": ventricular cardiac phase completion, expressed as a fraction from 0 to 1, representing the stage of the current cardiac phase.

The resulting ECG feature columns:

In [2]: ecg_feats_dfs[0].columns
Out[2]:
Index(['ECG_Raw', 'ECG_Clean', 'ECG_Rate', 'ECG_Quality', 'ECG_R_Peaks',
       'ECG_P_Peaks', 'ECG_P_Onsets', 'ECG_P_Offsets', 'ECG_Q_Peaks',
       'ECG_R_Onsets', 'ECG_R_Offsets', 'ECG_S_Peaks', 'ECG_T_Peaks',
       'ECG_T_Onsets', 'ECG_T_Offsets', 'ECG_Phase_Atrial',
       'ECG_Phase_Completion_Atrial', 'ECG_Phase_Ventricular',
       'ECG_Phase_Completion_Ventricular'],
      dtype='object')
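
Several of the columns above store events as “1” in a list of zeros. The sketch below illustrates that encoding in plain Python and shows how a mean heart rate can be recovered from R-peak markers. The helper names, the toy peak positions and the 4 Hz sampling rate are assumptions chosen to keep the example small; none of this is NeuroKit2 code.

```python
def indices_to_markers(indices, n_samples):
    """Build a 0/1 marker list with a 1 at every index in `indices`."""
    markers = [0] * n_samples
    for i in indices:
        markers[i] = 1
    return markers


def markers_to_indices(markers):
    """Recover the sample positions of the 1s in a marker list."""
    return [i for i, m in enumerate(markers) if m == 1]


def mean_rate_bpm(markers, sampling_rate):
    """Mean heart rate in beats per minute from the mean peak-to-peak interval."""
    peaks = markers_to_indices(markers)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to measure an interval")
    mean_interval_s = (peaks[-1] - peaks[0]) / (len(peaks) - 1) / sampling_rate
    return 60.0 / mean_interval_s


# Toy example: three R-peaks, 4 samples apart, at a 4 Hz sampling rate.
r_peaks = indices_to_markers([2, 6, 10], n_samples=12)
print(r_peaks)
print(markers_to_indices(r_peaks))
print(mean_rate_bpm(r_peaks, sampling_rate=4))
```

With peaks one second apart, the mean rate comes out at 60 beats per minute.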

EDA

  • "EDA_Raw": the raw signal.
  • "EDA_Clean": the cleaned signal.
  • "EDA_Tonic": the tonic component of the signal, or the Tonic Skin Conductance Level (SCL).
  • "EDA_Phasic": the phasic component of the signal, or the Phasic Skin Conductance Response (SCR).
  • "EDA_SCR_Onsets": the samples at which the onsets of the peaks occur, marked as “1” in a list of zeros.
  • "EDA_SCR_Peaks": the samples at which the peaks occur, marked as “1” in a list of zeros.
  • "EDA_SCR_Height": the SCR amplitude of the signal including the tonic component. Note that cumulative effects of close-occurring SCRs might lead to an underestimation of the amplitude.
  • "EDA_SCR_Amplitude": the SCR amplitude of the signal excluding the tonic component.
  • "EDA_SCR_RiseTime": the time taken for SCR onset to reach peak amplitude within the SCR.
  • "EDA_SCR_Recovery": the samples at which SCR peaks recover (decline) to half amplitude, marked as “1” in a list of zeros.
  • "EDA_SCR_RecoveryTime": the time taken for the SCR to recover (decline) to half amplitude.

The resulting EDA feature columns:

In [4]: eda_feats_dfs[0].columns
Out[4]:
Index(['EDA_Raw', 'EDA_Clean', 'EDA_Tonic', 'EDA_Phasic', 'EDA_SCR_Onsets',
       'EDA_SCR_Peaks', 'EDA_SCR_Height', 'EDA_SCR_Amplitude',
       'EDA_SCR_RiseTime', 'EDA_SCR_Recovery', 'EDA_SCR_RecoveryTime'],
      dtype='object')

First order differences

The features for which first-order differences are computed are listed under the key "fod_feats" in 'stress_preprocessor/config/offline_config.json'. The entries in this list are taken from the EDA and ECG feature names above.
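
A first-order difference is the change between consecutive samples, x[i] - x[i-1]. The sketch below shows the idea on plain lists. It is an illustration only: the "_diff" suffix follows the column names in the final dataframe, while the helper names and the choice to leave the first value empty are assumptions. On a pandas dataframe the same result is typically obtained with the diff() method.

```python
def first_order_diff(values):
    """Return x[i] - x[i-1] for each sample; the first sample has no predecessor."""
    if not values:
        return []
    diffs = [None]  # assumption: the first difference is left empty
    for prev, curr in zip(values, values[1:]):
        # Propagate missing values instead of failing on them.
        diffs.append(None if prev is None or curr is None else curr - prev)
    return diffs


def add_fod_features(table, fod_feats, suffix="_diff"):
    """Add a '<name>_diff' column for each feature named in `fod_feats`.

    `table` is a dict of column name -> list of values (a stand-in for a dataframe).
    """
    out = dict(table)
    for name in fod_feats:
        out[name + suffix] = first_order_diff(table[name])
    return out


table = {"ECG_Raw": [1.0, 1.5, 1.25], "EDA_Raw": [0.5, 0.5, 0.75]}
result = add_fod_features(table, fod_feats=["ECG_Raw", "EDA_Raw"])
print(result["ECG_Raw_diff"])
print(result["EDA_Raw_diff"])
```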

Final dataframe

final_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 124000 entries, 0 to 15499
Data columns (total 38 columns):
 #   Column                            Non-Null Count   Dtype  
---  ------                            --------------   -----  
 0   Time                              124000 non-null  object 
 1   ScenarioID                        123999 non-null  object 
 2   RepetitionID                      123999 non-null  object 
 3   Stress                            123999 non-null  float32
 4   ECG_Raw                           123999 non-null  float32
 5   ECG_Clean                         124000 non-null  float64
 6   ECG_Rate                          124000 non-null  float64
 7   ECG_Quality                       124000 non-null  float64
 8   ECG_R_Peaks                       124000 non-null  int64  
 9   ECG_P_Peaks                       124000 non-null  int64  
 10  ECG_P_Onsets                      124000 non-null  int64  
 11  ECG_P_Offsets                     124000 non-null  int64  
 12  ECG_Q_Peaks                       124000 non-null  int64  
 13  ECG_R_Onsets                      124000 non-null  int64  
 14  ECG_R_Offsets                     124000 non-null  int64  
 15  ECG_S_Peaks                       124000 non-null  int64  
 16  ECG_T_Peaks                       124000 non-null  int64  
 17  ECG_T_Onsets                      124000 non-null  int64  
 18  ECG_T_Offsets                     124000 non-null  int64  
 19  ECG_Phase_Atrial                  123504 non-null  float64
 20  ECG_Phase_Completion_Atrial       124000 non-null  float64
 21  ECG_Phase_Ventricular             123504 non-null  float64
 22  ECG_Phase_Completion_Ventricular  124000 non-null  float64
 23  EDA_Raw                           123999 non-null  float32
 24  EDA_Clean                         124000 non-null  float64
 25  EDA_Tonic                         124000 non-null  float64
 26  EDA_Phasic                        124000 non-null  float64
 27  EDA_SCR_Onsets                    124000 non-null  int64  
 28  EDA_SCR_Peaks                     124000 non-null  int64  
 29  EDA_SCR_Height                    124000 non-null  float64
 30  EDA_SCR_Amplitude                 124000 non-null  float64
 31  EDA_SCR_RiseTime                  124000 non-null  float64
 32  EDA_SCR_Recovery                  124000 non-null  int64  
 33  EDA_SCR_RecoveryTime              124000 non-null  float64
 34  ECG_Raw_diff                      123998 non-null  float32
 35  EDA_Raw_diff                      123998 non-null  float32
 36  ECG_Clean_diff                    124000 non-null  float64
 37  EDA_Clean_diff                    124000 non-null  float64
dtypes: float32(5), float64(16), int64(14), object(3)
memory usage: 34.5+ MB

ECG and EDA plots

The plots are saved in stress_preprocessor/graphs.

Run the online preprocessor

To run online preprocessing, initialize the Preprocessor and call its "online_run" method, passing it the streaming dictionary. The Preprocessor is located in stress_preprocessor/preprocessors/preprocessor.py.

To run a mock preprocessing, run:

python online_preprocess_run.py

Remaining issues

  1. Visualization of the raw time series must be adapted from the previous version.
  2. Error handling based on the error_col: replace the affected row with null values.
  3. Imputation is currently handled by NeuroKit2. Use the visualizations to decide whether a participant's data is bad enough to be removed entirely.
  4. Null values from the initial features still exist in the final dataframe.
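
One possible shape for the error handling in item 2 is sketched below. It is purely illustrative, since that step is not implemented yet: the column name "error", the choice to keep the Time value, and the function name null_out_error_rows are all assumptions.

```python
def null_out_error_rows(rows, error_col, keep=("Time",)):
    """Replace the values of flagged rows with None.

    rows      : list of dicts, one per sample (a stand-in for dataframe rows)
    error_col : name of the column holding a truthy flag for bad samples
    keep      : columns left untouched so the row stays aligned in time
    """
    cleaned = []
    for row in rows:
        if row.get(error_col):
            # Build a new dict so the input rows are not modified.
            row = {
                key: (value if key in keep or key == error_col else None)
                for key, value in row.items()
            }
        cleaned.append(row)
    return cleaned


rows = [
    {"Time": 0.0, "ECG_Raw": 0.1, "EDA_Raw": 2.0, "error": 0},
    {"Time": 0.1, "ECG_Raw": 9.9, "EDA_Raw": 2.1, "error": 1},
]
print(null_out_error_rows(rows, error_col="error"))
```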
