Using data from Spain's European Health Survey, this project aims to identify individuals at risk of depression or anxiety before symptoms emerge, applying machine learning models to predict the early stages of these conditions.
This directory contains the source code along with Jupyter notebooks for various stages of the project:
0_json_maker.ipynb
: Extracts the column names from the data Excel file and converts them into JSON format for convenient use throughout the project.

1_data_cleaning_main.ipynb
: Conducts data cleaning, converts Excel to a Pandas DataFrame, and exports it to CSV format.

2_expl_data_analysis_json.ipynb
: Performs Exploratory Data Analysis (EDA), including univariate and multivariate analysis, correlation analysis, and splitting into train and test datasets.

3_hyperparam_search_chatgpt.ipynb
: Uses ChatGPT for web scraping to obtain optimal hyperparameters for the model, along with their corresponding explanations.

4_machine_learning_model_main.ipynb
: Trains the machine learning model.

4_server-up.ipynb
: Creates a server using Python to host our project presentation web page.
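
As a rough illustration of the first step (column names from the Excel file to JSON), the sketch below may help. It is not the code in 0_json_maker.ipynb: the function names, the `{position: name}` JSON layout, and the use of `pandas.read_excel` are assumptions made for the example.

```python
import json
from pathlib import Path


def save_column_names(columns, json_path):
    """Write column names to a JSON file as {position: name} (assumed layout)."""
    mapping = {str(i): str(name) for i, name in enumerate(columns)}
    Path(json_path).write_text(
        json.dumps(mapping, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return mapping


def excel_column_names(excel_path):
    """Read only the header row of an Excel file (needs pandas and openpyxl)."""
    import pandas as pd  # imported lazily so the JSON helper works without pandas

    # nrows=0 loads no data rows, only the column labels.
    return list(pd.read_excel(excel_path, nrows=0).columns)
```

A call such as `save_column_names(excel_column_names(path_to_excel), path_to_json)` would chain the two helpers; both paths are placeholders for wherever the raw file and JSON output live.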
This directory contains exported machine learning models in pickle format for reuse:
selectkbest
: Contains pickle files generated by iterating with the SelectKBest method, representing a range of selected variables.
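
One way such a set of pickle files could be produced is sketched below, assuming scikit-learn's `SelectKBest` with the `f_classif` score function. The score function, the file naming, and the choice to pickle the fitted selector (rather than a downstream model) are assumptions, not necessarily what the project does.

```python
import pickle
from pathlib import Path

from sklearn.feature_selection import SelectKBest, f_classif


def fit_selectors(X, y, k_values, out_dir):
    """Fit one SelectKBest per k and pickle each fitted selector."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for k in k_values:
        # Keep the k features with the highest ANOVA F-score against y.
        selector = SelectKBest(score_func=f_classif, k=k).fit(X, y)
        path = out_dir / f"selectkbest_k{k}.pkl"  # hypothetical file name
        with path.open("wb") as fh:
            pickle.dump(selector, fh)
        paths[k] = path
    return paths
```

Loading a file back with `pickle.load` gives a selector whose `transform` reduces a feature matrix to the chosen k columns.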
This directory includes HTML code for our project presentation webpage:
web
: Contains images, tables, and other files used in the webpage.
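
For reference, a static page like this can be served locally with Python's standard library alone. The sketch below is a minimal example of that approach and may differ from what 4_server-up.ipynb actually does; the function name, host, and port handling are assumptions.

```python
import functools
import http.server


def make_server(directory, port=0):
    """Create an HTTP server that serves static files from `directory`.

    port=0 lets the operating system pick a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server(path_to_html_dir, 8000).serve_forever()` would then expose the page at `http://127.0.0.1:8000/` until the process is interrupted.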
This directory holds various versions of the project data:
interim
: Final versions of cleaned data.

json_files
: Storage for dictionaries of variables extracted in JSON format.

processed
: Contains splits of X and Y for train and test datasets.

raw
: Original data downloaded from INE, as it came.

sqlite
: Our table exported to SQLite database format.
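
The SQLite export can be approximated with the standard library, as in the sketch below. It assumes the cleaned data is available as a CSV file with a header row; the table name `survey` and the choice to store every column as TEXT are illustrative assumptions, not the project's actual schema.

```python
import csv
import sqlite3


def csv_to_sqlite(csv_path, db_path, table="survey"):
    """Load a CSV file into a SQLite table, storing every value as TEXT."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader)  # first row holds the column names
        rows = list(reader)
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f'DROP TABLE IF EXISTS "{table}"')
            conn.execute(f'CREATE TABLE "{table}" ({cols})')
            conn.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', rows
            )
    finally:
        conn.close()
    return len(rows)
```

The function returns the number of data rows written, which gives a quick check against the row count of the source CSV.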
This directory primarily contains images extracted from our notebooks and other project-related assets.