The Gender Prediction Model project develops a machine learning model that predicts a client's gender from their transaction data. It covers data preparation and analysis, model tuning, and evaluation.
To start working on this project, follow these steps:
- Clone the repository:
git clone git@github.com:Melodiz/transaction-gender-prediction.git Gender_transaction_base
- Navigate to the project directory:
cd Gender_transaction_base
- Install the required dependencies:
pip install -r requirements.txt
- Download the data from Kaggle
- Place the unpacked data in a folder named data in the root of the repository.
The project's directory structure is as follows:
Gender_transaction_base/
├── LICENSE
├── README.md
├── gender_by_transaction.ipynb
├── requirements.txt
└── data/
    ├── train.csv
    ├── test.csv
    ├── mcc_codes.csv
    ├── transactions.csv
    ├── trans_types.csv
    └── test_sample_submission.csv
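Before opening the notebook, it can help to confirm that the data landed where the tree above expects it. The following is a minimal sketch, not part of the repository: the helper name missing_data_files is hypothetical, and the file names are copied from the directory structure above.

```python
from pathlib import Path

# File names copied from the directory structure listed in this README.
EXPECTED_FILES = [
    "train.csv",
    "test.csv",
    "mcc_codes.csv",
    "transactions.csv",
    "trans_types.csv",
    "test_sample_submission.csv",
]


def missing_data_files(data_dir="data"):
    """Return the expected file names that are not present in data_dir.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

Run it from the root of the repository so that the relative path data resolves correctly.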
The project uses the following packages:
- numpy
- pandas
- matplotlib
- seaborn
- catboost
- scikit-learn
- xgboost
- lightgbm
- nltk
- gensim
- @jupyter-widgets/base
- jquery
- lodash
- plotly.js-dist-min
The best score with k-fold cross-validation is 0.845 (ROC AUC).
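To make the metric concrete, here is a small pure-Python sketch of how ROC AUC can be computed and averaged across folds. It is an illustration of the definition, not the notebook's code: the function names roc_auc and mean_fold_auc are hypothetical, and encoding the two genders as 0 and 1 is an assumption. In practice, scikit-learn (listed in the dependencies) provides roc_auc_score for the same purpose.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (assumed encoding).
    scores: predicted scores, higher meaning "more likely class 1".
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def mean_fold_auc(folds):
    """Average the ROC AUC over validation folds.

    folds: list of (labels, scores) pairs, one per held-out fold.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy data, not project results.
    fold_a = ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
    fold_b = ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7])
    print(mean_fold_auc([fold_a, fold_b]))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration but too slow for a full transaction dataset; a rank-based implementation such as roc_auc_score is the practical choice there.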