Classification Approach using Bogota-Colombia homeless survey dataset

Introduction

This repository trains and compares four supervised classification algorithms, using 14 features and one binary label, on a dataset published in the Colombian National Data Archive (ANDI).

Dataset

The dataset comes from the study 'COLOMBIA - Censo de Habitantes de Calle - CHC 2017 - Bogotá, D.C.' (source: http://microdatos.dane.gov.co/index.php/catalog/548/study-description). From it, 14 predictive features were selected:

P22: Main reason for starting to live on the streets (Discrete-Nominal)
P26_1: Received help from (Discrete-Nominal)
P30_1: Drugs consumed (Discrete-Nominal)
P9R: Gender (Discrete-Binary)
P17S3: Struggling with mental or emotional issues? (Discrete-Binary)
P16S9: Struggling with respiratory or cardiac issues in daily activities? (Discrete-Ordinal)
P17S7: Sexually transmitted disease? (Discrete-Binary)
P28R: Education (Discrete-Ordinal)
P16S6: Learning, remembering, and making decisions? (Discrete-Ordinal)
P8: Age (Discrete-Numeric)
P29: Source of income (Discrete-Nominal)
P31: Knows about government rehab programs? (Discrete-Binary)
P33_1: Fears for own life (Discrete-Binary)
P24: Main reason for still living on the streets (Discrete-Nominal)

There is one target:

P32: Do you use government rehab programs? (Discrete-Binary)
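
The feature and target selection above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: it assumes the CSV columns are named exactly after the survey codes listed above, and dropping rows with a missing target is an assumption made here for the example. The helper name split_features_target is hypothetical.

```python
import pandas as pd

# The 14 predictive features and the binary target, by survey code.
FEATURES = ["P22", "P26_1", "P30_1", "P9R", "P17S3", "P16S9", "P17S7",
            "P28R", "P16S6", "P8", "P29", "P31", "P33_1", "P24"]
TARGET = "P32"


def split_features_target(df):
    """Return (X, y), keeping only rows where the target is present.

    Dropping rows without a target is an assumption of this sketch.
    """
    subset = df[FEATURES + [TARGET]].dropna(subset=[TARGET])
    return subset[FEATURES], subset[TARGET]


# Tiny made-up frame standing in for the survey file (values are not real data).
demo = pd.DataFrame({col: [1, 2, 1] for col in FEATURES})
demo[TARGET] = [1, 2, None]   # last row has no target and is dropped
demo["OTHER"] = [0, 0, 0]     # unrelated column, ignored by the selection

X, y = split_features_target(demo)
print(X.shape, y.tolist())
```

With the real data, the frame would presumably come from something like pd.read_csv("CHC_2017.csv"); the separator and encoding of that file are not stated here, so they may need adjusting.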

Approach

Four models were used to classify the target:

K-Nearest Neighbor
Decision Tree
Logistic Regression
Support Vector Machine

Several parameter settings were tested for each model to see whether they improved its final accuracy.
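
One way to test parameters for the four models is a small grid search with scikit-learn, sketched below. Everything specific here is an assumption for illustration: the data is synthetic (make_classification with 14 features), and the parameter grids, the 3-fold cross-validation, and the 70/30 split are hypothetical choices rather than the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the survey data: 14 features, one binary label.
X, y = make_classification(n_samples=400, n_features=14, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Hypothetical search grids, one per model named in this section.
candidates = {
    "K-Nearest Neighbor": (KNeighborsClassifier(),
                           {"n_neighbors": [3, 5, 9]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0),
                      {"max_depth": [3, 5, None]}),
    "Logistic Regression": (LogisticRegression(max_iter=1000),
                            {"C": [0.1, 1.0, 10.0]}),
    "Support Vector Machine": (SVC(),
                               {"kernel": ["linear", "rbf"]}),
}

scores = {}
for name, (estimator, grid) in candidates.items():
    # Pick the best setting by cross-validated weighted F1 on the training set.
    search = GridSearchCV(estimator, grid, scoring="f1_weighted", cv=3)
    search.fit(X_train, y_train)
    # Score the refitted best model on the held-out test set.
    y_pred = search.predict(X_test)
    scores[name] = f1_score(y_test, y_pred, average="weighted")
    print(f"{name}: best params {search.best_params_}, "
          f"weighted F1 {scores[name]:.2f}")
```

Selecting on weighted F1 keeps the search consistent with the indicator used in the Results section; nominal features in the real data would likely need encoding before models such as KNN or SVM are fitted.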

Results

Taking the weighted-average F1 score as the indicator, the SVM with a linear kernel shows the best results, as summarized in the following table:

              precision    recall  f1-score   support

           1       0.73      0.80      0.76       361
           2       0.75      0.68      0.72       329

    accuracy                           0.74       690
   macro avg       0.74      0.74      0.74       690
weighted avg       0.74      0.74      0.74       690
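
As a quick check on how the averaged rows relate to the per-class rows, the macro and weighted F1 can be recomputed from the table itself. The sketch below uses only the rounded figures shown above, so it reproduces the averages to two decimals rather than exactly; the variable and function names are illustrative.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
report = {
    "1": (0.73, 0.80, 0.76, 361),
    "2": (0.75, 0.68, 0.72, 329),
}

total = sum(row[3] for row in report.values())


def macro(idx):
    """Unweighted mean of one metric across classes."""
    return sum(row[idx] for row in report.values()) / len(report)


def weighted(idx):
    """Mean of one metric across classes, weighted by class support."""
    return sum(row[idx] * row[3] for row in report.values()) / total


macro_f1 = round(macro(2), 2)
weighted_f1 = round(weighted(2), 2)
print(total, macro_f1, weighted_f1)  # 690 0.74 0.74
```

Because the two classes have similar support (361 and 329), the macro and weighted averages coincide at two decimals.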


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Classification Approach using Bogota-Colombia homeless survey dataset

Introduction

Dataset

Approach

Results

About

Releases

Packages

Languages

jairohndzmos/ClassificationApproachHomelessDataset

Folders and files

Latest commit

History

Repository files navigation

Classification Approach using Bogota-Colombia homeless survey dataset

Introduction

Dataset

Approach

Results

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages