aralara/DIQ-project-2022

DIQ-project-2022

Project of Data and Information Quality 2022-2023 course at Politecnico di Milano

The objective was to take two dirty datasets (Adult and Frogs) at different accuracy levels (50% - 90%) and evaluate how well machine learning techniques classify their tuples before and after outlier detection. Two outlier detection techniques were compared: a standard one based on the Z-score and an advanced one based on KNN. The datasets were then classified with RidgeClassifier and DecisionTreeClassifier to verify the accuracy.
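To illustrate the two outlier detection ideas named above, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the function names, the Z-score threshold of 3, the choice of k = 3, the median-based cutoff for the KNN score, and the toy data are all assumptions made for the example.

```python
import math
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical: nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def knn_outliers(rows, k=3, threshold=2.0):
    """Return indices of rows whose mean distance to their k nearest
    neighbours exceeds `threshold` times the median of those scores.
    The median-based cutoff is an assumption of this sketch."""
    scores = []
    for i, row in enumerate(rows):
        # Euclidean distances to every other row, smallest first
        dists = sorted(math.dist(row, other)
                       for j, other in enumerate(rows) if j != i)
        scores.append(sum(dists[:k]) / k)
    cutoff = threshold * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > cutoff]


# Hypothetical toy data: the last entry of each list is an injected error
ages = [31, 29, 35, 33, 30, 32, 34, 28, 36, 31, 30, 33, 400]
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2),
          (1.2, 1.1), (1.0, 0.8), (9.0, 9.0)]

z_flagged = zscore_outliers(ages)
knn_flagged = knn_outliers(points)
print(z_flagged, knn_flagged)
```

In a before/after comparison like the one described here, the flagged rows would be dropped or repaired, and RidgeClassifier and DecisionTreeClassifier would be trained on both the original and the cleaned data so that their accuracy can be compared.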

  • The folder contains the dirty datasets and code used to perform the cleaning activities
  • The report explains the pipeline of the implementation and the obtained results

Final grade: 3/3

Group members
