- Further information about the Summer School: https://www.eresearch.uni-goettingen.de/news-and-events/summer-schools/data-science-summer-school-2023/
- The slides of the (Meta)data Quality session: https://bit.ly/qa-dsss2023

This repository contains materials for the (Meta)data Quality session of the Göttingen Data Science Summer School 2023. The first part of the session introduces the topic; in the second part, students learn how to work with real data. The repository contains the data and code for the second part, organised into four tasks:
- finding outliers in CSV
- counting elements in XML
- introduction to SHACL
- introduction to JSON Schema

The code is written in Python, and the tools used are Python-based. Instructions for setting up virtual environments are available in the slides.
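
To give a flavour of the first two tasks, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the function names, the `pages` column, and the sample CSV and XML data are hypothetical, and the outlier rule (Tukey's fences based on the interquartile range) is just one common choice among several.

```python
import csv
import io
import statistics
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = """id,pages
1,120
2,135
3,128
4,140
5,131
6,125
7,138
8,9000
"""

SAMPLE_XML = """<records>
  <record><title>A</title><creator>X</creator></record>
  <record><title>B</title></record>
</records>"""


def find_outliers(csv_text, column, k=1.5):
    """Return the values of `column` outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Skip empty cells; a missing value is a different quality issue than an outlier.
    values = [float(row[column]) for row in rows if row[column].strip()]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def count_elements(xml_text):
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())


if __name__ == "__main__":
    print(find_outliers(SAMPLE_CSV, "pages"))
    print(count_elements(SAMPLE_XML))
```

In this made-up data the value 9000 lies far above the upper fence and is reported as an outlier, while the element counts show that only one of the two records has a `creator`. Both are typical of the kinds of quality issues the session looks at.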