(Meta)data Quality materials for Data Science Summer School 2023, Göttingen

This repository contains materials for the (Meta)data Quality session, part of the Göttingen Data Science Summer School 2023. The first part of the session introduces the topic; in the second part, students learn how to work with real data. The repository contains the data and code for the second part, organised into four tasks:

  1. finding outliers in CSV
  2. counting elements in XML
  3. introduction to SHACL
  4. introduction to JSON Schema

The code is written in Python, and the tools used are Python-based. Instructions for setting up virtual environments are available in the slides.
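
To give a feel for the first two tasks, here is a minimal sketch using only the Python standard library. It is not the code used in the session: the function names `find_outliers` and `count_elements`, the interquartile-range rule with `k=1.5`, and the sample inputs are all assumptions made for illustration.

```python
import csv
import io
import statistics
import xml.etree.ElementTree as ET
from collections import Counter


def find_outliers(csv_text, column, k=1.5):
    """Return values of `column` lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: the IQR rule is one common outlier heuristic,
    not necessarily the method taught in the session.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    # Skip empty or missing cells, convert the rest to numbers.
    values = [
        float(row[column])
        for row in rows
        if (row.get(column) or "").strip()
    ]
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def count_elements(xml_text):
    """Count how often each element tag occurs in an XML document."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())


if __name__ == "__main__":
    # Made-up sample data, only for demonstration.
    sample_csv = "id,pages\n1,10\n2,11\n3,12\n4,11\n5,10\n6,12\n7,11\n8,500\n"
    sample_xml = (
        "<records>"
        "<record><title>A</title></record>"
        "<record><title>B</title><creator>C</creator></record>"
        "</records>"
    )
    print(find_outliers(sample_csv, "pages"))
    print(count_elements(sample_xml))
```

For files on disk, the same functions can be fed with the contents of a CSV or XML file read as text. For large XML files, a streaming parser such as `xml.etree.ElementTree.iterparse` may be a better fit than loading the whole document.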