Serving CLDF data from a clld app

Since the CLDF data model was informed by the database schema of the clld toolkit, it is not surprising that clld apps are well suited to serve CLDF data on the web. Examples are WALS Online, which serves the WALS CLDF StructureDataset, and Dictionaria, which serves the CLDF Dictionaries submitted to the Dictionaria Zenodo Community.

As an example, we'll go through the steps necessary to create an app serving Marc Tang's dataset of classifiers and plural markers, which is archived on Zenodo.

Bootstrapping a clld app

The easiest way to get started with a clld app serving a CLDF dataset is to bootstrap the codebase with clld create:

  1. Create a fresh virtual environment for the app project and activate it:
    python -m virtualenv myenv
    source myenv/bin/activate
  2. Install clld in this environment:
    pip install "clld>=7.1.1"
  3. Install cookiecutter (which is needed for creating the app skeleton):
    pip install cookiecutter
  4. Create the project skeleton (run clld create -h for help on command options):
    clld create myapp cldf_module=StructureDataset
    We chose cldf_module=StructureDataset, because Tang's dataset contains typological data of the "standard" questionnaire format, which is best encoded as StructureDataset.

The project directory you just created should look like this:

$ tree myapp/
myapp/
├── CONTRIBUTING.md
├── development.ini
├── MANIFEST.in
├── myapp
│   ├── adapters.py
│   ├── appconf.ini
│   ├── assets.py
│   ├── datatables.py
│   ├── __init__.py
│   ├── interfaces.py
│   ├── locale
│   │   └── myapp.pot
│   ├── maps.py
│   ├── models.py
│   ├── scripts
│   │   ├── initializedb.py
│   │   ├── __init__.py
│   ├── static
│   │   ├── download
│   │   ├── project.css
│   │   └── project.js
│   ├── templates
│   │   ├── dataset
│   │   │   └── detail_html.mako
│   │   ├── myapp.mako
│   │   └── parameter
│   │       └── detail_html.mako
│   ├── tests
│   │   ├── conftest.py
│   │   ├── test_functional.py
│   │   └── test_selenium.py
│   └── views.py
├── requirements.txt
├── setup.cfg
├── setup.py
└── tox.ini

A clld app (i.e. the code in the "inner" myapp directory) is a regular Python package. To make this package known (i.e. accessible/importable in Python code), we have to install it. We'll do this as an "editable" install, including the development and test dependencies:

cd myapp
pip install -r requirements.txt

Loading the CLDF data into the app's database

Loading data into a clld app's database is done through code in scripts/initializedb.py. This code is executed when the clld initdb command is run. If the skeleton was created with a CLDF module passed for the cldf_module variable, example code showing how to iterate over rows in CLDF tables and insert corresponding objects into the database is added to scripts/initializedb.py. Thus, running clld initdb right away will already give us a working, if basic, app.
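
To make "iterating over rows in CLDF tables and inserting corresponding objects" a bit more concrete, here is a minimal sketch that uses only the standard library. It is not the code that clld create generates: the Language dataclass and the load_languages helper are hypothetical stand-ins for the app's database models, the sample rows are made up, and the ID, Name and Glottocode columns are assumed to follow the default CLDF LanguageTable layout.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Language:
    """Hypothetical stand-in for a database model object."""
    id: str
    name: str
    glottocode: Optional[str] = None


def load_languages(csv_file):
    """Turn each row of a languages.csv-like table into a Language object."""
    reader = csv.DictReader(csv_file)
    return [
        Language(
            id=row["ID"],
            name=row["Name"],
            # Treat an empty cell as a missing value.
            glottocode=row.get("Glottocode") or None,
        )
        for row in reader
    ]


# Made-up sample rows, for illustration only.
SAMPLE = "ID,Name,Glottocode\nl1,Language One,abcd1234\nl2,Language Two,\n"
languages = load_languages(io.StringIO(SAMPLE))
for lang in languages:
    print(lang.id, lang.name, lang.glottocode)
```

In a real app the objects would be database model instances added to a session rather than plain dataclasses, but the row-to-object mapping has the same shape.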

So, we retrieve the data from Zenodo:

cd ..
curl -o tangclassifiers.zip "https://zenodo.org/record/3889881/files/cldf-datasets/tangclassifiers-v1.zip?download=1"
unzip tangclassifiers.zip

The CLDF dataset is in the cldf subdirectory:

tree cldf-datasets-tangclassifiers-105b8f2/cldf
cldf-datasets-tangclassifiers-105b8f2/cldf
├── codes.csv
├── languages.csv
├── parameters.csv
├── requirements.txt
├── sources.bib
├── StructureDataset-metadata.json
└── values.csv
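
The metadata file ties these files together, and it is the file passed to clld initdb via --cldf. As a rough sketch, assuming the metadata follows the CSVW layout that CLDF builds on (a top-level "tables" list whose entries carry a "url"), the tables it describes can be listed with the standard library alone. The list_tables helper is hypothetical, and the sample metadata is made up and heavily shortened.

```python
import json
import tempfile
from pathlib import Path


def list_tables(metadata_path):
    """Return the file names ('url' values) of the tables listed in a metadata file."""
    with open(metadata_path, encoding="utf-8") as f:
        metadata = json.load(f)
    return [table["url"] for table in metadata.get("tables", [])]


# Made-up, heavily shortened metadata, for illustration only.
sample = {"tables": [{"url": "values.csv"}, {"url": "languages.csv"}]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "StructureDataset-metadata.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    tables = list_tables(path)

print(tables)
```

Against the unpacked dataset, the same helper could be pointed at cldf-datasets-tangclassifiers-105b8f2/cldf/StructureDataset-metadata.json to check which tables are present before loading.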

The code in scripts/initializedb.py also expects access to data of the Glottolog language catalog to enrich the data in the app, e.g. adding family affiliations for the languages in the sample. Thus we have to clone https://github.com/glottolog/glottolog or download a released version from Zenodo:

curl -o glottolog.zip "https://zenodo.org/record/3754591/files/glottolog/glottolog-v4.2.1.zip?download=1"
unzip glottolog.zip

Now we are ready to run

cd myapp
clld initdb \
--glottolog ../glottolog-glottolog-d9da5e2/ \
--cldf ../cldf-datasets-tangclassifiers-105b8f2/cldf/StructureDataset-metadata.json \
development.ini

or if you cloned the repositories to your home directory:

clld initdb development.ini --cldf ~/cldf-datasets-tangclassifiers-105b8f2/cldf/StructureDataset-metadata.json --glottolog ~/glottolog-glottolog-d9da5e2/

and start the app at http://localhost:6543 via

pserve development.ini

We can also run the test suite, which will be useful for further development of the app:

pytest