Virtual Knowledge Graph (VKG) over the Open Data Hub (ODH) powered by Ontop and curated by Ontopic.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
For a ready-to-use Docker environment with all prerequisites already installed, see the Docker environment section.
Get a copy of the repository:
git clone https://github.com/noi-techpark/odh-vkg.git
Change directory:
cd odh-vkg/
- Create the .env file, which specifies, among other things, the SPARQL endpoint port and the PostgreSQL external port (for debugging purposes):
cp .env.example .env
- Start the Docker container (see the dedicated section)
- Visit the SPARQL endpoint
- Open http://localhost:8080/portal/ in the browser and test some SPARQL queries.
- Note that synchronisation between the master and the slave takes some time. Until it is finished, some queries may return empty results.
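The endpoint can also be queried from the command line. The following is a minimal sketch, assuming the default port 8080 used in the link above, the public /sparql path, and a locally installed curl. The function name sparql_query, the SPARQL_ENDPOINT variable, and the sample query are illustrative and not part of the repository.

```shell
# Assumption: the endpoint listens on port 8080; adjust to the port set in .env.
SPARQL_ENDPOINT="${SPARQL_ENDPOINT:-http://localhost:8080/sparql}"

# Hypothetical helper: POST a SPARQL query (form-encoded) and print the JSON result.
sparql_query() {
  curl --silent --show-error --fail \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$1" \
    "$SPARQL_ENDPOINT"
}

# Example call (requires the containers to be running):
#   sparql_query 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```

If the result contains no bindings, the synchronisation mentioned above may simply not have finished yet.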
A ready-to-use Docker environment with all necessary prerequisites is provided for the project.
The default Docker Compose file (docker-compose.yml) uses 3 containers:
- A PostgreSQL DB containing a fragment of the ODH Tourism dataset
- Ontop as SPARQL endpoint
- Nginx as reverse proxy and cache
Install Docker (with Docker Compose) locally on your machine.
To start the containers in the foreground:
docker-compose pull && docker-compose up --build
The containers run in the foreground and can be stopped by pressing CTRL-C.
To start the containers in the background:
docker-compose pull && docker-compose up --build -d
To stop them:
docker-compose down
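When the containers are started in the background, the endpoint may not accept requests immediately. The following is a minimal sketch of a readiness check, assuming curl is installed and the landing page is served on port 8080. The function name wait_for_endpoint and its retry defaults are illustrative and not part of the repository.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_endpoint [url] [attempts] [delay_seconds]
wait_for_endpoint() {
  url="${1:-http://localhost:8080/}"   # assumption: default local port
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP error responses.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "endpoint is up after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "endpoint not reachable after $attempts attempts" >&2
  return 1
}
```

This only checks that the HTTP server responds. It does not tell whether the database synchronisation has finished.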
A second Docker Compose file (docker-compose.auth.yml) can be used for testing access control policies. It requires a running and configurable instance of Keycloak. See https://github.com/noi-techpark/authentication-server for instructions on how to install it locally. Refer to docs/authentication.md for instructions on how to configure Keycloak and the authentication proxy.
All NOI-specific infrastructure documentation and scripts can be found inside the infrastructure folder. See infrastructure/README.md for details.
The SPARQL endpoints do not query the production database directly but read-only slave instances, which are synchronized with the master database by two sync scripts that run on a regular schedule. The mobility sync can be found under infrastructure/utils/mobility-sync/, whereas the tourism sync is an external program handled directly from the Tourism servers.
- Landing page: /
- Public SPARQL endpoint: /sparql
- Public portal: /portal/
- Public predefined queries: /predefined/
- Portal with restricted access: /restricted/
- SPARQL endpoint with restricted access: /restricted/sparql
- Predefined queries with restricted access: /restricted/predefined/
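These paths are relative to the host serving the endpoints. As a small sketch, assuming a local deployment on http://localhost:8080 (the actual port is whatever .env specifies), a hypothetical helper odh_url can map each entry to a full URL. The helper, the ODH_BASE_URL variable, and the short endpoint names are illustrative and not part of the repository.

```shell
# Assumption: local deployment; override ODH_BASE_URL for other hosts.
ODH_BASE_URL="${ODH_BASE_URL:-http://localhost:8080}"

# Hypothetical helper: print the full URL for a named endpoint.
odh_url() {
  case "$1" in
    landing)               path='/' ;;
    sparql)                path='/sparql' ;;
    portal)                path='/portal/' ;;
    predefined)            path='/predefined/' ;;
    restricted-portal)     path='/restricted/' ;;
    restricted-sparql)     path='/restricted/sparql' ;;
    restricted-predefined) path='/restricted/predefined/' ;;
    *) echo "unknown endpoint: $1" >&2; return 1 ;;
  esac
  printf '%s%s\n' "$ODH_BASE_URL" "$path"
}

# Example: odh_url sparql
```

The helper only joins strings. Whether the restricted endpoints are reachable depends on the access control setup of the deployment.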
For building a newer version of the Docker image of the test database out of a fresh dump, please refer to Tourism master. This Docker image is published on Docker Hub.
For support, please contact help@opendatahub.com.
If you'd like to contribute, please follow these instructions:
- Fork the repository.
- Check out a topic branch from the main branch.
- Make sure the tests are passing.
- Create a pull request against the main branch.
More documentation can be found at https://docs.opendatahub.com.
The code in this project is licensed under the GNU Affero General Public License Version 3. See the LICENSE.md file for more information.
This project is REUSE compliant. More information about the usage of REUSE in NOI Techpark repositories can be found here.
Since the CI for this project checks for REUSE compliance, you might find it useful to use a pre-commit hook that checks for REUSE compliance locally. The pre-commit-config file in the repository root is already configured to check for REUSE compliance with the help of the pre-commit tool.
Install the tool by running:
pip install pre-commit
Then install the pre-commit hook via the config file by running:
pre-commit install
Some examples of possible SPARQL queries can be found in the SPARQL Queries folder. You can take a look at some data quality queries here and at some regular queries here.
The schema of the VKG can be visualized on the dedicated page.