KGTK: Knowledge Graph Toolkit

The Knowledge Graph Toolkit (KGTK) is a comprehensive framework for the creation and exploitation of large hyper-relational knowledge graphs (KGs), designed for ease of use, scalability, and speed. KGTK represents KGs in tab-separated (TSV) files with four columns: edge-identifier, head, edge-label, and tail. All KGTK commands consume and produce KGs represented in this simple format, so they can be composed into pipelines to perform complex transformations on KGs. KGTK provides:

a suite of import commands to import Wikidata, RDF, and popular graph representations into KGTK format;
a rich collection of transformation commands that make it easy to clean, union, filter, and sort KGs;
graph combination commands that support efficient intersection, subtraction, and joining of large KGs;
a query language using a variant of Cypher, optimized for querying KGs stored on disk, that supports efficient ad hoc queries;
graph analytics commands that support scalable computation of centrality metrics such as PageRank and degrees, as well as connected components and shortest paths;
advanced commands that support lexicalization of graph nodes and computation of multiple variants of text and graph embeddings over the whole graph;
a suite of export commands that transform KGTK KGs into commonly used formats, including the Wikidata JSON format, RDF triples, JSON documents for ElasticSearch indexing, and graph-tool;
a development environment using Jupyter notebooks that provides seamless integration with Pandas.
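
The four-column edge format and the idea of composing steps that all read and write it can be illustrated with a minimal Python sketch. It does not use KGTK itself: the sample edges are made up, the helper functions `read_edges`, `filter_label`, `sort_edges`, and `write_edges` are hypothetical, and the header names `id`, `node1`, `label`, and `node2` are assumed to be the column names KGTK uses for edge-identifier, head, edge-label, and tail.

```python
import csv
import io

# Assumed KGTK header names: id (edge-identifier), node1 (head),
# label (edge-label), node2 (tail).
KGTK_COLUMNS = ["id", "node1", "label", "node2"]

# Hypothetical sample graph, for illustration only.
SAMPLE_TSV = (
    "id\tnode1\tlabel\tnode2\n"
    "e1\tQ42\tP31\tQ5\n"
    "e2\tQ42\tP106\tQ36180\n"
    "e3\tQ5\tP279\tQ215627\n"
)


def read_edges(text):
    """Parse tab-separated text with a header row into a list of edge dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


def filter_label(edges, label):
    """Keep only the edges whose edge-label equals `label`."""
    return [edge for edge in edges if edge["label"] == label]


def sort_edges(edges, column="node1"):
    """Return the edges sorted by one column."""
    return sorted(edges, key=lambda edge: edge[column])


def write_edges(edges):
    """Serialize edge dicts back to tab-separated text with a header row."""
    lines = ["\t".join(KGTK_COLUMNS)]
    lines.extend("\t".join(edge[col] for col in KGTK_COLUMNS) for edge in edges)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Each step consumes and produces the same edge representation,
    # so steps can be chained in any order.
    edges = read_edges(SAMPLE_TSV)
    print(write_edges(sort_edges(filter_label(edges, "P31"))), end="")
```

Because every helper takes and returns the same list of edges, the steps can be reordered or extended freely, which mirrors (in a very simplified way) how commands sharing one file format can be chained into pipelines.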

KGTK can process Wikidata-sized KGs with billions of edges on a laptop. We have used KGTK in multiple use cases, focusing primarily on construction of subgraphs of Wikidata, analysis of over 300 Wikidata dumps since the inception of the Wikidata project, linking tables to Wikidata, construction of a commonsense KG combining multiple existing sources, and creation of Wikidata extensions for food security and the pharmaceutical industry.

KGTK is open source software, well documented, actively used and developed, and released under the MIT license. We invite the community to try KGTK. It is easy to get started with our tutorial notebooks, which are available and executable online.

Installation

The following instructions install KGTK and the KGTK Jupyter Notebooks on Linux and macOS systems.

If you want to install KGTK on a Microsoft Windows system, please contact the KGTK team.

Our KGTK installations use a Conda virtual environment. If you don't have the Conda tools installed, follow this guide to install them. We recommend installing Miniconda rather than the full Anaconda distribution.

Next, execute the following steps to install the latest stable release of KGTK:

conda create -n kgtk-env python=3.9
conda activate kgtk-env
conda install -c conda-forge graph-tool
conda install -c conda-forge jupyterlab
pip --no-cache install -U kgtk

Please see our installation document for more details. If you encounter problems with your installation, or are interested in a detailed explanation of these commands, read more about the installation procedure here.
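
After running the installation commands, a quick sanity check can confirm that the package is visible from the active environment. The sketch below uses only the Python standard library. It assumes that the pip package installs an importable module named `kgtk` and a command-line entry point named `kgtk`; the helper `check_install` is hypothetical and not part of KGTK.

```python
import importlib.util
import shutil


def check_install(module="kgtk", command="kgtk"):
    """Report whether a module is importable and a command is on the PATH.

    The default names are assumptions about what the KGTK package provides.
    """
    return {
        "module_importable": importlib.util.find_spec(module) is not None,
        "command_on_path": shutil.which(command) is not None,
    }


if __name__ == "__main__":
    # Run this inside the activated kgtk-env environment.
    print(check_install())
```

If either value is false, the environment is probably not activated or the installation did not complete.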

Installation issues on MacBooks with an M1 chip

Running `pip install -e .` (development mode) fails with errors about three libraries:

thinc
blis
tokenizers

To fix the thinc issue:

a. commenting out [this line in requirements.txt](https://github.com/usc-isi-i2/kgtk/blob/dev/requirements.txt#L11)

b. running `pip install thinc-apple-ops`

To fix the tokenizers issue, run the following commands in the Conda environment:

# Download and install Rust, following the on-screen instructions

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python/
pip install setuptools_rust
python setup.py install

Then continue installing KGTK with `pip install -e .`

Installing KGTK with Docker

Please refer to this document for installing KGTK with Docker.

Getting started

Online Documentation

You can read our latest documentation online at:

https://kgtk.readthedocs.io/en/latest/

KGTK Notebooks

For examples of using KGTK, please see our Tutorial Notebooks.

Releases

See all source code releases

KGTK Text Search API

The documentation for the KGTK Text Search API is here

KGTK Semantic Similarity API

The documentation for the KGTK Semantic Similarity API is here

How to cite

@inproceedings{ilievski2020kgtk,
  title={{KGTK}: A Toolkit for Large Knowledge Graph Manipulation and Analysis},
  author={Ilievski, Filip and Garijo, Daniel and Chalupsky, Hans and Divvala, Naren Teja and Yao, Yixiang and Rogers, Craig and Li, Ronpeng and Liu, Jun and Singh, Amandeep and Schwabe, Daniel and Szekely, Pedro},
  booktitle={International Semantic Web Conference},
  pages={278--293},
  year={2020},
  organization={Springer},
  url={https://arxiv.org/pdf/2006.00088.pdf}
}
