Project 18: Expanding FAIR database integration through elucidation and transformation of underlying graph schemas.

Abstract

Integrating life science data from different biomedical resources remains a major challenge, owing to fragmented data sources, multiple data formats, and multiple ontologies covering the same context, among other factors. To address this problem, we launched the BioDataFuse (BDF) project, which uses a modular framework to integrate data from different sources into context-specific knowledge graphs. Through this project, we have so far integrated and harmonised data from ten databases. However, integrating such resources requires a detailed understanding of their underlying graph schemas.

In this biohackathon, we would like to streamline the data integration process so that any FAIR-compliant biological database can be easily converted to a graph. The process would involve two steps: first, understanding the underlying graph schemas of data resources using RDF-config (https://github.com/dbcls/rdf-config/) and the VoID generator (https://github.com/JervenBolleman/void-generator); and second, converting the graph data into multiple compatible formats to improve accessibility and usability, using G2G Mapper (https://g2gml.readthedocs.io/), LinkML (https://linkml.io/) and BDF (https://github.com/BioDataFuse/pyBiodatafuse). Moreover, we would test the resilience of the process by demonstrating the ease of integration of multiple data sources within the RDF Portal (https://rdfportal.org) and beyond. Through this test, we hope to encourage database owners to include additional biomedical data sources in BDF, thus expanding the applicability of their resources beyond the “yet-another-resource” paradigm.
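To make the first step more concrete, the following is a minimal sketch of what "understanding the underlying graph schema" can mean in practice: counting instances per class and triples per (subject class, predicate, object class) pattern, in the spirit of the class and property partitions of the VoID vocabulary. It is an illustration only, not how RDF-config or the VoID generator is implemented. The toy triples, the `ex:` identifiers and the `summarise_schema` helper are all hypothetical.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Hypothetical toy triples (subject, predicate, object); every identifier is made up.
TRIPLES = [
    ("ex:gene1", RDF_TYPE, "ex:Gene"),
    ("ex:gene2", RDF_TYPE, "ex:Gene"),
    ("ex:disease1", RDF_TYPE, "ex:Disease"),
    ("ex:gene1", "ex:associatedWith", "ex:disease1"),
    ("ex:gene2", "ex:associatedWith", "ex:disease1"),
    ("ex:gene1", "ex:label", '"example gene"'),
]


def summarise_schema(triples):
    """Return (class_counts, link_counts) for a list of (s, p, o) triples."""
    # Map each typed resource to the set of classes it belongs to.
    classes_of = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of.setdefault(s, set()).add(o)

    # Number of instances per class.
    class_counts = Counter(c for cs in classes_of.values() for c in cs)

    # Number of triples per (subject class, predicate, object class) pattern.
    # Objects with no known class (for example literals) are grouped together.
    link_counts = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for sc in classes_of.get(s, {"untyped"}):
            for oc in classes_of.get(o, {"Literal/untyped"}):
                link_counts[(sc, p, oc)] += 1
    return class_counts, link_counts


class_counts, link_counts = summarise_schema(TRIPLES)
for (sc, p, oc), n in sorted(link_counts.items()):
    print(f"{sc} --{p}--> {oc}: {n}")
```

A summary of this kind is what makes the second step tractable: once the class-to-class patterns of a resource are known, they can be mapped onto a target model or format.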

Project flash presentation

Resources

Project GitHub Repo.
BioDataFuse Web Interface.
BioDataFuse Python package.
BioDataFuse Web Interface source code.
BioHackrXiv.
Slack - This will be the main channel of communication between in-person and virtual participants throughout the hackathon.

Working ethics

⚖️ GitHub issues and pull requests will be used so that multiple people can work efficiently on the GitHub repository.
🚫 No commits should be made directly to the main branch of the GitHub repository.
⚙️ New Python functions should be accompanied by corresponding unit tests and documentation.
🤝 The main aim of the hackathon is collaboration, so please feel free to ask questions or provide feedback whenever in doubt. We believe there are no dumb questions.
📆 To ensure good communication among team members, we will hold two daily stand-ups (pre- and post-hacking), in which each participant gives an update of under one minute on work done and work in the pipeline.

Leads

Name	Affiliation	GitHub	LinkedIn
Tooba Abbassi-Daloii	Maastricht University, NL	@tabbassidaloii	Link
Yojana Gadiya	Fraunhofer ITMP ScreeningPort, DE	@YojanaGadiya	Link

Members

Toshiaki Katayama
Javier Millan Acosta
Egon Willighagen, @egonw, LinkedIn
Dominik Martinat, @dominikmartinat
Shuichi Kawashima
