Implement Biolink 2.0 version of provenance #11

Closed
mmersmann opened this issue Jun 4, 2021 · 13 comments

@mmersmann

mmersmann commented Jun 4, 2021

Due: July 1, 2021

Details in architecture repo Git issue here.

@mmersmann
Author

Update: We have documentation from the EPC group describing how this will need to look. It means updating our data ingest code so that our edge properties are correct. We may also need to update Automat to make sure the data is served in the required format.
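
As a rough illustration of what "getting edge properties to look correct" could involve, here is a minimal sketch of attaching provenance to an edge during ingest. The slot names `biolink:original_knowledge_source` and `biolink:aggregator_knowledge_source`, the `infores:` prefix check, and the `add_provenance` helper are all assumptions made for illustration. They are not taken from the EPC documentation or from the actual ingest code, so check them against the Biolink Model release in use.

```python
from typing import Any, Dict, List

# Assumed slot names (not confirmed by this thread); verify against the
# Biolink Model release being targeted.
ORIGINAL_SOURCE = "biolink:original_knowledge_source"
AGGREGATOR_SOURCE = "biolink:aggregator_knowledge_source"


def add_provenance(edge: Dict[str, Any], original: str,
                   aggregators: List[str]) -> Dict[str, Any]:
    """Return a copy of `edge` with provenance properties attached.

    Hypothetical helper: every source is expected to be an `infores:` CURIE.
    """
    for curie in [original, *aggregators]:
        if not curie.startswith("infores:"):
            raise ValueError(f"expected an infores CURIE, got {curie!r}")
    annotated = dict(edge)  # shallow copy so the input edge is not mutated
    annotated[ORIGINAL_SOURCE] = original
    annotated[AGGREGATOR_SOURCE] = list(aggregators)
    return annotated


# Placeholder identifiers, purely illustrative.
example_edge = {
    "subject": "EXAMPLE:1",
    "predicate": "biolink:related_to",
    "object": "EXAMPLE:2",
}
example_annotated = add_provenance(
    example_edge, "infores:example-source", ["infores:example-aggregator"]
)
```

Returning a copy rather than mutating the input keeps the sketch safe to run over edges that are shared between graphs.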

@cbizon

cbizon commented Jul 2, 2021

This is going to be handled upstream of automat by data services. But leaving here to track.

@mmersmann
Author

Status: In progress; all back-end work is finished. Provenance works fine. Plater has been updated to add the last bit of provenance.

Issue: Loading the Neo4j instances using KGX isn't working well. Phil is troubleshooting.

Next Steps:

  1. Build graphs.
  2. Load graphs.

ETA: Early next week, for graphs in Data Services.

@PhillipsOwen
Contributor

this has been accomplished for the following plater instances:
CTD
GToPdb
IntAct
uberongraph
HGNC
Human GOA
Drug central

others are on the way.

@mmersmann
Author

Everything that has been created by data services has provenance in it. Some data services have KGX file work to be done. ETA one more week.

@PhillipsOwen
Contributor

PhillipsOwen commented Aug 4, 2021

The following datasets now have provenance built in. These datasets also support Biolink Model 2.1.0 and have been deployed to automat/plater:

Biolink
CTD
DrugCentral
GtoPdb
Hetio
HGNC
HMDB
HumanGOA
IntAct
PANTHER
PHAROS
UberGraph
Ontological hierarchy

remaining datasets are:
UniRef/viral proteome
GTEx
GWAS
CORD-19
Text Mining KP (blocked)

Robokop KG then needs to be rebuilt.

@PhillipsOwen
Contributor

Quick update, as of 8/13/2021

biolink 2.1.0 compatible, latest raw datasets.

Biolink - latest parse installed in automat/plater on 8/6
Cord19 - latest parse installed in automat/plater on 8/13 (replaces Cord19-scigraph and cord19-bites)
CTD - latest parse installed in automat/plater on 8/6
DrugCentral - latest parse installed in automat/plater on 8/6
GtoPdb - latest parse installed in automat/plater on 8/6
Hetio - latest parse installed in automat/plater on 8/6
HGNC - latest parse installed in automat/plater on 8/6
HMDB - latest parse installed in automat/plater on 8/6
HumanGOA - latest parse installed in automat/plater on 8/6
IntAct - latest parse installed in automat/plater on 8/6
PANTHER - latest parse installed in automat/plater on 8/6
PHAROS - latest parse installed in automat/plater on 8/6
UberGraph - latest parse installed in automat/plater on 8/9

in process

ontological-hierarchy - in process
viral-proteome - in process
gtex - in process

next up

gwas - next up.
foodb/foodon - next up. need to incorporate foodon and merge with foodb data

on hold

textminingkp - standing by for updated data from another source
chemical-normalization - needs an expanded dataset
covidkopkg (aggregate) - waiting for complete set of component graph data
robokopkg (aggregate) - waiting for complete set of component graph data

unknown state

mychem - in automat/plater, has DS skeleton, unknown state
topmed - in automat/plater, not in DS, unknown state

@PhillipsOwen
Contributor

oops

@PhillipsOwen
Contributor

PhillipsOwen commented Aug 20, 2021

Update as of 8/20


2.1.0 biolink compatible platers

Biolink - latest parse installed in automat/plater on 8/17
Cord19 - latest parse installed in automat/plater on 8/17 (replaces Cord19-scigraph and cord19-bites)
CTD - latest parse installed in automat/plater on 8/6
DrugCentral - latest parse installed in automat/plater on 8/17
GtoPdb - latest parse installed in automat/plater on 8/6
Hetio - latest parse installed in automat/plater on 8/17
HGNC - latest parse installed in automat/plater on 8/17
HMDB - latest parse installed in automat/plater on 8/6
HumanGOA - latest parse installed in automat/plater on 8/6
IntAct - latest parse installed in automat/plater on 8/6
Ontological-Hierarchy - latest parse installed in automat/plater on 8/13
PANTHER - latest parse installed in automat/plater on 8/6
PHAROS - latest parse installed in automat/plater on 8/17
UberGraph - latest parse installed in automat/plater on 8/9
ViralProteome/UniRef - latest parse installed in automat/plater on 8/19


In process

GTEx - parsed, needs to be put in a graph and installed in plater


Next up

GWAS - blocked by allele registry, Evan to find work-around


On hold

foodb/foodon - need to incorporate foodon and merge with foodb data
textminingkp - standing by for updated data from another source
chemical-normalization - needs an expanded dataset


Aggregate graphs

Robokop Base - Components ready, needs to be aggregated
Robokop Genetics, Obesityhub - Waiting for GWAS
Covidkop - Waiting on Robokop Genetics

@PhillipsOwen
Contributor

PhillipsOwen commented Sep 1, 2021

Update as of 9/3


2.1.0 biolink compatible platers complete

Biolink - latest parse installed in automat/plater on 8/17
Cord19 - latest parse installed in automat/plater on 8/17 (replaces Cord19-scigraph and cord19-bites)
CTD - latest parse installed in automat/plater on 8/6
DrugCentral - latest parse installed in automat/plater on 8/17
GTEx - latest parse installed in automat/plater on 8/31
GtoPdb - latest parse installed in automat/plater on 8/6
GWAS - latest parse installed in automat/plater on 9/3
Hetio - latest parse installed in automat/plater on 9/1
HGNC - latest parse installed in automat/plater on 8/17
HMDB - latest parse installed in automat/plater on 8/6
HumanGOA - latest parse installed in automat/plater on 8/6
IntAct - latest parse installed in automat/plater on 8/6
Ontological-Hierarchy - latest parse installed in automat/plater on 8/13
PANTHER - latest parse installed in automat/plater on 8/6
PHAROS - latest parse installed in automat/plater on 8/31
UberGraph - latest parse installed in automat/plater on 9/1


On hold

foodb/foodon - need to incorporate foodon and merge with foodb data
textminingkp - standing by for updated data from another source
chemical-normalization - needs an expanded dataset


Aggregate graphs

Robokop Base (pharos, uberongraph and hetio updates, orphan node removal)
Robokop Genetics, Obesityhub - In process - needs node rework
Covidkop - Waiting on Robokop Genetics

@richakanwar13

As per meeting on 9/15, can be closed.

@mmersmann
Author

@PhillipsOwen to confirm with @YaphetKG that Covidkop is complete. Can then re-close.

@mmersmann mmersmann reopened this Sep 15, 2021
@mmersmann
Author

Covidkop confirmed complete. Closing.
