An application to aid in normalizing homology data.
See Aon: a service to augment Alliance Genome Resource data with additional species for more information.
Originally forked from this code repository.
This repository uses Poetry to manage dependencies and build packages.
To get started, clone the repository and install the dependencies:
git clone
cd geneweaver-ortholog-normalizer
poetry install
You can activate the virtual environment with:
poetry shell
The Flask server can be started by running:
poetry run flask --app geneweaver.aon.app run
The application has a command-line interface for running certain management commands.
gwaon
: /gwɑːn/ (noun) - A command-line interface for the Geneweaver Ortholog Normalizer.
You will need a PostgreSQL database to load the ortholog data into. This has been tested on PostgreSQL 12 and should also work on later versions.
The database schema can be created by running the following command:
poetry run gwaon create-schema
The ortholog data can be loaded by running the following command:
poetry run gwaon load agr
The service connects to the geneweaver database to access the tables gene, species, and genedb.
Update the database URLs in:
- flask/src/config.py
- flask/alembic.ini
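The exact setting names in these two files are not shown here. As a rough sketch, assuming both expect a standard `postgresql://` connection URL, the snippet below shows one way to assemble such a URL so that special characters in the credentials are percent-encoded. The `build_postgres_url` helper and the example values are hypothetical and not part of this repository.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, port, database):
    """Assemble a postgresql:// URL, percent-encoding the credentials.

    Hypothetical helper for illustration; not part of this repository.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own connection details.
    print(build_postgres_url("agr_user", "p@ss/word", "localhost", 5432, "agr"))
```

If the resulting URL is pasted into an `.ini` file, percent signs usually need to be doubled (`%%`) because of config-file interpolation.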
If an agr database does not exist, create an empty database to store the agr data.
If the AGR tables have not been created, create them with alembic from the flask directory:
cd flask
alembic upgrade head
Download the most recent ORTHO_FILE from https://www.alliancegenome.org/downloads#orthology as TSV.
Make sure to set the ORTHO_FILE constant in flask/src/service.py to tell the module which file to load.
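Before loading, it can be useful to confirm that the downloaded file parses as expected. The sketch below is not how service.py reads the file; it is a minimal standalone check that assumes the TSV may start with `#` comment lines followed by a header row. The `iter_tsv_rows` helper and the column names in the sample are made up for illustration.

```python
import csv
import io


def iter_tsv_rows(handle):
    """Yield each data row as a dict, skipping blank lines and '#' comments."""
    data_lines = (
        line for line in handle if line.strip() and not line.startswith("#")
    )
    yield from csv.DictReader(data_lines, delimiter="\t")


if __name__ == "__main__":
    # Tiny in-memory stand-in for a downloaded file; column names are invented.
    sample = io.StringIO(
        "# comment header\n"
        "Gene1ID\tGene2ID\tAlgorithms\n"
        "HGNC:1\tMGI:2\tPANTHER|OMA\n"
    )
    for row in iter_tsv_rows(sample):
        print(row)
```

To check a real download, pass an open file handle (for example `open(path, newline="")`) in place of the in-memory sample.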
For now, the database is loaded by running the service.py module as a script. This fills the alg_algorithm, gn_gene, hom_homology, ora_ortholog_algorithms, ort_ortholog, and sp_species tables.
python flask/src/service.py
The agr database should now be populated. The following command runs the service:
python flask/app.py
If there is missing information, data can be added to the sp_species, gn_gene, and ort_ortholog tables by running the add_missing_info.py script.
Make sure you have set the correct file paths to the flask/missing_info/missing_genes.csv and flask/missing_info/missing_orthologs.csv files.
Then run the following script:
python flask/src/add_missing_info.py
Old Endpoint | New Endpoint |
---|---|
agr_to_geneweaver_species/{sp_id}: | /species/{species_id}/geneweaver_id |
all_algorithms: | /algorithms |
all_genes: | /genes |
all_homology: | /homologs |
all_orthologs: | /orthologs |
all_species: | /species |
get_algorithm_by_name/{alg_name}: | /algorithms?name={name} |
get_from_gene_of_ortholog_by_id/{ort_id}: | /orthologs/{ortholog_id}/genes |
get_gene_species_name/{ode_ref_id}/{ode_gene_id}: | WIP |
get_genes_by_ode_gene_id/{ode_ref_id}/{ode_gene_id}: | WIP |
get_genes_by_prefix/{gn_prefix}: | /genes?prefix={prefix} |
get_genes_by_species/{sp_name}: | /species/{species_id}/genes |
get_homologous_ode_gene_ids_for_gene/{ode_ref_id}/{gdb_name}: | WIP |
get_homology_by_gene/{gn_id}: | /homologs?gene_id={gene_id} |
get_homology_by_gene_and_source/{gn_id}/{hom_source_name}: | /homologs?source={source} |
get_homology_by_id/{hom_id}: | /homologs/{homolog_id} |
get_homology_by_id_and_source/{hom_id}/{hom_source_name}: | /homologs/{homolog_id}?source={source} |
get_homology_by_id_and_species/{hom_id}/{sp_id}: | /homologs/{homolog_id}?species={species} |
get_homology_by_ode_gene_id/{ode_gene_id}: | WIP |
get_homology_by_ode_gene_ids: | WIP |
get_homology_by_species/{sp_id}: | /homologs?species={species} |
get_intersect_by_homology: | WIP |
get_ode_gene_by_gdb_id/{gdb_id}: | WIP |
get_ode_gene_by_gene_id/{ode_gene_id}: | WIP |
get_ode_gene_by_species/{ode_gene_id}/{sp_name}: | WIP |
get_ode_genes_from_hom_id/{hom_id}/{target_gdb_id}: | WIP |
get_ort_id_if_gene_is_ortholog/{ode_gene_id}/{ode_ref_id}: | WIP |
get_ortholog_by_algorithm/{alg_name}: | /orthologs?algorithm={algorithm} |
get_ortholog_by_from_gene_and_gdb/{from_ode_gene_id}/{gdb_id}: | WIP |
get_ortholog_by_from_species/{sp_name}: | /orthologs?from_species={species_id} |
get_ortholog_by_id/{ort_id}: | /orthologs/{ortholog_id} |
get_ortholog_by_to_and_from_species/{to_sp_name}/{from_sp_name}: | /orthologs?to_species={species_id}&from_species={species_id} |
get_ortholog_by_to_from_species_and_algorithm/{to_sp_name}/{from_sp_name}/{alg_name}: | /orthologs?to_species={species_id}&from_species={species_id}&algorithm={algorithm} |
get_ortholog_by_to_species/{sp_name}: | /orthologs?to_species={species_id} |
get_orthologous_species/{ode_gene_id}/{ode_ref_id}: | WIP |
get_orthologs_by_from_gene/{ode_ref_id}/{ode_gene_id}: | WIP |
get_orthologs_by_from_gene_and_best/{from_ode_ref_id}/{from_ode_gene_id}/{best}: | WIP |
get_orthologs_by_from_to_gene_and_best/{from_ode_ref_id}/{from_ode_gene_id}/{to_ode_ref_id}/{to_ode_gene_id}/{best}: | WIP |
get_orthologs_by_from_to_gene_and_revised/{from_ode_ref_id}/{from_ode_gene_id}/{to_ode_ref_id}/{to_ode_gene_id}/{ort_best_revised}: | WIP |
get_orthologs_by_num_algorithms/{num}: | /orthologs?possible_match_algorithms={num_algorithms} |
get_orthologs_by_symbol/{sym}/{orig_species}/{homologous_species}: | WIP |
get_orthologs_by_to_and_from_gene/{from_ode_ref_id}/{from_ode_gene_id}/{to_ode_ref_id}/{to_ode_gene_id}: | WIP |
get_orthologs_by_to_gene/{ode_ref_id}/{ode_gene_id}: | WIP |
get_sp_id_by_hom_id/{hom_id}: | /homologs/{homolog_id}?species={species_id} |
get_species_by_id/{sp_id}: | /species/{species_id} |
get_species_homologs_list: | /species/{species_id}/homologs |
get_to_gene_of_ortholog_by_id/{ort_id}: | WIP |
id_convert_agr_to_ode/{gn_id}: | /genes/{gene_id}/geneweaver_id |
id_convert_ode_to_agr/{ode_gene_id}/{ode_ref_id}: | /geneweaver/genes/{geneweaver_id}/aon_id |
if_gene_has_homolog/{ode_gene_id}: | WIP |
transpose_genes_by_species: | WIP |
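
Several of the new endpoints take query parameters where the old ones used path segments. As an illustrative sketch, the snippet below builds request URLs for a few rows of the table. The base URL is an assumption (a local development server on port 5000), and `build_url` is a hypothetical helper, not part of the service.

```python
from urllib.parse import urlencode

# Assumed base URL; replace with wherever the service is actually running.
BASE_URL = "http://localhost:5000"


def build_url(path, **params):
    """Join BASE_URL, a path from the table, and optional query parameters."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    # Replaces get_ortholog_by_to_and_from_species/{to_sp_name}/{from_sp_name}
    print(build_url("/orthologs", to_species=1, from_species=2))
    # Replaces get_species_by_id/{sp_id}
    print(build_url("/species/1"))
```

Parameters set to `None` are dropped, so the same call can serve both the filtered and unfiltered forms of an endpoint.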