
servierhub/top-life-sciences

Servier Contributed

Top life sciences open source software

This is an automatically generated ranked list of open source software from pharmaceutical companies, cross-organization collaborations, biotechnology companies, research institutes, open source communities and individuals, plus some life-science software from technology companies.

It is built from a curated list of GitHub accounts and is periodically refreshed from the repositories of those accounts.

You can also see what these accounts have updated recently and which topics this software covers.

Ranked by starred repositories

Note

stars - number of people who especially appreciated the repository
forks - number of people who have cloned the repository in order to modify it
watchers - number of people who are monitoring changes in the repository
main programming language
license
last update date & time
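
The ranking itself is not described in detail here, but the list suggests that repositories are sorted by star count and that repositories with equal stars share a rank, with the next rank not skipped (for example, two entries at rank 67 followed by rank 68). The sketch below illustrates that scheme under this assumption; `Repo` and `dense_rank` are hypothetical names chosen for illustration, not the generator's actual code.

```python
from typing import NamedTuple


class Repo(NamedTuple):
    """Hypothetical record holding only the fields needed for ranking."""
    name: str
    stars: int


def dense_rank(repos: list[Repo]) -> list[tuple[int, Repo]]:
    """Rank repositories by stars, descending; equal star counts share a rank."""
    # sorted() is stable, so tied repositories keep their input order.
    ordered = sorted(repos, key=lambda r: r.stars, reverse=True)
    ranked = []
    rank = 0
    previous_stars = None
    for repo in ordered:
        # Only advance the rank when the star count changes (dense ranking).
        if repo.stars != previous_stars:
            rank += 1
            previous_stars = repo.stars
        ranked.append((rank, repo))
    return ranked


# Star counts taken from four entries in the list below.
sample = [
    Repo("shenwei356/taxonkit", 342),
    Repo("MolecularAI/GraphINVENT", 356),
    Repo("chembl/chembl_webresource_client", 356),
    Repo("calico/basenji", 373),
]

for rank, repo in dense_rank(sample):
    print(rank, repo.name, repo.stars)
```

With this approach the two repositories at 356 stars receive the same rank and the next repository gets the following rank, which matches the tie pattern visible in the table.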

Rank Software
1 google-deepmind/alphafold
Open source code for AlphaFold.
11987 2135 226 Python Apache-2.0 license 2023-04-05 09:45:53
2 deepchem/deepchem
Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology
biology, deep-learning, drug-discovery, hacktoberfest, materials-science, quantum-chemistry
5220 1626 Python MIT License 2024-06-08 13:03:11
3 biopython/biopython
Official git repository for Biopython (originally converted from CVS)
bioinformatics, biopython, dna, genomics, phylogenetics, protein, protein-structure, python, sequence-alignment
4213 1728 168 Python Unknown LICENSE
4 google/deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
bioinformatics, deep-learning, deep-neural-network, deepvariant, dna, genome, genomics, machine-learning, ngs, science, sequencing, tensorflow
3100 698 159 Python BSD-3-Clause license 2024-03-19 19:20:10
5 facebookresearch/esm
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
2917 577 63 Python MIT license 2022-10-18 13:38:47
6 aqlaboratory/openfold
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
alphafold2, protein-structure, pytorch
2572 466 Python Apache License 2.0 2024-06-04 08:33:28
7 rdkit/rdkit
The official sources for the RDKit library
c-plus-plus, cheminformatics, python, rdkit
2483 845 HTML BSD 3-Clause "New" or "Revised" License 2024-06-08 03:18:22
8 AstraZeneca/awesome-explainable-graph-reasoning
A collection of research papers and software related to explainability in graph machine learning.
awesome-list, deep-learning, explainable-ai, explainable-ml, graph, graph-algorithms, graphml
1941 129 Apache License 2.0 2022-04-04 14:54:08
9 OpenGene/fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
adapter, bioinformatics, duplication, fastq, filter, filtering, illumina, merging, ngs, overlap, polyg, preprocessing, qc, quality, quality-control, sequencing, splitting, trimming, umi
1803 333 C++ MIT License 2024-04-07 08:16:11
10 scverse/scanpy
Single-cell analysis in Python. Scales to >1M cells.
anndata, bioinformatics, data-science, machine-learning, python, scanpy, scverse, transcriptomics, visualize-data
1789 579 Python BSD 3-Clause "New" or "Revised" License 2024-06-07 08:43:34
11 lh3/minimap2
A versatile pairwise aligner for genomic and spliced nucleotide sequences
bioinformatics, genomics, sequence-alignment, spliced-alignment
1708 396 C Other 2024-05-22 19:58:33
12 allenai/scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
bioinformatics, biomedical, custom-pipes, nlp, scientific-documents, spacy
1629 221 52 Python Apache-2.0 license 2024-03-08 05:57:56
13 broadinstitute/gatk
Official code repository for GATK versions 4 and up
bioinformatics, dna, gatk, genome, genomics, ngs, science, sequencing, spark
1621 577 156 Java specific 2023-12-13 22:53:56
14 bioconda/bioconda-recipes
Conda recipes for the bioconda channel.
bioinformatics, conda, hacktoberfest, package-management
1595 3089 96 Shell MIT license
15 samtools/samtools
Tools (written in C using htslib) for manipulating next-generation sequencing data
1572 572 C Other 2024-06-07 09:32:59
16 Slicer/Slicer
Multi-platform, free open source software for visualization and image computing.
3d-printing, 3d-slicer, c-plus-plus, computed-tomography, image-guided-therapy, image-processing, itk, kitware, medical-image-computing, medical-imaging, national-institutes-of-health, neuroimaging, nih, python, qt, registration, segmentation, tcia-dac, tractography, vtk
1521 520 38 C++ specific
17 lh3/bwa
Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
bioinformatics, fm-index, genomics, sequence-alignment
1468 547 C GNU General Public License v3.0 2024-04-15 02:54:32
18 DeepGraphLearning/torchdrug
A powerful and flexible machine learning platform for drug discovery
deep-learning, drug-discovery, graph-neural-networks, pytorch
1407 194 31 Python Apache-2.0 license 2023-07-16 22:37:17
19 lh3/seqtk
Toolkit for processing sequences in FASTA/Q formats
bioinformatics, sequence-analysis
1332 310 C MIT License 2023-10-24 15:01:39
20 galaxyproject/galaxy
Data intensive science for everyone.
bioinformatics, dna, genomics, hacktoberfest, ngs, pipeline, science, sequencing, usegalaxy, workflow, workflow-engine
1329 967 69 Python specific 2024-05-07 13:56:26
21 schrodinger/fixed-data-table-2
A React table component designed to allow presenting millions of rows of data.
1290 289 JavaScript Other 2024-05-23 05:13:10
22 soedinglab/MMseqs2
MMseqs2: ultra fast and sensitive search and clustering suite
alignment, bioinformatics, blast, linclust, metagenomics, mmseqs, profile-search, sequence-clustering, sequence-search, taxonomy
1281 181 C GNU General Public License v3.0 2024-05-23 07:07:21
23 facebookresearch/fastMRI
A large-scale dataset of both raw MRI measurements and clinical MRI images.
convolutional-neural-networks, deep-learning, fastmri, fastmri-challenge, fastmri-dataset, medical-imaging, mri, mri-reconstruction, pytorch
1259 370 74 Python MIT license 2023-06-26 17:17:06
24 greenelab/deep-review
A collaboratively written review paper on deep learning, genomics, and precision medicine
deep-learning, genomics, manubot, manuscript, neural-networks, review
1235 271 129 HTML Unknown LICENSE.md 2018-03-12 15:06:48
25 shenwei356/seqkit
A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
bioinformatics, cross-platform, fasta, fastq, golang, manipulation, sequence, tool, toolkit
1226 157 26 Go MIT license 2024-05-17 15:59:35
26 MultiQC/MultiQC
Aggregate results from bioinformatics analyses across many samples into a single report.
analysis, bioconda, bioinformatics, data-visualization, multiqc, pypi, python, quality-control, reporting, seqera, vizualisation
1185 582 37 JavaScript GPL-3.0 license 2024-05-31 18:30:12
27 dcm4che/dcm4che
DICOM Implementation in JAVA
1165 637 119 Java specific 2024-04-22 10:59:11
28 scverse/scvi-tools
Deep probabilistic analysis of single-cell and spatial omics data
cite-seq, deep-generative-model, deep-learning, human-cell-atlas, scrna-seq, scverse, single-cell-genomics, single-cell-rna-seq, variational-autoencoder, variational-bayes
1149 342 Python BSD 3-Clause "New" or "Revised" License 2024-06-05 17:01:13
29 vgteam/vg
tools for working with genome variation graphs
dna, genome-graph, genomics, graph, variation-graph
1072 191 48 C++ specific 2024-05-20 18:50:28
30 schrodinger/pymol-open-source
Open-source foundation of the user-sponsored PyMOL molecular visualization system.
1071 260 C Other 2024-06-06 19:36:48
31 scipipe/scipipe
Robust, flexible and resource-efficient pipelines using Go and the commandline
bioinformatics, bioinformatics-pipeline, cheminformatics, dataflow, fbp, go, golang, pipeline, scientific-workflows, scipipe, workflow, workflow-engine
1055 72 38 Go MIT license 2021-10-14 09:11:34
32 shenwei356/csvtk
A cross-platform, efficient and practical CSV/TSV toolkit in Golang
bioinformatics, command-line, cross-platform, csv, golang, tool, toolkit, tsv
972 85 25 Go MIT license 2024-05-29 15:30:38
33 bigdatagenomics/adam
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
avro, big-data, bioinformatics, genomics, java, parquet, python, r, scala, spark
967 304 Scala Apache License 2.0 2024-03-23 13:27:52
34 broadinstitute/cromwell
Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
application, bioinformatics, cloud, containers, docker, executor, ga4gh, hpc, scala, wdl, workflow, workflow-description-language, workflow-execution
965 351 112 Scala BSD-3-Clause LICENSE.txt 2024-05-07 17:47:13
35 hail-is/hail
Cloud-native genomic dataframes and batch computing
bioinformatics, genetics, genomics, gwas, hail, python, software, vcf
946 238 55 Python MIT license 2024-06-05 17:48:05
36 broadinstitute/picard
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
944 365 160 Java MIT license 2023-11-14 22:01:18
37 aqlaboratory/proteinnet
Standardized data set for machine learning of protein structure
dataset, deep-learning, machine-learning, protein-sequence, protein-structure, proteins
849 130 Python MIT License 2020-11-18 23:43:32
38 shenwei356/rush
A cross-platform command-line tool for executing jobs in parallel
bioinformatics, command, cross-platform, execute, golang, parallel, pipeline, shell, windows
834 63 20 Go MIT license 2023-11-13 17:53:58
39 evo-design/evo
DNA foundation modeling from molecular to genome scale
832 97 Jupyter Notebook Apache License 2.0 2024-04-30 22:35:34
40 PaddlePaddle/PaddleHelix
Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning ("螺旋桨" bio-computing toolkit)
biocomputing, ddi, deeplearning, dti, graph-networks, machine-learning, molecule-design, ppi, protein-design, protein-docking, protein-folding, protein-structure-prediction, representation-learning, rna-structure-prediction, self-supervised-learning
799 188 25 Python Apache-2.0 license 2023-08-01 09:31:36
41 samtools/htslib
C library for high-throughput sequencing data formats
bam, bcf, bioinformatics, cram, htslib, ngs, sam, vcf
779 448 C Other 2024-06-06 15:40:15
42 google/nucleus
Python and C++ code for reading and writing genomics data.
bioinformatics, dna, genomics, tensorflow
777 126 53 C++ specific 2021-08-31 23:19:33
43 nroduit/Weasis
Weasis is a DICOM viewer available as a desktop application or as a web-based application.
dicom, dicom-image, dicom-image-viewer, dicom-images, dicom-pr, dicom-rt, dicom-seg, dicom-viewer, dicom-web-viewer, dicomweb, ecg, export-dicom, medical, medical-imaging, multiplanar-reconstruction, viewer, volume-rendering, weasis
763 281 49 Java specific 2024-05-06 18:42:54
44 baidu-research/NCRF
Cancer metastasis detection with neural conditional random field (NCRF)
camelyon16, conditional-random-fields, deep-learning, pathology, whole-slide-imaging
749 184 37 Python Apache-2.0 license 2018-06-17 18:22:34
45 AstraZeneca/chemicalx
A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
biology, chemistry, deep-chemistry, deep-learning, drug, drug-discovery, drug-interaction, drug-pair, geometric-deep-learning, geometry, graph-neural-network, machine-learning, pharma, polypharmacy, pytorch, smiles, smiles-strings, torch, torchdrug
701 89 Python Apache License 2.0 2023-09-11 08:01:43
46 samtools/hts-specs
Specifications of SAM/BAM and related high-throughput sequencing file formats
627 173 TeX 2024-06-06 06:50:26
47 samtools/bcftools
This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
626 241 C Other 2024-06-07 13:13:17
48 insilicomedicine/GENTRL
Generative Tensorial Reinforcement Learning (GENTRL) model
596 216 Python 2020-04-28 11:58:05
49 shenwei356/awesome
Awesome resources on Bioinformatics, data science, machine learning, programming language (Python, Golang, R, Perl) and miscellaneous stuff.
awesome, data-science, git, golang, linux, perl, programing-language, python
593 163 35 MIT license 2023-09-25 02:09:01
50 chanzuckerberg/cellxgene
An interactive explorer for single-cell transcriptomics data
dataviz, scientific, scrna-seq, transcriptomics, visualization
591 111 33 JavaScript MIT license 2023-12-19 22:19:07
51 invesalius/invesalius3
3D medical imaging reconstruction software
584 277 37 Python GPL-2.0 license 2022-04-14 02:28:31
52 lh3/bioawk
BWK awk modified for biological data
bioinformatics, sequence-analysis
582 121 C 2022-08-11 01:06:45
53 MolecularAI/aizynthfinder
A tool for retrosynthetic planning
astrazeneca, chemical-reactions, cheminformatics, monte-carlo-tree-search, neural-networks, reaction-informatics
548 125 Python MIT License 2024-06-03 13:34:33
54 owkin/PyDESeq2
A Python implementation of the DESeq2 pipeline for bulk RNA-seq DEA.
bioinformatics, differential-expression, python, rna-seq, transcriptomics
533 58 Python MIT License 2024-06-06 01:43:52
55 broadinstitute/infercnv
Inferring CNV from Single-Cell RNA-Seq
520 159 42 R specific 2020-02-07 20:29:28
56 scverse/anndata
Annotated data.
anndata, bioinformatics, data-science, machine-learning, scanpy, scverse, transcriptomics
511 148 Python BSD 3-Clause "New" or "Revised" License 2024-06-07 16:03:50
57 soedinglab/hh-suite
Remote protein homology detection suite.
alignment, bioinformatics, cpp, hh-suite, hhblits, hhpred, hhsearch, opensource, profile-profile-search, profile-search, protein-structure, sequence-search, simd, viterbi
509 128 C GNU General Public License v3.0 2023-08-13 08:44:05
58 chhylp123/hifiasm
Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
bioinformatics, denovo-assembly, genomics, hifi-read, pacbio
490 84 28 C++ MIT license 2024-05-06 14:29:45
59 insitro/redun
Yet another redundant workflow engine
aws, bioinformatics, data-engineering, data-science, docker, etl, gcp, ml, python, workflow-engine
489 40 Python Apache License 2.0 2024-06-06 18:52:56
60 biosustain/potion
Flask-Potion is a RESTful API framework for Flask and SQLAlchemy, Peewee or MongoEngine
flask, flask-extensions, mongoengine, peewee, sqlalchemy
488 51 Python Other 2019-04-23 17:00:39
61 google-deepmind/alphamissense
461 58 25 Python Apache-2.0 license
62 scverse/squidpy
Spatial Single Cell Analysis in Python
data-visualization, image-analysis, single-cell-genomics, single-cell-rna-seq, spatial-analysis, spatial-transcriptomics, squidpy
399 71 Python BSD 3-Clause "New" or "Revised" License 2024-06-08 21:22:47
63 lh3/minigraph
Sequence-to-graph mapper and graph generator
bioinformatics, genome-graph, genomics, pan-genome, sequence-alignment
394 38 C MIT License 2024-05-22 00:59:12
64 benevolentAI/guacamol
Benchmarks for generative chemistry
383 82 Python MIT License 2024-02-11 08:59:38
65 calico/basenji
Sequential regulatory activity predictions with deep convolutional neural networks.
373 119 Python Apache License 2.0 2024-05-28 20:08:23
66 ome/bioformats
Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
bio-formats, format-converter, format-reader, image, java, life-sciences-image, lightsheet, metadata, whole-slide-imaging, wsi
367 239 Java GNU General Public License v2.0 2024-06-07 19:34:33
67 MolecularAI/GraphINVENT
Graph neural networks for molecular design.
356 74 Python MIT License 2023-03-11 11:55:32
67 chembl/chembl_webresource_client
Official Python client for accessing ChEMBL API
chembl, cheminformatics, chemistry, chemoinformatics, python, rest, rest-client
356 95 Python Other 2024-02-26 15:44:57
68 shenwei356/taxonkit
A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV
bioinformatics, cross-platform, lca, lineage, taxdump, taxid, taxonkit, taxonomy
342 29 10 Go MIT license 2024-04-25 17:15:34
69 deepchem/DeepLearningLifeSciences
Example code from the book "Deep Learning for the Life Sciences"
338 150 Jupyter Notebook MIT License 2021-09-17 05:10:37
70 MolecularAI/Reinvent
astrazeneca, cheminformatics, denovo-design, neural-networks, reinforcement-learning, transfer-learning
332 108 Python Apache License 2.0 2023-10-19 05:26:16
71 aqlaboratory/rgn
Recurrent Geometric Networks for end-to-end differentiable learning of protein structure
deep-learning, deep-neural-networks, protein-structure, protein-structure-prediction
326 89 Python MIT License 2019-08-01 14:17:59
72 tencent-ailab/grover
This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data
313 68 7 Python specific 2021-01-18 09:06:32
73 lh3/miniprot
Align proteins to genomes with splicing and frameshift
bioinformatics, sequence-alignment
305 16 C MIT License 2024-04-12 21:01:25
74 Roche/pyreadstat
Python package to read sas, spss and stata files into pandas data frames. It is a wrapper for the C library readstat.
conversion, pandas-dataframe, python, readstat, sas7bdat, spss, stata-files
303 55 C Other 2024-06-04 09:55:07
75 lh3/miniasm
Ultrafast de novo assembly for long noisy reads (though having no consensus step)
bioinformatics, denovo-assembly, genomics
293 68 TeX MIT License 2023-12-13 01:35:58
76 chanzuckerberg/MedMentions
A corpus of Biomedical papers annotated with mentions of UMLS entities.
291 31 25
77 AstraZeneca/rexmex
A general purpose recommender metrics library for fair evaluation.
coverage, deep-learning, evaluation, machine-learning, metric, metrics, mrr, personalization, precision, rank, ranking, recall, recommender, recommender-system, recsys, rsquared
275 25 Python 2023-08-22 09:22:20
78 samtools/htsjdk
A Java API for high-throughput sequencing data (HTS) formats.
bam, cram, dna, fasta, genomics, java, java-api, ngs, sam, sequencing, vcf
274 244 Java 2024-06-04 18:40:43
79 shenwei356/brename
A practical cross-platform command-line tool for safely batch renaming files/directories via regular expression
batch, batch-rename, batch-rename-files, batch-renamer, go, golang, rename, safe, windows
254 21 6 Go MIT license 2024-04-14 08:22:45
80 lh3/wgsim
Reads simulator
bioinformatics, genomics
252 90 C 2021-09-03 14:58:22
81 Acellera/htmd
HTMD: Programming Environment for Molecular Discovery
automate, drug-discovery, htmd, molecular-simulations
250 58 Rich Text Format Other 2024-06-07 15:24:26
82 DeepGraphLearning/GearNet
GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
graph-neural-networks, pre-training, protein-representation-learning
249 26 10 Python MIT license
83 MolecularAI/REINVENT4
AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
ai, astrazeneca, cheminformatics, chemistry, deep-learning, denovo-design, drug-design, drug-discovery, generative-ai, ml, molecule-generation, neural-networks, reinforcement-learning, transfer-learning
247 57 Python Apache License 2.0 2024-04-27 11:00:08
84 rdkit/rdkit-tutorials
Tutorials to learn how to work with the RDKit
239 71 Jupyter Notebook Other 2023-03-19 13:36:55
85 insightsengineering/rtables
Reporting tables with R
pharmaceuticals, r, tables
213 49 R Other 2024-06-07 21:27:39
86 Bayer-Group/cloudformation-template-generator
A type-safe Scala DSL for generating CloudFormation templates
211 71 Scala BSD 3-Clause "New" or "Revised" License 2022-07-29 11:32:04
87 pharmaverse/admiral
ADaM in R Asset Library
cdisc, clinical-trials, open-source, r
207 53 R Apache License 2.0 2024-06-07 18:23:44
87 OpenGene/awesome-bio-datasets
awesome-bio-datasets
207 42 MIT License 2017-10-28 12:32:15
88 OpenGene/AfterQC
Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data
adapter-trimming, bioinformatics, error, fastq, filtering, ngs, overlap, qc, quality-control, sequencing, trimming
203 50 Python MIT License 2020-05-14 07:15:54
89 Bayer-Group/etcd-aws-cluster
A container to assist in managing an etcd2 cluster from an Amazon auto scaling group
202 102 Shell BSD 3-Clause "New" or "Revised" License 2017-02-01 01:09:05
89 modernatx/seqlike
Unified biological sequence manipulation in Python
biological-sequences, biopython, machine-learning, sequence
202 18 Python Apache License 2.0 2024-02-16 13:13:05
89 scverse/scirpy
A scanpy extension to analyse single-cell TCR and BCR data.
202 31 Python BSD 3-Clause "New" or "Revised" License 2024-06-06 06:21:35
90 lh3/gfatools
Tools for manipulating sequence graphs in the GFA and rGFA formats
bioinformatics, genome-graph, genomics
201 18 C 2024-02-20 15:29:14
90 scverse/muon
muon is a multimodal omics Python framework
anndata, cite-seq, mudata, multi-omics, multimodal-data, multimodal-omics-analysis, muon, scanpy, scatac-seq, scrna-seq, scverse
201 28 Python BSD 3-Clause "New" or "Revised" License 2024-05-30 21:21:35
91 aws-samples/aws-batch-genomics
Software sets up and runs a genome sequencing analysis workflow using AWS Batch and AWS Step Functions.
199 75 39 Python Apache-2.0 license 2018-11-29 18:40:42
92 rdkit/mmpdb
A package to identify matched molecular pairs and use them to predict property changes.
195 53 Python Other 2024-04-30 10:55:30
93 Acellera/moleculekit
MoleculeKit: Your favorite molecule manipulation kit
drug-discovery, machine-learning, molecular-modeling, molecular-simulation, molecule, proteins
193 35 Python Other 2024-06-04 13:53:30
94 bioinform/somaticseq
An ensemble approach to accurately detect somatic mutations using SomaticSeq
cancer-genomics, somatic-variants
189 53 Python BSD 2-Clause "Simplified" License 2024-05-30 07:55:34
95 MolecularAI/Chemformer
188 34 Python Apache License 2.0 2024-05-29 14:43:33
96 owkin/FLamby
Cross-silo Federated Learning playground in Python. Discover 7 real-world federated datasets to test your new FL strategies and try to beat the leaderboard.
dataset, deep-learning, differential-privacy, federated-learning, healthcare, machine-learning, python
187 22 Python MIT License 2024-06-03 12:18:27
96 ome/openmicroscopy
OME (Open Microscopy Environment) develops open-source software and data format standards for the storage and manipulation of biological light microscopy data. A joint project between universities, research establishments and industry in Europe and the USA, OME has over 20 active researchers with strong links to the microscopy community. Funded …
database, image, java, omero, python, server
187 100 Java GNU General Public License v2.0 2024-06-08 00:39:30
97 AstraZeneca-NGS/VarDict
VarDict
186 60 Perl MIT License 2024-01-05 14:06:13
97 scverse/spatialdata
An open and interoperable data framework for spatial omics data
186 34 Python BSD 3-Clause "New" or "Revised" License 2024-06-08 00:23:48
98 haowenz/chromap
Fast alignment and preprocessing of chromatin profiles
bioinformatics, chromatin-profiles, genomics, sequence-analysis
184 18 7 C++ MIT license 2024-02-06 15:29:20
99 chao1224/MoleculeSTM
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
clip, computation-chemistry, drug-discovery, editing, foundation-model, molecule-editing, moleculeclip, moleculestm, pretraining, retrieval
182 17 4 Python specific 2024-04-19 05:25:24
100 openpharma/visR
A package to wrap functionality for plots, tables and diagrams adhering to graphical principles.
179 32 R Other 2024-06-04 13:48:59
100 chembl/ChEMBL_Structure_Pipeline
ChEMBL database structure pipelines
179 38 Python MIT License 2023-10-25 15:20:47
101 AstraZeneca/awesome-drug-discovery-knowledge-graphs
A collection of research papers, datasets and software related to knowledge graphs for drug discovery. Accompanies the paper "A review of biomedical datasets relating to drug discovery: a knowledge graph perspective" (Briefings in Bioinformatics, 2022)
awesome-list, drug-discovery, drug-discovery-knowledge-graph, knowledge-graph
177 19 Apache License 2.0 2023-09-10 16:33:40
102 lh3/biofast
Benchmarking programming languages/implementations for common tasks in Bioinformatics
bioinformatics
175 26 C 2021-12-09 14:10:44
103 shenwei356/kmcp
Accurate metagenomic profiling && Fast large-scale sequence/genome searching
bigsi, cobs, fracminhash, kmer, metagenomics, scaled-minhash, searching, sketch, sketching, syncmers, taxonomic-classification, taxonomic-profiling, virome
173 13 6 Go MIT license 2023-09-22 04:09:54
104 rgcgithub/regenie
regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
172 49 C++ Other 2024-04-03 13:52:31
105 soedinglab/metaeuk
MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
bioinformatics, eukaryotes, gene-discovery, gene-prediction, metagenomics
171 24 C GNU General Public License v3.0 2024-05-30 09:04:06
106 recursionpharma/gflownet
GFlowNet library specialized for graph & molecular data
deep-learning, gflownet, graph-neural-network, pytorch
168 34 Python MIT License 2024-06-06 13:29:06
106 scverse/scanpy-tutorials
Scanpy Tutorials.
168 113 Jupyter Notebook 2024-06-03 19:42:01
107 bioinform/neusomatic
NeuSomatic: Deep convolutional neural networks for accurate somatic mutation detection
convolutional-neural-networks, deep-learning, genomics, somatic-variants
167 50 Python Other 2021-12-23 10:41:50
108 lh3/readfq
Fast multi-line FASTA/Q reader in several programming languages
bioinformatics, sequence-analysis
166 60 C 2021-06-06 07:27:15
109 insightsengineering/teal
Exploratory Web Apps for Analyzing Clinical Trial Data
clinical-trials, nest, r, shiny, webapp
164 29 R Other 2024-06-07 12:49:26
110 lh3/cgranges
A C/C++ library for fast interval overlap queries (with a "bedtools coverage" example)
algorithm, bioinformatics, genomics
161 18 C MIT License 2024-05-28 21:47:37
110 lh3/kmer-cnt
Code examples of fast and simple k-mer counters for tutorial purposes
bioinformatics, genomics, k-mer-counting
161 13 C++ MIT License 2020-03-10 16:24:06
111 greenelab/tybalt
Training and evaluating a variational autoencoder for pan-cancer gene expression data
analysis, autoencoder, cancer, cancer-genomics, deep-learning, gene-expression, script, tool, unsupervised-learning, variational-autoencoder, variational-autoencoders
159 62 10 HTML BSD-3-Clause license 2017-11-13 13:38:42
112 aqlaboratory/genie
De Novo Protein Design by Equivariantly Diffusing Oriented Residue Clouds
diffusion-models, protein-design
154 18 Python Apache License 2.0 2024-04-21 13:48:25
113 DeepGraphLearning/ConfGF
Implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021).
153 34 10 Python MIT license
114 benevolentAI/DeeplyTough
DeeplyTough: Learning Structural Comparison of Protein Binding Sites
3d-models, deep-learning, drug-discovery, metric-learning, protein-structure
151 39 Python Other 2023-04-07 09:33:44
115 chao1224/GraphMVP
Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)
contrastive-learning, generative-model, geometry, graph, molecule, pretraining, self-supervised, self-supervised-learning
150 20 5 Python MIT license 2022-09-20 14:29:48
116 OpenGene/MutScan
Detect and visualize target mutations by scanning FastQ files directly
bioinformatics, cancer, detection, fastq, mutation, ngs, somatic, validation, variant, visualization
146 38 C MIT License 2022-02-10 01:52:44
117 MolecularAI/ReinventCommunity
astrazeneca, cheminformatics, denovo-design, jupyter-notebook, neural-networks, reinforcement-learning, transfer-learning
145 57 Jupyter Notebook MIT License 2022-04-22 16:44:35
117 lh3/psmc
Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC) model
bioinformatics, genomics, population-genetics
145 60 C Other 2022-11-21 04:39:31
117 tencent-ailab/DrugOOD
OOD Dataset Curator and Benchmark for AI-aided Drug Discovery
145 19 6 Python specific
118 ome/ome-zarr-py
Implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
ngff, ome, ome-zarr, zarr
143 51 Python Other 2024-06-06 12:51:57
119 Novartis/tidymodules
An Object-Oriented approach to Shiny modules
communication, inheritance, oop, r, shiny, shiny-modules, tidy-operators
141 11 R Other 2023-02-23 15:04:31
120 aws-samples/aws-genomics-workflows
Genomics Workflows on AWS
aws, batch, genomics, step-functions, workflows
140 106 19 Shell MIT-0 license 2022-03-30 21:38:09
121 MolecularAI/deep-molecular-optimization
Molecular optimization by capturing chemist’s intuition using the Seq2Seq with attention and the Transformer
molecular-optimization, multi-property-optimization, seq2seq, transformer
139 36 Python Apache License 2.0 2023-03-16 07:05:06
122 AstraZeneca/SubTab
The official implementation of the paper, "SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning"
contrastive-learning, multi-view-learning, representation-learning, self-supervised-learning, tabular-data
138 20 Python Apache License 2.0 2022-07-01 09:03:38
122 johnsonandjohnson/Bodiless-JS
Framework for building editable websites on the JAMStack
138 59 TypeScript Apache License 2.0 2024-01-24 03:00:32
123 Benson-Genomics-Lab/TRF
Tandem Repeats Finder: a program to analyze DNA sequences
137 24 C GNU Affero General Public License v3.0 2023-01-16 20:44:26
124 lh3/pangene
Constructing a pangenome gene graph
bioinformatics, pangenome
136 7 C 2024-05-29 00:13:01
125 owkin/HistoSSLscaling
Code associated to the publication: Scaling self-supervised learning for histopathology with masked image modeling, A. Filiot et al., MedRxiv (2023). We publicly release Phikon 🚀
computational-pathology
135 11 Jupyter Notebook Other 2024-01-29 22:35:32
126 AstraZeneca/awesome-shapley-value
Reading list for "The Shapley Value in Machine Learning" (JCAI 2022)
artificial-intelligence, data-science, deep-learning, explainability, explainable, explainable-ai, explainable-artificial-intelligence, explainable-ml, lime, machine-learning, owen-value, shap, shapley, shapley-additive-explanations, shapley-decomposition, shapley-q-value, shapley-value, xai
134 10 Apache License 2.0 2022-08-08 08:53:10
127 lh3/bedtk
A simple toolset for BED files (warning: CLI may change before bedtk becomes stable)
bioinformatics
132 15 C MIT License 2024-05-28 21:48:28
128 Bioconductor/Contributions
Contribute Packages to Bioconductor
bioconductor
131 33 2023-09-12 18:32:10
129 Merck/BioPhi
BioPhi is an open-source antibody design platform. It features methods for automated antibody humanization (Sapiens), humanness evaluation (OASis) and an interface for computer-assisted antibody sequence design.
antibody, humanization, humanness, oasis, sapiens
129 44 Python MIT License 2024-06-03 07:17:18
129 soedinglab/plass
sensitive and precise assembly of short sequencing reads
bioinformatics, metagenomics, metatranscriptomics, opensource, proteins, proteomics, sequence-assembler
129 14 C GNU General Public License v3.0 2024-04-16 20:44:12
130 benevolentAI/guacamol_baselines
Baseline models for GuacaMol benchmarks
128 33 Python MIT License 2024-02-16 09:40:42
131 AstraZeneca-NGS/VarDictJava
VarDict Java port
127 52 Java MIT License 2024-01-05 14:03:51
132 lh3/ksw2
Global alignment and alignment extension
bioinformatics, sequence-alignment
124 24 C Other 2023-06-27 17:21:12
132 chao1224/ChatDrug
LLM for Drug Editing, ICLR 2024
chatgpt, chatgpt3, conversation, domain-feedback, drug, drug-discovery, drug-editing, editing, llm, molecule, motif, peptide, protein, retrieval, secondary-structure, small-molecule, structure
124 8 3 Python 2024-05-28 19:44:44
133 rdkit/rdkit-js
A powerful cheminformatics and molecule rendering toolbelt for JavaScript, powered by RDKit.
cheminformatics, drug-discovery, javascript, molecule, molecule-viewer, molecule-visualization, node-js, npm, rdkit, react, wasm
123 35 Dockerfile BSD 3-Clause "New" or "Revised" License 2024-06-01 09:54:52
133 blazerye/DrugAssist
DrugAssist: A Large Language Model for Molecule Optimization
ai-for-science, drug-discovery, instruction-datasets, instruction-tuning, large-language-models, molecule-generation, molecule-optimization
123 10 3 Python
134 bigdatagenomics/mango
A scalable genome browser. Apache 2 licensed.
122 30 Scala Apache License 2.0 2022-12-02 22:21:57
135 OpenGene/repaq
A fast lossless FASTQ compressor with ultra-high compression ratio
120 20 C MIT License 2023-09-22 02:48:34
136 Bioconductor/BiocStickers
Stickers for some Bioconductor packages - feel free to contribute and/or modify.
bioconductor, stickers
119 86 R Other 2024-05-10 05:58:21
136 greenelab/pancancer
Building classifiers using cancer transcriptomes across 33 different cancer-types
analysis, cancer, classifier, gene-expression, machine-learning, methodology, pancancer, tcga, tool, transcriptome
119 58 10 Jupyter Notebook BSD-3-Clause license 2018-03-01 15:38:33
137 Roche/BalancedLossNLP
118 23 Jupyter Notebook Other 2023-06-12 21:51:15
138 Merck/deepbgc
BGC Detection and Classification Using Deep Learning
bidirectional-lstm, biosynthetic-gene-clusters, deep-learning, deepbgc, natural-products, pfam2vec, python, synthetic-biology
117 26 Jupyter Notebook MIT License 2023-11-11 12:48:56
138 benevolentAI/MolBERT
117 35 Python MIT License 2021-06-06 10:28:35
139 genentech/equifold
Official code repository for EquiFold: Protein Structure Prediction with a Novel Coarse-Grained Structure Representation
machine-learning, proteins, structural-biology, structure-prediction
116 15 Python Apache License 2.0 2023-01-08 19:51:30
140 OpenGene/GeneFuse
Gene fusion detection and visualization
alk, bioinformatics, cancer, cosmic, eml4, fusion, gene, ret, ros1
114 62 C MIT License 2022-02-21 08:07:06
141 biosustain/cameo
cameo - computer aided metabolic engineering & optimization
113 42 Python Apache License 2.0 2022-11-07 14:54:19
142 EBI-Metagenomics/emg-viral-pipeline
VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
cwl, nextflow, pipeline, viruses, workflow
109 13 Python Apache License 2.0 2024-05-08 20:10:03
142 OpenGene/gencore
Generate duplex/single consensus reads to reduce sequencing noise and remove duplicates
bioinformatics, consensus, deduplication, deep-sequencing, duplex, duplex-sequencing, duplication, ngs, sequencing, sequencing-error, sequencing-noise, somatic
109 32 C++ MIT License 2023-10-27 06:19:21
142 OpenGene/fastv
An ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data. This tool can be used to detect viral infectious diseases, like COVID-19.
2019-ncov, bioinformatics, coronavirus, covid, covid-19, hcov, meta-genomics, microbial-sequences, mngs, ngs, sars-cov-2, sequencing, viral, viral-infectious-diseases, virus, visualization
109 24 C++ MIT License 2023-10-27 06:16:38
143 lh3/yak
Yet another k-mer analyzer
bioinformatics, k-mer
108 8 C MIT License 2024-04-01 21:39:44
143 lh3/fermikit
De novo assembly based variant calling pipeline for Illumina short reads
bioinformatics, denovo-assembly, genomics, variant-calling
108 23 TeX Other 2020-11-30 22:57:56
144 Merck/Halyard
Halyard is an extremely horizontally scalable Triplestore with support for Named Graphs, designed for integration of extremely large Semantic Data Models, and for storage and SPARQL 1.1 querying of the whole Linked Data universe snapshots.
107 17 Java Apache License 2.0 2023-01-23 16:59:32
144 ome/ngff
Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
bioimaging, cloud, data-science, file-formats, spec
107 38 Bikeshed Other 2024-06-02 06:26:47
144 soedinglab/CCMpred
Protein Residue-Residue Contacts from Correlated Mutations predicted quickly and accurately.
107 25 C GNU Affero General Public License v3.0 2023-11-08 07:51:35
145 lh3/minimap
This repo is DEPRECATED. Please use minimap2, the successor of minimap.
106 29 C MIT License 2017-09-20 14:15:02
146 chao1224/Geom3D
Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023
3d, 3d-structures, ai4science, biology, chemistry, crystals, drugs, equivariance, geometry, group, invariance, material, molecules, physics, proteins, symmetry
105 9 2 Python MIT license 2024-06-05 03:18:58
147 phuse-org/phuse-scripts
Deliver standard industry analyses, built upon CDISC standards for analysis data
104 88 SAS MIT License 2023-08-01 15:21:20
147 chembl/FPSim2
Simple package for fast molecular similarity searches
cheminformatics, chemistry, gpu, python, similarity-search
104 17 Python MIT License 2024-02-15 11:13:05
148 bayer-science-for-a-better-life/Img2Mol
103 41 Jupyter Notebook Apache License 2.0 2023-03-24 18:07:41
149 Biogen-Inc/tidyCDISC
Demo the app here: https://bit.ly/tidyCDISC_app
pharma, r, rinpharma, rstats
102 38 R GNU Affero General Public License v3.0 2023-09-22 15:18:20
150 openpharma/mmrm
Mixed Models for Repeated Measures (MMRM) in R.
100 17 R Other 2024-06-03 18:02:15
150 MolecularAI/DockStream
DockStream: A Docking Wrapper to Enhance De Novo Molecular Design
astrazeneca, chemoinformatics, denovo-design, jupyter-notebook, molecular-docking, reinforcement-learning
100 30 Python Apache License 2.0 2023-03-16 07:07:10
150 Bayer-Group/paquo
PAthological QUpath Obsession - QuPath and Python conversations
digital-pathology, python, qupath
100 16 Python GNU General Public License v3.0 2024-06-02 18:21:27
151 genentech/gReLU
gReLU is a python library to train, interpret, and apply deep learning models to DNA sequences.
99 5 Python MIT License 2024-06-07 20:29:13
152 lh3/hickit
TAD calling, phase imputation, 3D modeling and more for diploid single-cell Hi-C (Dip-C) and general Hi-C
bioinformatics, genomics, hi-c
98 11 C 2021-02-04 01:47:43
153 aqlaboratory/rgn2
97 28 Python 2023-11-28 17:16:23
154 lh3/bgt
Flexible whole-genome genotype query among 30,000+ samples
bioinformatics, genomics
96 10 C MIT License 2019-09-04 19:43:27
154 scverse/rapids_singlecell
Rapids_singlecell: A GPU-accelerated tool for scRNA analysis. Offers seamless scverse compatibility for efficient single-cell data processing and analysis.
anndata, bioinformatics, gpu, scverse, single-cell
96 18 Python MIT License 2024-06-03 18:07:06
154 shenwei356/bio_scripts
Practical, reusable scripts for bioinformatics
bioinformatics, perl, python, reusable, script
96 65 Perl MIT License 2019-02-12 13:21:47
155 EBISPOT/OLS
Ontology Lookup Service from SPOT at EBI
java, neo4j, obofoundry, owl, owl-api
95 40 JavaScript Apache License 2.0 2023-04-28 20:09:19
156 Sanofi-Public/CodonBERT
Repository for mRNA Paper and CodonBERT publication.
94 14 Python Other 2024-05-03 19:24:06
156 OpenGene/scrnapip
A Systematic and Dynamic Pipeline for Single-Cell RNA Sequencing Analysis
94 14 HTML 2023-10-16 01:24:06
157 EBI-Metagenomics/genomes-catalogue-pipeline
MGnify genome analysis pipeline
93 21 Python Other 2024-06-06 09:44:21
158 samtools/tabix
Note: tabix and bgzip binaries are now part of the HTSlib project.
92 40 C 2021-08-03 14:29:38
158 shenwei356/BlackheartedHospital (forked from: open-power-workgroup/Hospital)
A list of Putian-affiliated hospitals circulating online; updates welcome
92 15 2016-05-03 07:06:09
159 AbSciBio/unlocking-de-novo-antibody-design
91 14 Other 2024-01-09 17:36:19
159 schrodinger/gpusimilarity
A Cuda/Thrust implementation of fingerprint similarity searching
cheminformatics, chemistry, gpu, similarity-analysis
91 26 C++ BSD 3-Clause "New" or "Revised" License 2024-01-24 19:08:08
159 lh3/dipcall
Reference-based variant calling pipeline for a pair of phased haplotype assemblies
91 9 JavaScript MIT License 2021-06-06 20:36:10
160 Bioconductor/CSAMA
Course material for CSAMA: Statistical Data Analysis for Genome Scale Biology
89 45 HTML 2024-06-06 12:04:08
160 AstraZeneca/onto_merger
OntoMerger is an ontology alignment library for deduplicating knowledge graph nodes that represent the same domain.
algorithm, alignment, biological-networks, biology, graph, kg, knowledge, knowledge-graph, mapping, ontology, ontology-alignment
89 5 HTML Apache License 2.0 2024-01-11 19:22:08
160 hoelzer-lab/rnaflow
A simple RNA-Seq differential gene expression pipeline using Nextflow
89 19 HTML GNU General Public License v3.0 2024-02-26 20:45:37
160 shenwei356/perfect-bioinformatic-tools
What should perfect bioinformatic tools be like?
bioinformatics, cli, usability
89 1 Creative Commons Zero v1.0 Universal 2024-03-19 10:22:54
161 Sanofi-IADC/whispr
Open source event, comment and alert processing hub created by Sanofi IADC
88 8 TypeScript MIT License 2024-06-04 12:01:03
161 calico/scBasset
Sequence-based Modeling of single-cell ATAC-seq using Convolutional Neural Networks.
88 11 Jupyter Notebook Apache License 2.0 2024-02-08 19:20:16
161 shenwei356/bio
A lightweight and high-performance bioinformatics package in Golang
bioinformatics, golang, minimizer, package, scaled-minhash, sequence, syncmer, taxdump, taxonomy
88 9 7 Go MIT license 2024-03-11 09:41:44
162 owkin/HE2RNA_code
Train a model to predict gene expression from histology slides.
87 39 Python GNU General Public License v3.0 2022-07-06 20:53:24
162 scverse/pertpy
Perturbation Analysis in the scverse ecosystem.
perturbation, scverse, single-cell
87 19 Python MIT License 2024-06-08 08:07:34

Next page

Footnotes

  1. This page was generated with the topgh open source software on 2024-06-09