This repository contains the code base for my master's thesis. The work was done in the Kornfeld Group at the Max Planck Institute for Biological Intelligence, in collaboration with the Lab for AI in Medicine at TU Munich. The thesis can be accessed here.
Connectomics is the study of synaptic wiring diagrams and is a fast-growing research area in neuroscience. While sophisticated volume electron microscopy techniques have enabled the acquisition and visualization of cells and tissues at the nanometer scale, current processing tools do not capture all the high-resolution details present in the acquired electron microscopy (EM) data. One of the steps in processing EM data is cell type prediction. Since manual labeling of cell types is tedious and inherently biased, deep learning methods are used to learn from the morphology (shape) of neurons and infer their cell types. Existing cell type prediction methods mostly rely on the morphology of neurons and discard their connectivity properties. Our main contribution is to propose the use of graph neural networks (GNNs) to learn representations of neurons by considering morphology and synaptic connectivity jointly. Using self-supervised learning, we generate node embeddings that can then be used for downstream tasks such as cell type prediction. The methodology is twofold. We first use the labeled data to develop GNN methods for semi-supervised cell type prediction. Then, we extend these methods to generate representations of cells in a self-supervised setting. We investigate the methods in both learning paradigms and evaluate them empirically. We show that for some cell types, connectivity improves the prediction, while for others it either hurts the prediction or yields performance similar to that of methods that use only morphology.
data consists of a script to create a graph dataset in pytorch-geometric style.
graph consists of two scripts, compute_features.py and generate_graph.py, to create node data using the morphology of the cell and edge data using the synaptic connectivity.
models consists of a file with all the models and layers used in the experiments.
notebooks is currently empty (I will push the notebooks used in the early days of the thesis for experiments).
scripts consists of helper scripts to compute path lengths, perform node clustering, etc.
tests consists of unit-test modules for testing graph data generation (sanity check).
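
To illustrate the graph layout that the data and graph directories refer to, here is a minimal sketch of how per-cell morphology features and synaptic connections could be arranged into a node feature matrix and an edge list in COO form. It is not the code from compute_features.py or generate_graph.py: the `Synapse` class, the `build_graph` function, the cell IDs, and the feature values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Synapse:
    """Hypothetical record of one synaptic connection."""
    pre: str   # ID of the presynaptic cell
    post: str  # ID of the postsynaptic cell


def build_graph(features, synapses):
    """Arrange cells and synapses as node features plus a COO edge list.

    features: dict mapping cell ID -> list of morphology features
    synapses: iterable of Synapse records
    Returns (x, edge_index, node_ids), where edge_index is
    [source_indices, target_indices].
    """
    node_ids = sorted(features)  # fixed, reproducible node order
    index = {cell_id: i for i, cell_id in enumerate(node_ids)}
    x = [list(features[cell_id]) for cell_id in node_ids]

    sources, targets = [], []
    for syn in synapses:
        # Skip synapses whose partner cell has no feature vector.
        if syn.pre in index and syn.post in index:
            sources.append(index[syn.pre])
            targets.append(index[syn.post])
    return x, [sources, targets], node_ids


# Made-up example: two features per cell, three synapses.
features = {
    "cell_a": [12.5, 3.0],
    "cell_b": [7.1, 1.0],
    "cell_c": [9.8, 2.0],
}
synapses = [
    Synapse("cell_a", "cell_b"),
    Synapse("cell_c", "cell_a"),
    Synapse("cell_b", "cell_x"),  # partner without features, dropped
]

x, edge_index, node_ids = build_graph(features, synapses)
print(node_ids)
print(edge_index)
```

To use these with PyTorch Geometric, the two lists can be converted with `torch.tensor(x, dtype=torch.float)` and `torch.tensor(edge_index, dtype=torch.long)` and passed to `torch_geometric.data.Data(x=..., edge_index=...)`.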