A command-line based tool to facilitate the creation of eFISHent single-molecule RNA fluorescence in-situ hybridization (RNA smFISH) oligonucleotide probes.
Some of the key features of eFISHent are:
- One-line installation using conda (available through bioconda*)
- Automatic gene sequence download from NCBI when providing a gene and species name (or pass a FASTA file)
- Filtering steps to remove low-quality probes, including off-targets, frequently occurring short-mers, secondary structures, etc.
- Mathematical or greedy optimization to ensure highest coverage
* Releases on bioconda are always associated with waiting times. Therefore, the easiest approach is to install the dependencies with conda and then install eFISHent itself using pip.
eFISHent is tested on macOS and Linux with Python versions 3.8 to 3.10. Unfortunately, due to the bioinformatics dependencies, Windows is not supported. For Windows users, we recommend installing "Windows Subsystem for Linux (WSL)" (Windows 10, Windows 11) or using a fully fledged virtual machine. Using a conda environment, install eFISHent as follows:
# Create an environment and install all dependencies (e.g. python)
conda env create bbquercus/efishent
# Activate environment
conda activate efishent
# Install efishent via pypi
pip install efishent
Any updates can then simply be done via pypi (pip install --upgrade efishent).
A detailed usage guide can be found on the GitHub wiki, but here is a quick example:
eFISHent --reference-genome <reference-genome> --gene-name <gene> --organism-name <organism>
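If the command needs to be launched from a script (for example, once per gene), it can be assembled as an argument list. The sketch below is only an illustration: it uses the three flags from the quick example, and the helper name `build_efishent_command`, the file name `genome.fa`, and the gene and organism values are placeholders, not part of eFISHent.

```python
import shlex
import subprocess
from typing import List


def build_efishent_command(
    reference_genome: str, gene_name: str, organism_name: str
) -> List[str]:
    """Assemble the quick-example command as an argument list (hypothetical helper)."""
    return [
        "eFISHent",
        "--reference-genome", reference_genome,
        "--gene-name", gene_name,
        "--organism-name", organism_name,
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your own genome file, gene, and organism.
    cmd = build_efishent_command("genome.fa", "ACTB", "Homo sapiens")
    # Print the shell-quoted command so it can be inspected before running.
    print(shlex.join(cmd))
    # Uncomment to run; this assumes eFISHent is on PATH in the active environment.
    # subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell quoting problems with organism names that contain spaces.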
eFISHent is built modularly from the following components:
Index creation workflow:
- Bowtie index
- Jellyfish indices
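To give an intuition for what a k-mer index provides, here is a minimal pure-Python sketch of k-mer counting. It is not how Jellyfish is implemented and not eFISHent's own code: the function names are hypothetical, the sequence is a toy example, and reverse complements are ignored for simplicity.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def max_kmer_count(probe: str, index: Counter, k: int) -> int:
    """Return the highest index count among all k-mers contained in the probe."""
    probe = probe.upper()
    return max(
        (index[probe[i:i + k]] for i in range(len(probe) - k + 1)),
        default=0,  # probe shorter than k: no k-mers to look up
    )


genome = "ACGTACGTACGTTTGACCA"  # toy sequence, not a real genome
index = count_kmers(genome, k=4)
print(index["ACGT"])                         # 3
print(max_kmer_count("TTGACC", index, k=4))  # 1
```

A probe whose maximum k-mer count exceeds a chosen threshold contains a short sequence that recurs in the reference and would be a candidate for removal.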
Probe filtering workflow:
- Download / prepare sequences
- Generate candidate probes
- Filter with basic filters
- Align probes to reference genome
- Filter based on alignment score and uniqueness
- Filter recurring k-mers
- Filter based on secondary structure prediction
- Create final list of probes
- Write final list of probes to file with report
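The first and last stages of the workflow above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not eFISHent's implementation: the `Probe` type, the function names, and the thresholds are hypothetical, GC content stands in for the "basic filters", the alignment, k-mer, and secondary-structure steps are omitted, and probes are taken directly from the target rather than as reverse complements. The selection step uses the classic earliest-end-first greedy for picking non-overlapping intervals.

```python
from typing import List, NamedTuple


class Probe(NamedTuple):
    start: int  # 0-based start on the target sequence
    end: int    # exclusive end
    sequence: str


def generate_candidates(target: str, min_len: int, max_len: int) -> List[Probe]:
    """Enumerate every substring of the target within the length range."""
    return [
        Probe(start, start + length, target[start:start + length])
        for length in range(min_len, max_len + 1)
        for start in range(len(target) - length + 1)
    ]


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


def basic_filter(probes: List[Probe], min_gc: float, max_gc: float) -> List[Probe]:
    """Keep probes whose GC fraction lies within [min_gc, max_gc]."""
    return [p for p in probes if min_gc <= gc_fraction(p.sequence) <= max_gc]


def greedy_select(probes: List[Probe], spacing: int = 0) -> List[Probe]:
    """Pick non-overlapping probes, earliest end first, keeping `spacing` bases apart."""
    selected: List[Probe] = []
    last_end = None
    for probe in sorted(probes, key=lambda p: (p.end, p.start)):
        if last_end is None or probe.start >= last_end + spacing:
            selected.append(probe)
            last_end = probe.end
    return selected


target = "ATGCGCATTAGCCGATAGCTAGGCTAACGTTGCA"  # toy sequence
candidates = generate_candidates(target, min_len=8, max_len=10)
filtered = basic_filter(candidates, min_gc=0.4, max_gc=0.6)
for probe in greedy_select(filtered, spacing=2):
    print(probe.start, probe.end, probe.sequence)
```

The greedy pass is fast and simple. A mathematical optimization, as mentioned in the feature list, would instead search over all valid combinations of candidates.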
Probe set analysis plotting:
- Create a simple overview of the key parameters
Roadmap:
- Add more detailed documentation as wiki page(s)
- Add links to genomes and RNAseq databases
- Add examples from multiple sources
- Add benchmarks for deltaG, counts
- Add mathematical description for model (in wiki?)
- Add probe set analysis txt file with off-target locations / potentially harmful probes