# Downloading ALLEGRO
git clone https://github.com/AmirUCR/allegro.git
cd allegro
# Installing Dependencies
conda create -n allegro python=3.10 numpy=1.26.0 pandas=2.1.1 pyyaml=6.0.1 biopython=1.78 bioconda::bowtie=1.0.0
conda activate allegro
pip install scikit-learn==1.3.2
pip install Cython==3.0.5
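To confirm that the environment matches the pins above, a small check along these lines may help. This is a minimal sketch and not part of ALLEGRO: the function name `find_mismatches` and the `EXPECTED` table are hypothetical, and it covers only the Python packages (Bowtie is a separate binary and is not checked here).

```python
from importlib import metadata

# Versions pinned in the install commands above (Python packages only).
EXPECTED = {
    "numpy": "1.26.0",
    "pandas": "2.1.1",
    "PyYAML": "6.0.1",
    "biopython": "1.78",
    "scikit-learn": "1.3.2",
    "Cython": "3.0.5",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package whose installed
    version differs from the pinned one; found is None if it is missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

Run it inside the activated `allegro` environment; no output means every pinned Python package was found at the expected version.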
# Checking ALLEGRO
python src/main.py --soundcheck
ALLEGRO ships with 50 of the 2059 species used in its paper as example input. The manifest file is at data/input/fifty_example_input_species.csv and the FASTA files are under data/input/example_input. These FASTA files contain the orthologs, as determined by DIAMOND, of the S. cerevisiae S288C genes LYS2, MET17, TRP1, URA3, FCY1, GAP1, and CAN1. The files have also been modified to delimit intron/exon boundaries using the respective GFF files.
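To get a quick overview of the example input before running an experiment, you could count the records in each file. The sketch below is illustrative only and not part of ALLEGRO: the helper names are hypothetical, and it assumes that every regular file in the directory is a FASTA file whose records begin with a `>` header line.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count header lines (those starting with '>') in one FASTA file."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize_input_dir(input_dir):
    """Map each file name in input_dir to its number of FASTA records."""
    return {
        p.name: count_fasta_records(p)
        for p in sorted(Path(input_dir).iterdir())
        if p.is_file()
    }


if __name__ == "__main__":
    # Path taken from the text above; run from the repository root.
    example_dir = Path("data/input/example_input")
    if example_dir.is_dir():
        counts = summarize_input_dir(example_dir)
        print(f"{len(counts)} files, {sum(counts.values())} records in total")
```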
To conduct an experiment using the default settings in your config.yaml file, simply execute the following:
python src/main.py
ALLEGRO will output the smallest gRNA library that targets every record/gene in the 50 input files and write it to data/output/ALLEGRO_EXAMPLE_RUN/ALLEGRO_EXAMPLE_RUN_library.txt. You can change the names of the output folder and the experiment in the config file.
Using the default configuration file, you will find 5 files under the output directory of your experiment. 4 of these files begin with the name of your experiment, and 1 is the solver's log. Let's go over each of these:
- ALLEGRO_EXAMPLE_RUN_config_used.txt
  - A copy of the config.yaml file used to conduct this experiment
- ALLEGRO_EXAMPLE_RUN_library.txt
  - Contains your Cas9 guide RNA library, without the PAM
- ALLEGRO_EXAMPLE_RUN.csv
  - Contains a detailed report on the guide RNAs in the library, including their sequences with PAM, target files, target reference names, strands, positions, efficiency scores, and the target file paths
- ALLEGRO_EXAMPLE_RUN.txt
  - Reports how many targets each guide in the library cuts
- solver_log.txt
  - Reports the total number of Cas9 guides discovered, the total number of genes/references (if track E) or the total number of input files/species (if track A), cut multiplicity, beta, and the LP and ILP non-zero values for each guide
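After a run, you may want to confirm that all five files were written. The following is a minimal sketch, not part of ALLEGRO: the function names are hypothetical, and the file-name pattern is inferred from the example list above (four files derived from the experiment name, plus solver_log.txt).

```python
from pathlib import Path


def expected_output_files(experiment_name):
    """File names expected for an experiment, following the list above."""
    return [
        f"{experiment_name}_config_used.txt",
        f"{experiment_name}_library.txt",
        f"{experiment_name}.csv",
        f"{experiment_name}.txt",
        "solver_log.txt",
    ]


def missing_output_files(output_dir, experiment_name):
    """Return the expected file names that are absent from output_dir."""
    output_dir = Path(output_dir)
    return [
        name
        for name in expected_output_files(experiment_name)
        if not (output_dir / name).is_file()
    ]


if __name__ == "__main__":
    # Default experiment name and folder from the example run.
    missing = missing_output_files(
        "data/output/ALLEGRO_EXAMPLE_RUN", "ALLEGRO_EXAMPLE_RUN"
    )
    print("All output files present" if not missing else f"Missing: {missing}")
```

If you changed the folder or experiment name in the config file, pass those values instead of the defaults shown here.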
Continue to Tutorial (Basic Settings) or Tutorial (Advanced Settings)