3. Tutorial (Advanced Settings)
Advanced settings allow you to realize the full potential of ALLEGRO.
- scorer or -s
  - The default scorer for ALLEGRO is 'dummy', which automatically assigns a score of 1.0 to each guide, essentially treating all guides as equal. You may change this value to 'ucrispr', which uses the uCRISPR cleavage efficacy predictor developed by Zhang et al.: Zhang, Dong, et al. "Unified energetics analysis unravels SpCas9 cleavage activity for optimal gRNA design." Proceedings of the National Academy of Sciences 116.18 (2019): 8693-8698.
    Scoring guides may take a long time, which is why we recommend running ALLEGRO in a tmux or screen session in the background. ALLEGRO caches the scored guides so that their scores do not have to be recalculated in a later experiment. To clear the cached scores for a species, delete the 'data/cache/{species_name}.pickle' file. Doing so prompts ALLEGRO to recalculate the scores for the guides in that file using uCRISPR.
  - Setting the scorer to 'ucrispr' requires a beta value, which we discuss next.
- beta or -b
  - Integer value. The final size of the gRNA set must be less than or equal to this value. Think of it as your budget. Beta works best when paired with scorer: 'ucrispr', in which case a value other than 0 is required unless you want ALLEGRO to search for one (see the next point). This is because changing the scorer from 'dummy' changes the objective of ALLEGRO from minimizing the set size to maximizing the overall score of the selected guides, while keeping the size of the set less than or equal to beta.
  - If scorer is set to 'ucrispr' and beta is set to 0 or to too small a value, ALLEGRO will attempt to find the smallest possible beta for you within the allowed time (set by early_stopping_patience, which we discuss later), provided that enable_solver_diagnostics is enabled, which it is by default. In short, when you change your scorer from 'dummy' to 'ucrispr', either specify a beta yourself or leave it as 0 so that ALLEGRO can find the smallest beta for you and output the scored guides.
  - The point of beta is to allow ALLEGRO to select guides with better cutting efficacy at the cost of a larger set. For example, one guide may score 99.0 as determined by uCRISPR but target only a single gene, while another guide may score 45.0 and cut in 3 genes. If ALLEGRO is restricted by a small beta, it may include the second guide in its output, sacrificing the overall cutting efficacy. If a larger beta is given, ALLEGRO has more freedom to choose the 99.0-scoring guide, perhaps together with two more high-scoring guides, sacrificing the overall set size. This tradeoff is the essence of using beta and a guide scorer.
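The tradeoff described above can be sketched with a toy brute-force search. This is only an illustration and not how ALLEGRO solves the problem: the guide names, the scores other than 99.0 and 45.0, and the assumption that the objective is the sum of the selected guides' scores are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: guide name -> (uCRISPR-style score, genes it cuts).
GUIDES = {
    "g_broad": (45.0, {"gene1", "gene2", "gene3"}),
    "g_top": (99.0, {"gene1"}),
    "g_high2": (90.0, {"gene2"}),
    "g_high3": (85.0, {"gene3"}),
}
GENES = {"gene1", "gene2", "gene3"}


def best_library(beta):
    """Return the highest-scoring guide set of size <= beta that covers every gene.

    Assumes the objective is the sum of guide scores (an assumption made
    for this sketch only).
    """
    best, best_score = None, float("-inf")
    for size in range(1, beta + 1):
        for subset in combinations(sorted(GUIDES), size):
            covered = set().union(*(GUIDES[g][1] for g in subset))
            if covered != GENES:
                continue  # every gene must be targeted by at least one guide
            total = sum(GUIDES[g][0] for g in subset)
            if total > best_score:
                best, best_score = subset, total
    return best, best_score


print(best_library(1))  # a tight budget forces the broad, low-scoring guide
print(best_library(3))  # a larger budget allows three high-scoring guides
```

With a budget of 1 the only feasible library is the 45.0 guide; with a budget of 3 the search swaps it for the 99.0 guide plus two other high-scoring guides.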
- patterns_to_exclude or -pte
  - List of strings. ALLEGRO will only output guides that do not contain any of the IUPAC patterns in this list. Supports up to 5 chained IUPAC codes; e.g., 'RYSN'.
  - The exception to the 5-code rule above is when positional nucleotides are used in conjunction with 'N's. For example, entering 'NNNNNNNCNNNNGNNNN' will exclude guides with G and C in positions 4 and 9 distal to the PAM.
  - Supports individual nucleotides; e.g., 'TTTT' excludes guides with quad-Ts in their sequence (and consequently excludes sequences with more than 4 consecutive Ts).
  - Be careful not to place common nucleotides or short IUPAC codes here, such as just 'A' or 'AG', as you may end up excluding most or all guides from the calculation.
  - As another example, inputting 'WS' will exclude all guides with an A or a T followed by a G or a C.
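To see what a given pattern would match before putting it in the config, the IUPAC codes can be expanded into a regular expression. The sketch below is an illustration using standard IUPAC nucleotide codes and a plain substring search; the function names are hypothetical, it is not ALLEGRO's own matching code, and it does not model how ALLEGRO anchors positional patterns relative to the PAM.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[GC]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC pattern such as 'WS' into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in pattern.upper()))


def is_excluded(guide, patterns):
    """True if the guide contains any of the patterns anywhere in its sequence."""
    return any(iupac_to_regex(p).search(guide.upper()) for p in patterns)


guide = "AAAAGTCTGTATAGAGAAGT"
print(is_excluded(guide, ["TTTT"]))  # no run of four Ts in this guide
print(is_excluded(guide, ["WS"]))    # 'AG' is an A/T followed by a G/C
```

The second call shows why short patterns are risky: almost any guide contains an A or T followed by a G or C somewhere.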
- output_offtargets or -off
  - Boolean True/(False) value. Setting it to True directs ALLEGRO to use Bowtie to align the output library against background fasta files. Which files to align the library against is specified by input_species_offtarget_dir (-isod), which should contain the background fasta files to align against, and input_species_offtarget_column (-isoc), which tells ALLEGRO which column in the CSV file provided by input_species_path (from Basic Settings) contains the names of the files to align against. For example, if your input_species_path: 'input_species.csv' looks like the following:

    species_name  filename           offtarget_background
    test_fasta    my_test_fasta.fna  background.fna

    then input_species_offtarget_column should be set to 'offtarget_background'. You may also align the output library back to the input species fasta file ('my_test_fasta.fna') by specifying input_species_offtarget_column: 'filename' (thus not needing a third column at all).
  - Enabling this parameter will use seed_region_is_n_upstream_of_pam (-seed) and report_up_to_n_mismatches further down in the config.yaml. ALLEGRO will then output a file called 'targets.csv' under your output experiment folder, containing guides from the output library that have the same exact seed region upstream of the PAM but have up to report_up_to_n_mismatches mismatches in the seed-distal region of the target sequence.
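The relationship between the directory and the column can be sketched as follows. This is an assumption-laden illustration rather than ALLEGRO's loading code: it assumes the file names in the chosen column are simply joined onto input_species_offtarget_dir, and 'backgrounds' is a hypothetical directory name.

```python
import csv
import io
from pathlib import PurePosixPath

# The example CSV from the text above, inlined so the sketch is self-contained.
INPUT_SPECIES_CSV = """species_name,filename,offtarget_background
test_fasta,my_test_fasta.fna,background.fna
"""


def background_files(csv_text, offtarget_dir, offtarget_column):
    """Map each species to the fasta file its library would be aligned against."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        row["species_name"]: str(PurePosixPath(offtarget_dir) / row[offtarget_column])
        for row in rows
    }


# 'backgrounds' is a hypothetical directory used only for illustration.
print(background_files(INPUT_SPECIES_CSV, "backgrounds", "offtarget_background"))
# Pointing the column at 'filename' aligns the library back to the input fasta.
print(background_files(INPUT_SPECIES_CSV, "backgrounds", "filename"))
```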
- report_up_to_n_mismatches or -reportmm
  - Integer value in the range [0-3] inclusive, only used if output_offtargets: True. This is the '-v' parameter in Bowtie and cannot exceed 3. The mismatches are considered only in the seed-distal region, after the first seed_region_is_n_upstream_of_pam bases.
- preclustering or -prec
  - Boolean True/(False) value; affects running time performance.
  - Allows a guide within up to the set number of mismatches (after the seed region) of another guide to "inherit" the second guide's targets, essentially rendering the second guide redundant and reducing the total number of guides needed.
  - Works best when unscored guides are present (scorer: 'dummy'), as it does not consider scores.
  - Uses the seed_region_is_n_upstream_of_pam and mismatches_allowed_after_seed_region parameters.
  - Consider the following simple example where in 'my_test_fasta.fna' we have:

    >gene1
    AAAAGTCTGTATAGAGAAGTTGG
    >gene2
    CAAAGTCTGTATAGAGAAGTTGG
    >gene3
    TAAAGTCTGTATAGAGAAGTTGG

    The only difference between the sequences is the left-most nucleotide. Using the basic settings of track: 'track_e' and multiplicity: 1, ALLEGRO will output 3 guides, one to cover each gene. By turning preclustering on and setting seed_region_is_n_upstream_of_pam: 12 and mismatches_allowed_after_seed_region: 1, ALLEGRO will output a single guide: AAAAGTCTGTATAGAGAAGT. The 3 guides have been pre-clustered into 1, as if this single guide targeted all 3 genes.
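The example above can be reproduced with a small greedy clustering sketch. It is not ALLEGRO's implementation: the function names are hypothetical, and it assumes the seed is the PAM-proximal stretch at the 3' end of each 20-nt guide and that the first guide seen becomes the cluster representative.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def can_inherit(rep, guide, seed_len, max_mismatches):
    """True if 'rep' may take over the targets of 'guide'."""
    same_seed = rep[-seed_len:] == guide[-seed_len:]          # exact seed match
    distal_mm = hamming(rep[:-seed_len], guide[:-seed_len])   # seed-distal mismatches
    return same_seed and distal_mm <= max_mismatches


def precluster(guides, seed_len, max_mismatches):
    """Greedily merge guides; returns representative -> inherited targets."""
    clusters = {}
    for guide, targets in guides.items():
        for rep in clusters:
            if len(rep) == len(guide) and can_inherit(rep, guide, seed_len, max_mismatches):
                clusters[rep] |= targets  # representative inherits the targets
                break
        else:
            clusters[guide] = set(targets)  # no match: start a new cluster
    return clusters


# Guides from the example, with the TGG PAM removed.
GUIDES = {
    "AAAAGTCTGTATAGAGAAGT": {"gene1"},
    "CAAAGTCTGTATAGAGAAGT": {"gene2"},
    "TAAAGTCTGTATAGAGAAGT": {"gene3"},
}
print(precluster(GUIDES, seed_len=12, max_mismatches=1))
```

With seed_len=12 and max_mismatches=1 the three guides collapse onto the first one, which then covers all three genes; with max_mismatches=0 they stay separate.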
- postclustering or -postc
  - Boolean True/(False) value; affects running time performance. After a guide RNA library is generated as output, ALLEGRO clusters the guides in the output library using the two parameters seed_region_is_n_upstream_of_pam and mismatches_allowed_after_seed_region, and adds a column called "cluster" to the output CSV file. Guides in the same cluster differ from each other after the set seed region by no more than the set number of allowed mismatches.
  - Post-clustering may not generate the same results as pre-clustering because some guides may never be chosen for the final set, and so are never post-clustered.
- early_stopping_patience or -esp
  - Integer value, measured in seconds; defaults to 60. Only used when solving the ILP, if there are remaining feasible guides with fractional values after solving the LP. ALLEGRO tells OR-Tools to stop searching for an optimal solution after this many seconds.
  - Increasing this value may, but is not guaranteed to, produce a smaller output set size.
  - If a feasible solution is not found within this time frame, ALLEGRO will automatically restart the search with a larger patience (given that enable_solver_diagnostics is enabled).
- enable_solver_diagnostics or -esd
  - Boolean (True)/False value. When a problem is deemed unsolvable (e.g., Status: MPSOLVER_INFEASIBLE), enabling diagnostics will attempt to relax each constraint and re-solve the problem.
  - If the new problem with the relaxed constraint is solvable, ALLEGRO outputs the culprit gene/species.
  - Currently, to stop this process, you need to find the PID of the python process running ALLEGRO using:
    $ top
    and kill it manually:
    $ kill -SIGKILL PID
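The idea of relaxing one constraint at a time to find a culprit can be sketched with a toy model. This is not ALLEGRO's diagnostic code and it does not call OR-Tools: it assumes, purely for illustration, that the problem is feasible when every gene has at least multiplicity candidate guides, and all names are hypothetical.

```python
def is_feasible(candidates, multiplicity, relaxed=None):
    """Toy feasibility check: every gene except the relaxed one has enough guides."""
    return all(
        len(guides) >= multiplicity
        for gene, guides in candidates.items()
        if gene != relaxed
    )


def find_culprits(candidates, multiplicity):
    """Genes whose constraint, when dropped on its own, makes the problem solvable."""
    if is_feasible(candidates, multiplicity):
        return []  # nothing to diagnose
    return [
        gene
        for gene in candidates
        if is_feasible(candidates, multiplicity, relaxed=gene)
    ]


# Hypothetical gene -> candidate guides mapping.
CANDIDATES = {
    "gene1": {"guideA", "guideB"},
    "gene2": set(),  # no usable guide, so its coverage constraint can never hold
    "gene3": {"guideC"},
}
print(find_culprits(CANDIDATES, multiplicity=1))
```

Note that this single-relaxation approach only pinpoints a culprit when dropping one constraint is enough to make the problem solvable.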
Do not hesitate to create a GitHub issue if you read through this documentation and could not find an answer to your question/issue. Click here to go back to the homepage.