diff --git a/docs/yaml_docs/pipeline_integration_ym.md b/docs/yaml_docs/pipeline_integration_yml.md similarity index 71% rename from docs/yaml_docs/pipeline_integration_ym.md rename to docs/yaml_docs/pipeline_integration_yml.md index a7575f69..dfbaa610 100644 --- a/docs/yaml_docs/pipeline_integration_ym.md +++ b/docs/yaml_docs/pipeline_integration_yml.md @@ -25,6 +25,7 @@ You can download the different integration pipeline.yml files here: Computing resources to use, specifically the number of threads used for parallel jobs. Specified by the following parameters: + - threads_high `Integer`, Default: 1
Number of threads used for high intensity computing tasks. For each thread, there must be enough memory to load your MuData object which was created in the preprocessing step of @@ -77,7 +78,7 @@ Prefix for the sample that comes out of the filtering/ preprocessing steps of th The column you want to batch correct on, if a comma-separated list is specified then all will be used simultaneously - ### Harmony arguments +### Harmony arguments - harmony: Basic parameters required to run harmony: @@ -144,7 +145,7 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information The method can either be scanpy or hnsw -## Protein +## Protein modality prot: Batch correction for the protein modality is specified by the following parameters: @@ -160,7 +161,7 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information The column you want to batch correct on, if a comma-separated list is specified then all will be used simultaneously - ### Harmony arguments +### Harmony arguments - harmony: Basic parameters required to run harmony: @@ -170,7 +171,9 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information - npcs `Integer`, Default: 30
For more information on harmony check https://portals.broadinstitute.org/harmony/reference/RunHarmony.html + ### BBKNN arguments + Check https://bbknn.readthedocs.io/en/latest/ for more information - bbknn: - neighbors_within_batch: `Integer`, Default: 3
@@ -191,7 +194,7 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information The method can either be scanpy or hnsw -## atac +## ATAC modality atac: Batch correction for the ATAC modality is specified by the following parameters: @@ -202,15 +205,15 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information - dimred `String`, Default: PCA
Defines which dimensionality reduction to use, PCA or LSI - - tools `String` (comma-separated), Default: (CHEEEEECKKKK)
+ - tools `String` (comma-separated), Default: harmony
Defines the method used to run batch correction, multiple can be selected. - choices: harmony, bbknn, combat + choices: harmony, bbknn - column `String` (comma-separated), Default: sample_id
The column you want to batch correct on, if a comma-separated list is specified then all will be used simultaneously - ### Harmony arguments +### Harmony arguments - harmony: Basic parameters required to run harmony: @@ -220,12 +223,18 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information - npcs `Integer`, Default: 30
For more information on harmony check https://portals.broadinstitute.org/harmony/reference/RunHarmony.html + ### BBKNN arguments + Check https://bbknn.readthedocs.io/en/latest/ for more information + - bbknn: + - neighbors_within_batch: `Integer`, Default: 3
+ ### Find neighbour parameters + - neighbors: `String`, Default: &atac_neighbors
- npcs `Integer`, Default: 30
@@ -254,113 +263,130 @@ Check https://bbknn.readthedocs.io/en/latest/ for more information This is the column you want to run a batch correction on, multiple can be selected simultaneously. Extra parameters: - - totalvi: + +### TotalVI arguments + + **totalvi has to run on both rna and protein data** + These are the basic totalvi parameters required, you can add more if it fits your analysis better. - - modalities `String`(Comma separated), Default: rna,prot
- totalvi has to run on both rna and protein data - + - totalvi: + + - modalities `String`(Comma separated), Default: rna,prot
- exclude_mt_genes `Boolean`, Default: True
- - mr_column `String`, Default: mt
+ - mt_column `String`, Default: mt
- filter_by_hvg `Boolean`, Default: True
+ To filter manually, create a column called prot_outliers in mdata['prot'] - - filter_prot_outliers `Boolean`, Default: False
+ - filter_prot_outliers `Boolean`, Default: False
- model_args: - latent_distribution`String`, Default: "normal"
- - training_args: + - training_args: - max_epochs`Integer`, Default: 100
- train_size`Float`, Default: 0.9
- early_stopping `Boolean`, Default: True
- - training_args `String`, Default: Nonw
+ - training_plan `String`, Default: None
+### MultiVI arguments + **multivi has to run on both rna and atac data** - - MultiVI: - These are the basic MultiVI parameters required, you can add more if it fits your analysis better. Leave arguments blank for default + These are the basic multivi parameters required, you can add more if it fits your analysis better. + + Setting lowmen to True subsets the atac modality to the top 25k HVF. This is recommended on large datasets because scvi-tools currently requires atac and rna to be concatenated, which is memory intensive. Note that >100GB of RAM are required to concatenate atac,rna with 15k cells and 120k total features (union rna,atac) -By setting lowmen to True it will subset the atac to the top 25k HVF which is recommended to deal with concatenation of atac,rna on large datasets which at the moment is suboptimally required by scvitool. Note that >100GB of RAM are required to concatenate atac,rna with 15k cells and 120k total features (union rna,atac). + - MultiVI: - - lowmen `Boolean`, Default: True
- - - model_args `String`, Default: None
- - n_hidden `String`, Default: None
- - n_latent `Boolean`, Default: True
- - region_factors `Boolean`, Default: True
- - latent_distribution `String`, Default: normal
- - deeply_inject_covariates `Boolean`, Default: False
- - fully_paired `Boolean`, Default: False
- - training_args - - max_epochs `Integer`, Default: 500
- - lr `Float`, Default: 0.0001
- - use_gpu `String`, Default: None
+ - lowmen `Boolean`, Default: True
+ - model_args `String`, Default: None
+ - n_hidden `String`, Default: None
+ - n_latent `Boolean`, Default: True
+ - region_factors `Boolean`, Default: True
+ - latent_distribution `String`, Default: normal
+ - deeply_inject_covariates `Boolean`, Default: False
+ - fully_paired `Boolean`, Default: False
+ - training_args + - max_epochs `Integer`, Default: 500
+ - lr `Float`, Default: 0.0001
+ - use_gpu `String`, Default: None
Leave blank for default str, int and bool. - - train_size `Float`, Default: 0.9
- - validation_size `String`, Default: None
+ - train_size `Float`, Default: 0.9
+ - validation_size `String`, Default: None
Leave blank for default - - batch_size `Integer`, Default: 128
- - weight_decay `Float`, Default: 0.001
- - eps `Float`, Default: 1e-08
- - early_stopping `Boolean`, Default: True
- - save_best `Boolean`, Default: True
- - check_val_every_n_epoch `String`, Default: None
+ - batch_size `Integer`, Default: 128
+ - weight_decay `Float`, Default: 0.001
+ - eps `Float`, Default: 1e-08
+ - early_stopping `Boolean`, Default: True
+ - save_best `Boolean`, Default: True
+ - check_val_every_n_epoch `String`, Default: None
Leave blank for the default integer - - n_steps_kl_warmup `String`, Default: None
+ - n_steps_kl_warmup `String`, Default: None
Leave blank for the default integer - - n_epochs_kl_warmup `Integer`, Default: 50
- - adversarial_mixing `Boolean`, Default: True
- - training_plan `String, Default: None
- - mofa: These are the basic mofa parameters required, you can add more if it fits your analysis better. - - modalities `String` (Comma separated), Default: rna, prot, atac
- - fliter_by_hgv `Boolean`, Default: True
- - n_factors `Integer`, Default: 10
- - n_iterations `Integer`, Default: 1000
- - convergence_mode `String`, Default: fast
- Choice between fast, medium, and slow - - save_parameters `Boolean`, Default: False
- - outfile `String`, Default: path/to/h5ad/to_save_model_to
(CHHHHECCCKKKKKKKKK) + - n_epochs_kl_warmup `Integer`, Default: 50
+ - adversarial_mixing `Boolean`, Default: True
+ - training_plan `String`, Default: None
+ + +### Mofa + +**Requires at least two modalities, however can run with all three** + + These are the basic mofa parameters required, you can add more if it fits your analysis better. + +- mofa: + - modalities `String` (Comma separated), Default: rna,prot,atac
+ - filter_by_hvg `Boolean`, Default: True
+ - n_factors `Integer`, Default: 10
+ - n_iterations `Integer`, Default: 1000
+ - convergence_mode `String`, Default: fast
+ Choice between fast, medium, and slow + - save_parameters `Boolean`, Default: False
+ - outfile `String`, Default: `path/to/h5ad/to_save_model_to`
+### WNN - - WNN: - These are the basic WNN parameters required, you can add more if it fits your analysis better. +**Requires at least two modalities, however can run with all three** - - modalities `String` (Comma separated), Default: rna, prot, atac
- - batch_corrected `String`, Default: None
-Set the modality to one method ("bbknn", "scVI", "harmony", "scanorama"), if left None, a default de novo calculation of neighbours on non-corrected data for that modality using specified parameters - - rna `String`, Default: None
+ These are the basic WNN parameters required, you can add more if it fits your analysis better. + +- WNN: + - modalities `String` (Comma separated), Default: rna, prot, atac
+ - batch_corrected `String`, Default: None
+ + Set the modality to one method ("bbknn", "scVI", "harmony", "scanorama"); if left as None, neighbours are calculated de novo on the non-corrected data for that modality using the specified parameters + - rna `String`, Default: None
Options here include "bbknn" and "harmony" - - prot `String`, Default: None
+ - prot `String`, Default: None
Options here include "harmony" - - atac `String`, Default: None
+ - atac `String`, Default: None
- - knn: - - rna `String`, Default: *rna_neighbors
- - prot `String`, Default: *prot_neighbors
- - atac `String`, Default: *atac_neighbors
+ - knn: + - rna `String`, Default: *rna_neighbors
+ - prot `String`, Default: *prot_neighbors
+ - atac `String`, Default: *atac_neighbors
- - - n_neighbors `String`, Default: "leave blank"
+ - n_neighbors `String`, Default: "leave blank"
Leave blank to arithmetic mean across modalities neighbors - - n_bandwidth_neighbors `Integer`, Default: 20
+ - n_bandwidth_neighbors `Integer`, Default: 20
- - n_multineighbors `Integer`, Default: 200
+ - n_multineighbors `Integer`, Default: 200
- - metric `String`, Default: euclidean
+ - metric `String`, Default: euclidean
- - low_memory `Boolean`, Default: True
+ - low_memory `Boolean`, Default: True
- neighbors: - npcs `Integer`, Default: 30
-The number of principal components to calculate for neighbors and umap. If no correction is applied PCA will be calculated and used to run the UMAP. If harmony is chosen it will use the following components to create a corrected dimensionality reduction - + The number of principal components to calculate for neighbors and umap. If no correction is applied PCA will be calculated and used to run the UMAP. If harmony is chosen it will use the following components to create a corrected dimensionality reduction - k `Integer`, Default: 30
- metric `String`, Default: euclidean
Options include euclidean and cosine @@ -380,14 +406,18 @@ Grouping must be a categorical variable - rna `String`, Default: rna:total_counts
- prot `String`, Default: prot:total_counts
- - atac (CHHEECCKKKKKK) + - atac `String`, Default: atac:total_counts
- multimodal `String`, Default: rna:total_counts
### Make final object -Leave this final option blank until you have reviewed the results from running `papipes integration make full`. This step will produce a mudata object with one layer and one correction per modality, and one multimodal layer. For unimodal integration select the uncorrected version and use "no_correction" then run `panpipes integration make merge_integration`. +Leave this final option blank until you have reviewed the results from running `panpipes integration make full`. + +This step will produce a mudata object with one layer and one correction per modality, and one multimodal layer. For unimodal integration select the uncorrected version and use "no_correction". + +**Then run** `panpipes integration make merge_integration` - final_obj: - rna: @@ -404,59 +434,3 @@ Leave this final option blank until you have reviewed the results from running ` - include `Boolean`, Default: True
- bc_choice `String`, Default: totalvi
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
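
For orientation, the totalvi options documented in this diff might look roughly like the following in a pipeline.yml. This is only a sketch: the keys and default values are copied from the documented list, but the parent section that totalvi sits under and the exact indentation are assumptions, so check them against the downloaded pipeline.yml before use.

```yaml
# Sketch only. Keys and defaults follow the totalvi list in this document;
# the placement of this block inside pipeline.yml is assumed, not confirmed.
totalvi:
  # totalvi has to run on both rna and protein data
  modalities: rna,prot
  exclude_mt_genes: True
  mt_column: mt
  filter_by_hvg: True
  # to filter manually, create a column called prot_outliers in mdata['prot']
  filter_prot_outliers: False
  model_args:
    latent_distribution: "normal"
  training_args:
    max_epochs: 100
    train_size: 0.9
    early_stopping: True
  # written as documented (Default: None); how this value is interpreted
  # is up to the pipeline
  training_plan: None
```

Additional entries under model_args and training_args can be added in the same way if they suit the analysis better, as the documentation notes.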