fix spaces
SarahOuologuem committed Feb 19, 2024
1 parent f1bd6ef commit b81fe5a
Showing 1 changed file with 24 additions and 19 deletions.
43 changes: 24 additions & 19 deletions docs/yaml_docs/spatial_preprocess.md
@@ -46,6 +46,7 @@ Specified by the following three parameters:

With the preprocess_spatial workflow, one or multiple `MuData` objects can be preprocessed in one run. The workflow **reads in all `.h5mu` objects of a directory**. The `MuData` objects in the directory need to be of the same assay (vizgen or visium). The workflow then runs the preprocessing of each `MuData` object separately with the same parameters that are specified in the yaml file.
<br>

<span class="parameter">input_dir</span> `String`, Mandatory parameter<br>
Path to the folder containing all input `h5mu` files.
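
As a rough illustration of the directory-based input described above, the sketch below collects every `.h5mu` file in a folder. The helper name `list_h5mu_files` is hypothetical and not part of Panpipes; how the workflow itself discovers files is not shown in this document.

```python
from pathlib import Path


def list_h5mu_files(input_dir):
    """Return all .h5mu files directly inside input_dir, sorted by name.

    Hypothetical helper for illustration only; not taken from Panpipes.
    """
    return sorted(Path(input_dir).glob("*.h5mu"))
```

Each returned path would then be preprocessed separately with the same yaml parameters.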

@@ -63,6 +64,8 @@ With the preprocess_spatial workflow, one or multiple `MuData` objects can be pr
- <span class="parameter">keep_barcodes</span> `String`, Default: None<br>
Path to a CSV file with **no header** containing the barcodes you want to keep. Barcodes that are not in the file are removed from the dataset before it is filtered with the thresholds specified below.
<br>
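
The `keep_barcodes` behaviour can be sketched as follows, assuming the file holds one barcode per row in its first column. Both function names are hypothetical and only illustrate the idea of reading a header-less file and dropping every barcode that is not listed; they are not the Panpipes implementation.

```python
import csv


def read_barcodes(path):
    """Read a header-less CSV and return its first column as a set of barcodes."""
    with open(path, newline="") as handle:
        # Every row is data because the file has no header line.
        return {row[0].strip() for row in csv.reader(handle) if row}


def keep_listed(barcodes, keep):
    """Keep only the barcodes present in `keep`, preserving the input order."""
    return [bc for bc in barcodes if bc in keep]
```

Because the file has no header, the first row is treated as a barcode rather than as a column name.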


With the parameters below you can specify thresholds for filtering. The filtering is fully customisable to any columns in `.obs` or `.var`; you are not restricted to the default columns. When specifying a column name, please make sure it exactly matches the column name in the h5mu object. <br> Please also make sure that the specified metrics are present in all `h5mu` objects of the `input_dir`, i.e. in every `MuData` object the preprocessing is run on.


@@ -86,6 +89,7 @@ With the parameters below you can specify thresholds for filtering. The filterin

The parameters below specify which metrics of the filtered data to plot. As for the [QC](./spatial_qc.md), violin and spatial embedding plots are generated for each slide separately.
<br>

<span class="parameter">plotqc</span><br>
- <span class="parameter">grouping_var</span> `String`, Default: None<br>
Comma-separated string without spaces (e.g. _sample_id,batch_) of categorical columns in `.obs`. One violin is created for each group in the violin plot. Optional; can be left empty.
@@ -97,48 +101,49 @@ The parameters below specify which metrics of the filtered data to plot. As for
## 4. Normalization, HVG Selection, and PCA Options

### **4.1 Normalization and HVG Selection** <br>

`Panpipes` offers two different normalization and HVG selection flavours, `'seurat'` and `'squidpy'`. <br> The `'seurat'` flavour first selects HVGs on the raw counts using analytic Pearson residuals, i.e. [scanpy.experimental.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.highly_variable_genes.html). Afterwards, analytic Pearson residual normalization is applied, i.e. [scanpy.experimental.pp.normalize_pearson_residuals](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.normalize_pearson_residuals.html). Parameters of both functions can be specified by the user in the yaml file. <br>The `'squidpy'` flavour runs the basic scanpy normalization and HVG selection functions, i.e. [scanpy.pp.normalize_total](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.normalize_total.html), [scanpy.pp.log1p](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.log1p.html), and [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
<br>
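
To make the `'squidpy'` flavour more concrete, here is a minimal pure-Python sketch of total-count normalization followed by a log1p transform on a small cells-by-genes matrix. It assumes the documented scanpy default of scaling each cell to the median total count when no target sum is given; whether Panpipes passes a different target sum is not stated here, and the function name `normalize_total_log1p` is hypothetical.

```python
import math
from statistics import median


def normalize_total_log1p(counts, target_sum=None):
    """Scale each cell (row) to a common total, then apply log(1 + x).

    Illustration only. If target_sum is None, the median of the non-zero
    per-cell totals is used (assumed default, see the lead-in).
    """
    totals = [sum(row) for row in counts]
    if target_sum is None:
        target_sum = median(t for t in totals if t > 0)
    normalized = []
    for row, total in zip(counts, totals):
        # Cells without any counts stay at zero instead of dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in row])
    return normalized
```

In the workflow itself these steps are delegated to the scanpy functions linked above; the sketch only shows what they compute on a toy input.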

<span class="parameter">norm_hvg_flavour</span>[`'squidpy'`, `'seurat'`], Default: None<br>
Normalization and HVG selection flavour to use. If None, neither normalization nor HVG selection is run.
<br>

___Parameters for `norm_hvg_flavour` == `'squidpy'`___ <br>
- <span class="parameter">squidpy_hvg_flavour</span>[`'seurat'`,`'cellranger'`,`'seurat_v3'`], Default: 'seurat'<br>
Flavour to select HVGs, i.e. the `flavor` parameter of the function [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
<span class="parameter">squidpy_hvg_flavour</span>[`'seurat'`,`'cellranger'`,`'seurat_v3'`], Default: 'seurat'<br>
Flavour to select HVGs, i.e. the `flavor` parameter of the function [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).

- <span class="parameter">min_mean</span>`Float`, Default: 0.05<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
<span class="parameter">min_mean</span>`Float`, Default: 0.05<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).

- <span class="parameter">max_mean</span>`Float`, Default: 1.5<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
<span class="parameter">max_mean</span>`Float`, Default: 1.5<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).

- <span class="parameter">min_disp</span>`Float`, Default: 0.5<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).
<span class="parameter">min_disp</span>`Float`, Default: 0.5<br>
Parameter in [scanpy.pp.highly_variable_genes](https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html).

___Parameters for `norm_hvg_flavour` == `'seurat'`___ <br>
- <span class="parameter">theta</span>`Float`, Default: 100<br>
The negative binomial overdispersion parameter for Pearson residuals. The same value is used for [HVG selection](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.highly_variable_genes.html) and [normalization](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.normalize_pearson_residuals.html).
<span class="parameter">theta</span>`Float`, Default: 100<br>
The negative binomial overdispersion parameter for Pearson residuals. The same value is used for [HVG selection](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.highly_variable_genes.html) and [normalization](https://scanpy.readthedocs.io/en/stable/generated/scanpy.experimental.pp.normalize_pearson_residuals.html).

- <span class="parameter">clip</span>`Float`, Default: None<br>
Specifies clipping of the residuals. <br>`clip` can be specified as: <br> <ul><li> <u>None</u>: residuals are clipped to the interval [-sqrt(n_obs), sqrt(n_obs)] </li><li><u>A float value</u>: for a float c, residuals are clipped to the interval [-c, c]</li> <li> <u>np.Inf</u>: no clipping</li></ul>
<span class="parameter">clip</span>`Float`, Default: None<br>
Specifies clipping of the residuals. <br>`clip` can be specified as: <br> <ul><li> <u>None</u>: residuals are clipped to the interval [-sqrt(n_obs), sqrt(n_obs)] </li><li><u>A float value</u>: for a float c, residuals are clipped to the interval [-c, c]</li> <li> <u>np.Inf</u>: no clipping</li></ul>
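
How `theta` and `clip` interact can be illustrated with a small pure-Python sketch of analytic Pearson residuals on a cells-by-genes count matrix. It uses the usual formulation (expected count from row and column sums, negative binomial variance `mu + mu**2 / theta`) and assumes every cell and every gene has at least one count. The function name `pearson_residuals` is hypothetical; the workflow itself relies on the scanpy functions linked above.

```python
import math


def pearson_residuals(counts, theta=100.0, clip=None):
    """Analytic Pearson residuals for a list-of-rows count matrix.

    Illustration only. Assumes no all-zero rows or columns.
    clip=None clips to [-sqrt(n_obs), sqrt(n_obs)]; math.inf disables clipping.
    """
    n_obs = len(counts)
    cell_sums = [sum(row) for row in counts]
    gene_sums = [sum(col) for col in zip(*counts)]
    total = sum(cell_sums)
    if clip is None:
        clip = math.sqrt(n_obs)
    residuals = []
    for row, cell_sum in zip(counts, cell_sums):
        out = []
        for value, gene_sum in zip(row, gene_sums):
            mu = cell_sum * gene_sum / total  # expected count under the null model
            z = (value - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, z)))  # clip to [-clip, clip]
        residuals.append(out)
    return residuals
```

A larger `theta` moves the variance model towards Poisson, while a smaller `clip` limits the influence of extreme counts on downstream steps such as PCA.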

___Parameters for both `norm_hvg_flavour` flavours___ <br>
- <span class="parameter">n_top_genes</span>`Integer`, Default: 2000<br>
Number of genes to select. Mandatory for `norm_hvg_flavour='seurat'` and `squidpy_hvg_flavour='seurat_v3'`.
<span class="parameter">n_top_genes</span>`Integer`, Default: 2000<br>
Number of genes to select. Mandatory for `norm_hvg_flavour='seurat'` and `squidpy_hvg_flavour='seurat_v3'`.

- <span class="parameter">filter_by_hvg</span>`Boolean`, Default: False<br>
Subset the data to the HVGs.
<span class="parameter">filter_by_hvg</span>`Boolean`, Default: False<br>
Subset the data to the HVGs.

- <span class="parameter">hvg_batch_key</span>`String`, Default: None<br>
If specified, HVGs are selected within each batch separately and merged.
<span class="parameter">hvg_batch_key</span>`String`, Default: None<br>
If specified, HVGs are selected within each batch separately and merged.


### **4.2 PCA**

After normalization and HVG selection, PCA is run, and a PCA plot and an elbow plot are generated. The number of PCs specified below is used for both the PCA computation and the elbow plot.
<br>

<span class="parameter">n_pcs</span>`Integer`, Default: 50<br>
Number of PCs to compute.
