Parameter file

Parameters file: the Snakemake configfile

The parameters file contains the required and optional settings needed to run the pipeline.

Be careful: in a YAML file, indentation is significant.

This file is organized in two parts:

1. steps: choose the steps to run

Name	Description	Example	Default value	Possible values
`Steps`	steps to run	["Alignment_countTable_GE","Droplets_QC_GE","Filtering_GE"]	No default value	"Alignment_countTable_GE", "Alignment_countTable_ADT", "Alignment_annotations_TCR_BCR", "Droplets_QC_GE", "Filtering_GE", "Norm_DimRed_Eval_GE", "Clust_Markers_Annot_GE", "Adding_ADT", "Adding_TCR", "Adding_BCR", "Cerebro"
`Tmp`	temporary directory	"/mnt/beegfs/scratch/m_aglave/tmp/"	/tmp	NA

Note: for more details on the steps, see the Pipeline details page of the wiki.
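
As a rough sketch, the first part of the configfile might look like the fragment below. It assumes that `Steps` and `Tmp` are top-level YAML keys, which this page does not state explicitly; the values come from the Example column.

```yaml
# Part 1 of the configfile (assumed layout: top-level keys)
Steps: ["Alignment_countTable_GE", "Droplets_QC_GE", "Filtering_GE"]
Tmp: "/mnt/beegfs/scratch/m_aglave/tmp/"
```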

2. parameters for each step

Alignment_countTable_GE:

Name	Description	Example	Default value	Possible values
`sample.name.ge`	list of gene expression sample names	["sample1_GE", "sample2_GE"]	No default value	NA
`input.dir.ge`	absolute path to the gene expression fastq files	"/mnt/beegfs/userdata/m_aglave/fastq/"	No default value	NA
`output.dir.ge`	absolute path to output folder	"/mnt/beegfs/userdata/m_aglave/pipeline/output/"	No default value	NA
`sctech`	10X technology used to generate the fastq files	"10xv2"	"10xv3"	"10xv2","10xv3"
`kindex.ge`	absolute path to the index file for the gene expression alignment	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir.kidx"	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir.kidx"	NA
`tr2g.file.ge`	absolute path to the tr2g file for the gene expression alignment	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir_tr2gs.txt"	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir_tr2gs.txt"	NA
`reference.txt`	text describing the gene expression alignment reference, for the Materials and Methods	"Ensembl reference transcriptome v99 corresponding to the homo sapiens GRCH38 build"	"<insert_you_reference_here>"	NA
`fastqscreen_index`	absolute path to the configuration file of references for fastq-screen alignment	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/FASTQ_SCREEN/0.14.0/fastq_screen.conf"	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/FASTQ_SCREEN/0.14.0/fastq_screen.conf"	NA
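
A minimal sketch of this section in the configfile, assuming each step name is a top-level key with its parameters indented beneath it (the layout is an assumption; the values are taken from the Example column). Parameters that have a default value are left out here on the assumption that the pipeline then falls back to that default.

```yaml
# Assumed layout: step name as a top-level key, parameters indented below it
Alignment_countTable_GE:
  sample.name.ge: ["sample1_GE", "sample2_GE"]
  input.dir.ge: "/mnt/beegfs/userdata/m_aglave/fastq/"
  output.dir.ge: "/mnt/beegfs/userdata/m_aglave/pipeline/output/"
  sctech: "10xv3"
  reference.txt: "Ensembl reference transcriptome v99 corresponding to the homo sapiens GRCH38 build"
  # kindex.ge, tr2g.file.ge and fastqscreen_index are omitted: the table lists defaults for them
```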

Droplets_QC_GE:

Name	Description	Example	Default value	Possible values
`sample.name.ge`	list of gene expression sample names	["sample1_GE", "sample2_GE"]	determined from `sample.name.ge` of `Alignment_countTable_GE` if it exists	NA
`input.dir.ge`	absolute path to the alignment results folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/KALLISTOBUS/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/KALLISTOBUS/"]	determined from `output.dir.ge` of `Alignment_countTable_GE` if it exists	NA
`output.dir.ge`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/"]	determined from `output.dir.ge` of `Alignment_countTable_GE` if it exists	NA
`species`	species of the gene expression data	"homo_sapiens"	"homo_sapiens"	"homo_sapiens","mus_musculus"
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`emptydrops.fdr`	FDR threshold for emptydrops tool	"5E-02"	"1E-03"	NA
`droplets.limit`	minimum number of droplets required to run emptydrops	"1E+04"	"1E+05"	NA
`emptydrops.retain`	every droplet with a UMI count above this value is considered a cell	1000	No default value	NA
`translation`	boolean: translate ENSG identifiers into gene symbols	TRUE	TRUE	TRUE,FALSE
`pcmito.min`	minimum percentage of mitochondrial RNA (cells below this threshold are removed)	0	0	NA
`pcmito.max`	maximum percentage of mitochondrial RNA (cells above this threshold are removed)	0.1	0.2	NA
`pcribo.min`	minimum percentage of ribosomal RNA (cells below this threshold are removed)	0.1	0	NA
`pcribo.max`	maximum percentage of ribosomal RNA (cells above this threshold are removed)	0.9	1	NA
`min.features`	minimum number of genes (cells below this threshold are removed)	150	200	NA
`min.counts`	minimum number of UMIs (cells below this threshold are removed)	1500	1000	NA
`min.cells`	keep only genes expressed in at least this many cells (minimum cell coverage)	10	5	NA
`mt.genes.file`	RDS file with list of mitochondrial genes	"/mnt/beegfs/pipelines/single-cell/resources/GENELISTS/homo_sapiens_mito_symbols_20191001.rds"	determined from `species` parameter of `Droplets_QC_GE`	NA
`crb.genes.file`	RDS file with list of ribosomal genes	"/mnt/beegfs/pipelines/single-cell/resources/GENELISTS/homo_sapiens_cribo_symbols_20191015.rds"	determined from `species` parameter of `Droplets_QC_GE`	NA
`str.genes.file`	RDS file with list of mechanical stress genes	"/mnt/beegfs/pipelines/single-cell/resources/GENELISTS/homo_sapiens_stress_symbols_20200224.rds"	determined from `species` parameter of `Droplets_QC_GE`	NA
`translation.file`	translation file from ENSG identifiers to gene symbols	"/mnt/beegfs/pipelines/single-cell/resources/GENE_CONVERT/EnsemblToGeneSymbol_Homo_sapiens.GRCh38.txt"	determined from `species` parameter of `Droplets_QC_GE`	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
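
The fragment below sketches a `Droplets_QC_GE` section that relies on the values derived from `Alignment_countTable_GE`, so the sample names and directories are not repeated. The nesting is an assumption. Note that the defaults in the table (0.2, 1) suggest the mitochondrial and ribosomal thresholds are written as fractions between 0 and 1 rather than as percentages.

```yaml
Droplets_QC_GE:
  # sample.name.ge, input.dir.ge and output.dir.ge are omitted: per the table,
  # they are determined from Alignment_countTable_GE when that step is configured
  species: "homo_sapiens"
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  emptydrops.fdr: "1E-03"
  droplets.limit: "1E+05"
  translation: TRUE
  pcmito.min: 0
  pcmito.max: 0.2   # appears to be a fraction (0.2 = 20 %), judging from the defaults
  pcribo.min: 0
  pcribo.max: 1
```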

Filtering_GE:

Name	Description	Example	Default value	Possible values
`sample.name.ge`	list of sample names	["sample1_GE", "sample2_GE"]	determined from `sample.name.ge` of `Droplets_QC_GE` if it exists	NA
`input.rda.ge`	absolute path to the file.rda containing the seurat R object	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/QC_droplets/sample1_GE_QC_NON-NORMALIZED.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/QC_droplets/sample2_GE_QC_NON-NORMALIZED.rda"]	determined from `output.dir.ge` of `Droplets_QC_GE` if it exists	NA
`output.dir.ge`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/"]	determined from `output.dir.ge` of `Droplets_QC_GE` if it exists	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`pcmito.min`	minimum percentage of mitochondrial RNA (cells below this threshold are removed)	0	0	NA
`pcmito.max`	maximum percentage of mitochondrial RNA (cells above this threshold are removed)	0.1	0.2	NA
`pcribo.min`	minimum percentage of ribosomal RNA (cells below this threshold are removed)	0.1	0	NA
`pcribo.max`	maximum percentage of ribosomal RNA (cells above this threshold are removed)	0.9	1	NA
`min.features`	minimum number of genes (cells below this threshold are removed)	150	200	NA
`min.counts`	minimum number of UMIs (cells below this threshold are removed)	1500	1000	NA
`min.cells`	keep only genes expressed in at least this many cells (minimum cell coverage)	10	5	NA
`doublets.filter.method`	method used to filter doublets; set this parameter to "none" to disable doublet filtering	"all"	"all"	"all","scDblFinder","scds","none"
`cc.seurat.file`	RDS file with list of cell cycle genes for seurat	"/mnt/beegfs/pipelines/single-cell/resources/GENELISTS/homo_sapiens_cyclone_pairs_symbols_20191001.rds"	determined from `species` in the seurat object	NA
`cc.cyclone.file`	RDS file with list of cell cycle genes for cyclone	"/mnt/beegfs/pipelines/single-cell/resources/GENELISTS/homo_sapiens_cyclone_pairs_symbols_20191001.rds"	determined from `species` in the seurat object	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
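
For illustration, here is a `Filtering_GE` section written with the default thresholds made explicit (layout assumed, as elsewhere on this page). The example paths on this page, such as `F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/`, suggest that these values are encoded in the output folder name; that reading is an inference from the examples, not something the table states.

```yaml
Filtering_GE:
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  min.features: 200              # would correspond to "F200" in the example paths
  min.counts: 1000               # "C1000"
  pcmito.min: 0                  # "M0-0.2" together with pcmito.max
  pcmito.max: 0.2
  pcribo.min: 0                  # "R0-1" together with pcribo.max
  pcribo.max: 1
  min.cells: 5                   # "G5"
  doublets.filter.method: "all"  # "DOUBLETSFILTER_all"
```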

Norm_DimRed_Eval_GE:

Name	Description	Example	Default value	Possible values
`sample.name.ge`	list of sample names	["sample1_GE", "sample2_GE"]	determined from `sample.name.ge` of `Filtering_GE` if it exists	NA
`input.rda.ge`	absolute path to the file.rda containing the seurat R object	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/sample1_GE_FILTERED_NON-NORMALIZED.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/sample2_GE_FILTERED_NON-NORMALIZED.rda"]	determined from `output.dir.ge` of `Filtering_GE` if it exists	NA
`output.dir.ge`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/"]	determined from `output.dir.ge` of `Filtering_GE` if it exists	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`eval.markers`	list of genes to evaluate normalization and dimension reduction	"GAPDH,CD4,CD8A,CD24,CTLA4"	No default value	NA
`features.n`	number of highly variable genes to consider	3000	2000	NA
`norm.method`	name of normalization method	"LogNormalize"	"SCTransform"	"LogNormalize","SCTransform"
`dimred.method`	name of dimension reduction method	"scbfa"	"pca"	"scbfa","bpca","pca","mds"
`vtr.biases`	list of biases to regress	"nFeature_RNA,percent_mt"	No default value	percent_mt, percent_rb, nFeature_RNA, percent_st, Cyclone.Phase, and any other column name in metadata
`vtr.scale`	boolean: center the biases to regress (for scbfa and bpca only)	FALSE	FALSE	TRUE,FALSE
`dims.max`	maximum number of dimensions to compute	100	50	NA
`dims.min`	minimum number of dimensions to compute	10	3	NA
`dims.steps`	step between the numbers of dimensions to compute for evaluation	3	2	NA
`res.max`	maximum resolution to compute for evaluation	3	1.2	NA
`res.min`	minimum resolution to compute for evaluation	0.1	0.1	NA
`res.steps`	step between the resolutions to compute for evaluation	0.2	0.1	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
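
A possible `Norm_DimRed_Eval_GE` section is sketched below (layout assumed). It follows the Example column in writing gene lists and bias lists as a single comma-separated string rather than as a YAML list, and it spells out the evaluation grid with the default bounds.

```yaml
Norm_DimRed_Eval_GE:
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  eval.markers: "GAPDH,CD4,CD8A,CD24,CTLA4"   # comma-separated string, as in the Example column
  norm.method: "SCTransform"
  dimred.method: "pca"
  vtr.biases: "nFeature_RNA,percent_mt"
  dims.min: 3       # dimensions evaluated between dims.min and dims.max, in increments of dims.steps
  dims.max: 50
  dims.steps: 2
  res.min: 0.1      # resolutions evaluated between res.min and res.max, in increments of res.steps
  res.max: 1.2
  res.steps: 0.1
```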

Clust_Markers_Annot_GE:

Name	Description	Example	Default value	Possible values
`sample.name.ge`	list of sample names	["sample1_GE", "sample2_GE"]	determined from `sample.name.ge` of `Norm_DimRed_Eval_GE` if it exists	NA
`input.rda.ge`	absolute path to the normalized and reduced seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/sample1_GE_SCTransform_pca.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/sample2_GE_SCTransform_pca.rda"]	determined from `output.dir.ge` of `Norm_DimRed_Eval_GE` if it exists	NA
`output.dir.ge`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/"]	determined from `output.dir.ge` of `Norm_DimRed_Eval_GE` if it exists	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`markfile`	genes to plot on umap	"/mnt/beegfs/userdata/m_aglave/pipeline/markers1.xlsx,/mnt/beegfs/userdata/m_aglave/pipeline/markers2.xlsx"	No default value	see Additional files of Configuration in this wiki
`keep.dims`	number of dimensions to keep for clustering (from 0 to keep.dims)	25	No default value	NA
`keep.res`	resolution value for clustering	0.5	No default value	NA
`cfr.minscore`	minimum correlation score for clustifyr to consider	0.40	0.35	NA
`sr.minscore`	minimum correlation score for SingleR to consider	0.20	0.25	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
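
Since `keep.dims` and `keep.res` have no default, a `Clust_Markers_Annot_GE` section needs at least those two values once the evaluation step has been reviewed. A hedged sketch, with the layout assumed and the values taken from the Example and Default columns:

```yaml
Clust_Markers_Annot_GE:
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  keep.dims: 25        # no default: chosen after inspecting the Norm_DimRed_Eval_GE results
  keep.res: 0.5        # no default
  cfr.minscore: 0.35   # default value
  sr.minscore: 0.25    # default value
```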

Cerebro:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format) to convert into a cerebro object	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample1_GE_SCTransform_pca_25_0.5.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample2_GE_SCTransform_pca_25_0.5.rda"]	determined from seurat object output of `Adding_BCR` or `Adding_TCR` or `Adding_ADT` or `Clust_Markers_Annot_GE`	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`version`	version of cerebro to use	"v1.2"	"v1.3"	"v1.2","v1.3"
`groups`	column name (in meta.data) used to define clusters/comparisons for the Cerebro object (useful for the TCR/BCR part)	"conditions"	the last RNA clustering and samples information are already included	all column names in metadata
`remove.other.reductions`	remove all other reductions present in seurat object (keep only final umap)	FALSE	FALSE	TRUE,FALSE
`remove.other.idents`	remove all other clustering present in seurat object (keep only the last clustering)	TRUE	FALSE	TRUE,FALSE
`remove.mt.genes`	remove mitochondrial genes (to better see the other genes)	FALSE	FALSE	TRUE,FALSE
`remove.crb.genes`	remove ribosomal genes (to better see the other genes)	FALSE	FALSE	TRUE,FALSE
`remove.str.genes`	remove stress genes (to better see the other genes)	FALSE	FALSE	TRUE,FALSE
`only.pos.DE`	keep only positive DE genes from the customized differential expression analysis (marker gene identification always keeps positive genes only)	FALSE	FALSE	TRUE,FALSE
`remove.custom.DE`	remove results from customized differential expression analysis	FALSE	FALSE	TRUE,FALSE
`gmt.file`	GMT file for cerebro	"/mnt/beegfs/pipelines/single-cell/resources/DATABASE/MSIGDB/7.1/msigdb_v7.1_GMTs/msigdb.v7.1.symbols.gmt"	"/mnt/beegfs/pipelines/single-cell/resources/DATABASE/MSIGDB/7.1/msigdb_v7.1_GMTs/msigdb.v7.1.symbols.gmt"	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
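
As an illustration, a `Cerebro` section that keeps only the last clustering and adds one extra grouping column could be written as follows. The layout is assumed, and `"conditions"` is the example column name from the table, which must exist in the object's metadata.

```yaml
Cerebro:
  # input.rda omitted: per the table it is determined from the output of the last
  # Adding_* or Clust_Markers_Annot_GE step
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  version: "v1.3"
  groups: "conditions"
  remove.other.reductions: FALSE
  remove.other.idents: TRUE
```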

Alignment_countTable_ADT:

Name	Description	Example	Default value	Possible values
`sample.name.adt`	list of cell surface protein sample names	["sample1_ADT", "sample2_ADT"]	No default value	NA
`input.dir.adt`	absolute path to cell surface proteins fastq files	"/mnt/beegfs/userdata/m_aglave/fastq/"	No default value	NA
`output.dir.adt`	absolute path to output folder	"/mnt/beegfs/userdata/m_aglave/pipeline/output/"	No default value	NA
`sctech`	10X technology used to generate the fastq files	"10xv2"	"10xv3"	"10xv2","10xv3"
`kindex.adt`	absolute path to the index file for the alignment	"/mnt/beegfs/userdata/m_aglave/ADT/kallisto_index/project_CITEseq_kallisto_index"	No default value	NA
`tr2g.file.adt`	absolute path to the tr2g file for the alignment	"/mnt/beegfs/userdata/m_aglave/ADT/kallisto_index/project_CITEseq_tr2gs.txt"	No default value	NA
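
Unlike the gene expression alignment, the ADT index and tr2g file have no default, so both must be given. A sketch under the same layout assumption, using the example values from the table:

```yaml
Alignment_countTable_ADT:
  sample.name.adt: ["sample1_ADT", "sample2_ADT"]
  input.dir.adt: "/mnt/beegfs/userdata/m_aglave/fastq/"
  output.dir.adt: "/mnt/beegfs/userdata/m_aglave/pipeline/output/"
  sctech: "10xv3"
  kindex.adt: "/mnt/beegfs/userdata/m_aglave/ADT/kallisto_index/project_CITEseq_kallisto_index"   # required: no default
  tr2g.file.adt: "/mnt/beegfs/userdata/m_aglave/ADT/kallisto_index/project_CITEseq_tr2gs.txt"     # required: no default
```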

Adding_ADT:

Name	Description	Example	Default value	Possible values
`input.rda.ge`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample1_GE_SCTransform_pca_25_0.5.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample2_GE_SCTransform_pca_25_0.5.rda"]	determined from seurat object output of `Clust_Markers_Annot_GE`	NA
`sample.name.adt`	list of cell surface protein sample names	["sample1_ADT","sample2_ADT"]	determined from `sample.name.adt` of `Alignment_countTable_ADT`	NA
`input.dir.adt`	absolute path to the alignment results folder of cell surface proteins	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_ADT/KALLISTOBUS/"]	determined from `output.dir.adt` of `Alignment_countTable_ADT`	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`gene.names`	list of gene names which correspond to the ADT proteins	"CD3G,CD4,CTLA4"	No default value	NA
`ADT.max.cutoff`	list of maximum quantile cutoffs for protein expression in plots	"q70,q95,q85"	"q95" for each gene in the `gene.names` parameter	NA
`ADT.min.cutoff`	list of minimum quantile cutoffs for protein expression in plots	"q30,q30,q55"	"q30" for each gene in the `gene.names` parameter	NA
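
The sketch below shows how the three comma-separated lists are presumably meant to line up: one cutoff per gene, in the same order as `gene.names`. This positional pairing is an assumption based on the default being repeated once per gene; the layout is assumed as well.

```yaml
Adding_ADT:
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  gene.names: "CD3G,CD4,CTLA4"
  ADT.min.cutoff: "q30,q30,q55"   # assumed order: CD3G, CD4, CTLA4
  ADT.max.cutoff: "q70,q95,q85"   # assumed order: CD3G, CD4, CTLA4
```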

Alignment_annotations_TCR_BCR:

Name	Description	Example	Default value	Possible values
`sample.name.tcr`	list of TCR sample names	["sample1_TCR", "sample2_TCR"]	No default value	NA
`input.dir.tcr`	absolute path to TCR fastq files	"/mnt/beegfs/userdata/m_aglave/fastq/"	No default value	NA
`sample.name.bcr`	list of BCR sample names	["sample1_BCR", "sample2_BCR"]	No default value	NA
`input.dir.bcr`	absolute path to BCR fastq files	"/mnt/beegfs/userdata/m_aglave/fastq/"	No default value	NA
`output.dir.tcr_bcr`	absolute path to output folder	"/mnt/beegfs/userdata/m_aglave/pipeline/"	No default value	NA
`crindex.tcr_bcr`	CellRanger index for vdj analysis	"/mnt/beegfs/database/bioinfo/single-cell/TCR_REFERENCES/refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0"	"/mnt/beegfs/database/bioinfo/single-cell/TCR_REFERENCES/refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0"	NA
`fastqscreen_index`	absolute path to the configuration file of references for fastq-screen alignment	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/FASTQ_SCREEN/0.14.0/fastq_screen.conf"	"/mnt/beegfs/database/bioinfo/single-cell/INDEX/FASTQ_SCREEN/0.14.0/fastq_screen.conf"	NA
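
TCR and BCR samples share one section and one output folder. A sketch with the example values, assuming the same nested layout as the other steps:

```yaml
Alignment_annotations_TCR_BCR:
  sample.name.tcr: ["sample1_TCR", "sample2_TCR"]
  input.dir.tcr: "/mnt/beegfs/userdata/m_aglave/fastq/"
  sample.name.bcr: ["sample1_BCR", "sample2_BCR"]
  input.dir.bcr: "/mnt/beegfs/userdata/m_aglave/fastq/"
  output.dir.tcr_bcr: "/mnt/beegfs/userdata/m_aglave/pipeline/"
  # crindex.tcr_bcr and fastqscreen_index omitted: both have default values in the table
```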

Adding_TCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample1_GE_SCTransform_pca_25_0.5_ADT.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample2_GE_SCTransform_pca_25_0.5_ADT.rda"]	determined from seurat object output of `Adding_ADT` or `Clust_Markers_Annot_GE`	NA
`vdj.input.file.tcr`	file filtered_contig_annotations.csv from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_TCR/sample2_TCR_CellRanger/outs/filtered_contig_annotations.csv"]	determined from `output.dir.tcr_bcr` of `Alignment_annotations_TCR_BCR`	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA

Adding_BCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample1_GE_SCTransform_pca_25_0.5_ADT_TCR.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims25_res0.5/sample2_GE_SCTransform_pca_25_0.5_ADT_TCR.rda"]	determined from seurat object output of `Adding_TCR` or `Adding_ADT` or `Clust_Markers_Annot_GE`	NA
`vdj.input.file.bcr`	file filtered_contig_annotations.csv from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_BCR/sample2_BCR_CellRanger/outs/filtered_contig_annotations.csv"]	determined from `output.dir.tcr_bcr` of `Alignment_annotations_TCR_BCR`	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
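
When the V(D)J alignment is not run in the same configfile, the contig annotation files presumably have to be given explicitly. The sketch below does this for a single sample in both `Adding_TCR` and `Adding_BCR`; the layout and the use of a one-element list are assumptions, and `input.rda` is left to be derived from the previous step.

```yaml
Adding_TCR:
  vdj.input.file.tcr: ["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv"]
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"

Adding_BCR:
  vdj.input.file.bcr: ["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv"]
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
```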

Int_Norm_DimRed_Eval_GE:

Name	Description	Example	Default value	Possible values
`name.int`	list of sample integration names	["samples1_and_2_int_Seurat", "samples1_and_3_int_Seurat"]	No default value	NA
`input.list.rda`	absolute path to the files.rda containing the seurat R objects	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims35_res1.2/sample1_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims33_res0.4/sample2_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims35_res1.2/sample1_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims33_res0.4/sample3_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda"]	No default value	NA
`output.dir.int`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/","/mnt/beegfs/userdata/m_aglave/pipeline/output/"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`eval.markers`	list of genes to evaluate normalization and dimension reduction	"GAPDH,CD4,CD8A,CD24,CTLA4"	No default value	NA
`min.cells`	minimum number of cells that samples must have to be included in the analysis	1000	0	NA
`integration.method`	name of integration method	"Seurat"	No default value	"Seurat","scbfa","Harmony","Liger"
`vtr.batch`	list of batch biases to correct through the integration	"orig.ident"	No default value	all column names in metadata; not needed with `Seurat` integration
`features.n`	number of highly variable genes to consider	3000	2000	NA
`norm.method`	name of normalization method	"LogNormalize"	"SCTransform"	"LogNormalize","SCTransform" or NULL with `Seurat` integration
`dimred.method`	name of dimension reduction method	"scbfa"	"pca"	"scbfa","bpca","pca","mds"
`vtr.biases`	list of biases to regress	"nFeature_RNA,percent_mt"	No default value	percent_mt, percent_rb, nFeature_RNA, percent_st, Cyclone.Phase, and all other column names in metadata
`vtr.scale`	boolean: center the biases to regress (for scbfa and bpca only)	FALSE	FALSE	TRUE,FALSE
`dims.max`	maximum number of dimensions to compute	100	50	NA
`dims.min`	minimum number of dimensions to compute	10	3	NA
`dims.steps`	step between the numbers of dimensions to compute for evaluation	3	2	NA
`res.max`	maximum resolution to compute for evaluation	3	1.2	NA
`res.min`	minimum resolution to compute for evaluation	0.1	0.1	NA
`res.steps`	step between the resolutions to compute for evaluation	0.2	0.1	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
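
In the integration steps, each list element describes one integration, and the objects to integrate are separated by commas inside a single string. The fragment below sketches one integration of two samples, reusing the first example from the table; the nested layout is an assumption.

```yaml
Int_Norm_DimRed_Eval_GE:
  name.int: ["samples1_and_2_int_Seurat"]
  # one string per integration; the .rda files to integrate are comma-separated inside it
  input.list.rda: ["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims35_res1.2/sample1_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/SCTransform/pca/dims33_res0.4/sample2_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda"]
  output.dir.int: ["/mnt/beegfs/userdata/m_aglave/pipeline/output/"]
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  integration.method: "Seurat"   # per the table, vtr.batch is not needed with this method
```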

Int_Clust_Markers_Annot_GE:

Name	Description	Example	Default value	Possible values
`name.int`	list of sample integration names	["samples1_and_2_int_Seurat", "samples1_and_3_int_Seurat"]	determined from `name.int` of `Int_Norm_DimRed_Eval_GE` if it exists	NA
`input.rda.int`	absolute path to the integrated, normalized and reduced seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_2_int_Seurat/SCTransform/pca/samples1_and_2_int_Seurat_SCTransform_pca.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_3_int_Seurat/SCTransform/pca/samples1_and_3_int_Seurat_SCTransform_pca.rda"]	determined from seurat object output of `Int_Norm_DimRed_Eval_GE` if it exists	NA
`output.dir.int`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_2_int_Seurat/SCTransform/pca/","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_3_int_Seurat/SCTransform/pca/"]	determined from `output.dir.int` of `Int_Norm_DimRed_Eval_GE` if it exists	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`markfile`	genes to plot on umap	"/mnt/beegfs/userdata/m_aglave/pipeline/markers1.xlsx,/mnt/beegfs/userdata/m_aglave/pipeline/markers2.xlsx"	No default value	see Additional files of Configuration in this wiki
`keep.dims`	number of dimensions to keep for clustering (from 0 to keep.dims)	25	No default value	NA
`keep.res`	resolution value for clustering	0.5	No default value	NA
`cfr.minscore`	minimum correlation score for clustifyr to consider	0.40	0.35	NA
`sr.minscore`	minimum correlation score for SingleR to consider	0.20	0.25	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA

Int_Adding_ADT:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_2_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_2_int_Seurat_SCTransform_pca_25_0.5.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_3_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_3_int_Seurat_SCTransform_pca_25_0.5.rda"]	determined from seurat object output of `Int_Clust_Markers_Annot_GE`	NA
`samples.name.adt`	list of cell surface protein sample names	["sample1_ADT,sample2_ADT","sample1_ADT,sample3_ADT"]	No default value	NA
`input.dirs.adt`	absolute path to the alignment results folder of cell surface proteins	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_ADT/KALLISTOBUS/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_ADT/KALLISTOBUS/"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`gene.names`	list of gene names which correspond to the ADT proteins	"CD3G,CD4,CTLA4"	No default value	NA
`ADT.max.cutoff`	list of maximum quantile cutoffs for protein expression in plots	"q70,q95,q85"	"q95" for each gene in the `gene.names` parameter	NA
`ADT.min.cutoff`	list of minimum quantile cutoffs for protein expression in plots	"q30,q30,q55"	"q30" for each gene in the `gene.names` parameter	NA
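
Note that the key names are plural here (`samples.name.adt`, `input.dirs.adt`), unlike in `Adding_ADT`, and that they have no default. A sketch for a single integration, where the ADT samples and folders are comma-separated inside one string; the layout is assumed and the values come from the table.

```yaml
Int_Adding_ADT:
  # one string per integration, listing the ADT samples that belong to it
  samples.name.adt: ["sample1_ADT,sample2_ADT"]
  input.dirs.adt: ["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_ADT/KALLISTOBUS/"]
  author.name: "marine_aglave"
  author.mail: "monmail@gustaveroussy.fr"
  gene.names: "CD3G,CD4,CTLA4"
```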

Int_Adding_TCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_2_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_2_int_Seurat_SCTransform_pca_25_0.5_ADT.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_3_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_3_int_Seurat_SCTransform_pca_25_0.5_ADT.rda"]	determined from seurat object output of `Int_Adding_ADT` or `Int_Clust_Markers_Annot_GE`	NA
`vdj.input.files.tcr`	file filtered_contig_annotations.csv from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_TCR/sample2_TCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_TCR/sample3_TCR_CellRanger/outs/filtered_contig_annotations.csv"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA

Int_Adding_BCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_2_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_2_int_Seurat_SCTransform_pca_25_0.5_ADT_TCR.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/INTEGRATED/samples1_and_3_int_Seurat/SCTransform/pca/dims25_res0.5/samples1_and_3_int_Seurat_SCTransform_pca_25_0.5_ADT_TCR.rda"]	determined from seurat object output of `Int_Adding_TCR` or `Int_Adding_ADT` or `Int_Clust_Markers_Annot_GE`	NA
`vdj.input.files.bcr`	file filtered_contig_annotations.csv from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_BCR/sample2_BCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_BCR/sample3_BCR_CellRanger/outs/filtered_contig_annotations.csv"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA

Grp_Norm_DimRed_Eval_GE:

Name	Description	Example	Default value	Possible values
`name.grp`	list of sample group names	["samples1_and_2_grp_keep", "samples1_and_3_grp_keep"]	No default value	NA
`input.list.rda`	absolute path to the files.rda containing the seurat R objects	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims35_res1.2/sample1_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims33_res0.4/sample2_GE_SCTransform_pca_33_0.4_ADT_TCR_BCR.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims35_res1.2/sample1_GE_SCTransform_pca_35_1.2_ADT_TCR_BCR.rda,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_GE/F200_C1000_M0-0.2_R0-1_G5/DOUBLETSFILTER_all/NORMKEPT/pca/dims33_res0.4/sample3_GE_NORMKEPT_pca_33_0.4_ADT_TCR_BCR.rda"]	No default value	NA
`output.dir.grp`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/","/mnt/beegfs/userdata/m_aglave/pipeline/output/"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`eval.markers`	list of genes to evaluate normalization and dimension reduction	"GAPDH,CD4,CD8A,CD24,CTLA4"	No default value	NA
`min.cells`	minimum number of cells that samples must have to be included in the analysis	1000	0	NA
`keep.norm`	whether the individual normalization must be kept	FALSE	FALSE	TRUE,FALSE
`features.n`	number of Highly Variable Genes to consider	3000	2000	NA
`norm.method`	name of the normalization method	"LogNormalize"	"SCTransform" if `keep.norm` is set to FALSE, else NULL	"LogNormalize","SCTransform",NULL
`dimred.method`	name of the dimension reduction method	"scbfa"	"pca"	"scbfa","bpca","pca","mds"
`vtr.biases`	list of biases to regress	"nFeature_RNA,percent_mt"	No default value	percent_mt, percent_rb, nFeature_RNA, percent_st, Cyclone.Phase, and all other column names in metadata
`vtr.scale`	boolean: whether to center the biases to regress (for scbfa and bpca only)	FALSE	FALSE	TRUE,FALSE
`dims.max`	maximum number of dimensions to compute	100	50	NA
`dims.min`	minimum number of dimensions to compute	10	3	NA
`dims.steps`	step between the numbers of dimensions computed for evaluation	3	2	NA
`res.max`	maximum resolution to compute for evaluation	3	1.2	NA
`res.min`	minimum resolution to compute for evaluation	0.1	0.1	NA
`res.steps`	step between the resolutions computed for evaluation	0.2	0.1	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
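
As a minimal sketch, the table above could translate into the following block of the parameter file. The values come from the Example column, except the paths, which are shortened placeholders (not real locations). The layout of the list values (one list element per grouped analysis, samples separated by commas inside an element) is inferred from the examples in the table and should be checked against your own setup. Because `keep.norm` is TRUE here, `norm.method` is set to NULL, as described in the Notes.

```yaml
Grp_Norm_DimRed_Eval_GE:
  # One list element per grouped analysis; inside an element,
  # the files of the samples to group are separated by commas.
  name.grp : ["samples1_and_2_grp_keep"]
  input.list.rda : ["/path/to/sample1_GE.rda,/path/to/sample2_GE.rda"]  # placeholder paths
  output.dir.grp : ["/path/to/output/"]                                  # placeholder path
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
  eval.markers : "GAPDH,CD4,CD8A,CD24,CTLA4"
  keep.norm : TRUE
  norm.method : NULL          # individual normalizations are kept
  dimred.method : "pca"
  vtr.biases : "nFeature_RNA,percent_mt"
  # Dimensions and resolutions explored for the evaluation
  dims.min : 10
  dims.max : 100
  dims.steps : 3
  res.min : 0.1
  res.max : 3
  res.steps : 0.2
```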

Grp_Clust_Markers_Annot_GE:

Name	Description	Example	Default value	Possible values
`name.grp`	list of names of the grouped analyses	["samples1_and_2_grp_keep", "samples1_and_3_grp_keep"]	determined from `name.grp` of `Grp_Norm_DimRed_Eval_GE` if it exists	NA
`input.rda.grp`	absolute path to the grouped, normalized and reduced Seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/samples1_and_2_grp_keep_NORMKEPT_pca.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_3_grp_keep/NORMKEPT/pca/samples1_and_3_grp_keep_NORMKEPT_pca.rda"]	determined from the Seurat object output of `Grp_Norm_DimRed_Eval_GE` if it exists	NA
`output.dir.grp`	absolute path to output folder	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_3_grp_keep/NORMKEPT/pca/"]	determined from `output.dir.grp` of `Grp_Norm_DimRed_Eval_GE` if it exists	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`markfile`	genes to plot on the UMAP	"/mnt/beegfs/userdata/m_aglave/pipeline/markers1.xlsx,/mnt/beegfs/userdata/m_aglave/pipeline/markers2.xlsx"	No default value	see Additional files of Configuration in this wiki
`keep.dims`	number of dimensions to keep for clustering (from 0 to keep.dims)	25	No default value	NA
`keep.res`	resolution value for clustering	0.5	No default value	NA
`cfr.minscore`	minimum correlation score for clustifyr to consider	0.40	0.35	NA
`sr.minscore`	minimum correlation score for SingleR to consider	0.20	0.25	NA
`metadata.file`	csv file with the metadata to add in the seurat object	"/mnt/beegfs/userdata/m_aglave/pipeline/meta1.csv,/mnt/beegfs/userdata/m_aglave/pipeline/meta2.csv"	No default value	NA
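
The following fragment is a sketch of this step when it runs right after `Grp_Norm_DimRed_Eval_GE` in the same parameter file. Under that assumption, `name.grp`, `input.rda.grp` and `output.dir.grp` are left out because the table says they are determined from the previous step. The markfile path is a placeholder; the other values come from the Example column.

```yaml
Grp_Clust_Markers_Annot_GE:
  # name.grp, input.rda.grp and output.dir.grp are omitted on purpose:
  # they are expected to be deduced from Grp_Norm_DimRed_Eval_GE.
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
  markfile : "/path/to/markers.xlsx"   # placeholder path
  keep.dims : 25        # chosen after inspecting the evaluation of the previous step
  keep.res : 0.5
  cfr.minscore : 0.40   # clustifyr threshold
  sr.minscore : 0.20    # SingleR threshold
```

If the step is run on its own, the three omitted parameters would have to be given explicitly.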

Grp_Adding_ADT:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_2_grp_keep_NORMKEPT_pca_25_0.5.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_3_grp_keep_NORMKEPT_pca_25_0.5.rda"]	determined from seurat object output of `Grp_Clust_Markers_Annot_GE`	NA
`samples.name.adt`	list of samples names of cell surface proteins	["sample1_ADT,sample2_ADT","sample1_ADT,sample3_ADT"]	No default value	NA
`input.dirs.adt`	absolute path to the alignment results folder of cell surface proteins	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_ADT/KALLISTOBUS/","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_ADT/KALLISTOBUS/,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_ADT/KALLISTOBUS/"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
`gene.names`	list of gene names which correspond to the ADT proteins	"CD3G,CD4,CTLA4"	No default value	NA
`ADT.max.cutoff`	list of maximum quantile cutoffs applied to protein expression in plots	"q70,q95,q85"	"q95" for each gene in the "gene.names" parameter	NA
`ADT.min.cutoff`	list of minimum quantile cutoffs applied to protein expression in plots	"q30,q30,q55"	"q30" for each gene in the "gene.names" parameter	NA
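
To illustrate how the three comma-separated lists line up, here is a sketch of the block for a single grouped analysis of two samples. The values are those of the Example column; the directories are placeholders. It assumes `input.rda` is deduced from `Grp_Clust_Markers_Annot_GE`, so it is not given.

```yaml
Grp_Adding_ADT:
  samples.name.adt : ["sample1_ADT,sample2_ADT"]
  input.dirs.adt : ["/path/to/sample1_ADT/KALLISTOBUS/,/path/to/sample2_ADT/KALLISTOBUS/"]  # placeholder paths
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
  # The three lists below are read position by position:
  #   CD3G  -> max q70, min q30
  #   CD4   -> max q95, min q30
  #   CTLA4 -> max q85, min q55
  gene.names : "CD3G,CD4,CTLA4"
  ADT.max.cutoff : "q70,q95,q85"
  ADT.min.cutoff : "q30,q30,q55"
```

Leaving out both cutoff parameters would fall back to the defaults in the table, that is q95 and q30 for every gene.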

Grp_Adding_TCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_2_grp_keep_NORMKEPT_pca_25_0.5_ADT.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_3_grp_keep_NORMKEPT_pca_25_0.5_ADT.rda"]	determined from seurat object output of `Grp_Adding_ADT` or `Grp_Clust_Markers_Annot_GE`	NA
`vdj.input.files.tcr`	filtered_contig_annotations.csv files from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_TCR/sample2_TCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_TCR/sample1_TCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_TCR/sample3_TCR_CellRanger/outs/filtered_contig_annotations.csv"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA
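
A short sketch of this block with `input.rda` given explicitly, for two grouped analyses. All paths are shortened placeholders. The point it tries to show is the ordering: each list element corresponds to one grouped analysis, and inside an element the TCR files are listed in the same order as the GE samples of that analysis.

```yaml
Grp_Adding_TCR:
  # Element 1: samples 1 and 2; element 2: samples 1 and 3 (placeholder paths)
  input.rda : ["/path/to/samples1_and_2_grp_keep.rda", "/path/to/samples1_and_3_grp_keep.rda"]
  vdj.input.files.tcr : ["/path/to/sample1_TCR/filtered_contig_annotations.csv,/path/to/sample2_TCR/filtered_contig_annotations.csv", "/path/to/sample1_TCR/filtered_contig_annotations.csv,/path/to/sample3_TCR/filtered_contig_annotations.csv"]
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
```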

Grp_Adding_BCR:

Name	Description	Example	Default value	Possible values
`input.rda`	absolute path to the seurat object (in .rda format)	["/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_2_grp_keep_NORMKEPT_pca_25_0.5_ADT_TCR.rda","/mnt/beegfs/userdata/m_aglave/pipeline/output/GROUPED_ANALYSIS/NO_INTEGRATED/samples1_and_2_grp_keep/NORMKEPT/pca/dims25_res0.5/samples1_and_3_grp_keep_NORMKEPT_pca_25_0.5_ADT_TCR.rda"]	determined from seurat object output of `Grp_Adding_TCR` or `Grp_Adding_ADT` or `Grp_Clust_Markers_Annot_GE`	NA
`vdj.input.files.bcr`	filtered_contig_annotations.csv files from the CellRanger alignment pipeline	["/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample2_BCR/sample2_BCR_CellRanger/outs/filtered_contig_annotations.csv","/mnt/beegfs/userdata/m_aglave/pipeline/output/sample1_BCR/sample1_BCR_CellRanger/outs/filtered_contig_annotations.csv,/mnt/beegfs/userdata/m_aglave/pipeline/output/sample3_BCR/sample3_BCR_CellRanger/outs/filtered_contig_annotations.csv"]	No default value	NA
`author.name`	name of the analysis author	"marine_aglave"	No default value	NA
`author.mail`	mail of the analysis author	"monmail@gustaveroussy.fr"	No default value	NA

Notes:

If a parameter has no default value, it is mandatory.
sctech is shared by the GE and ADT steps: use 10xv2 if you also have TCR or BCR files, otherwise 10xv3.
The str.genes.file parameter of the Droplets_QC_GE step corresponds to a list of mechanical stress genes from the thesis of Léo Machado entitled « From skeletal muscle stem cells to tissue atlases: new tools to investigate and circumvent dissociation-induced stress », 2019.
If you use SCTransform and scbfa the vtr.biases will be used in both methods (reminder: scbfa uses non-normalized uncorrected counts).
The dimension and resolution parameters depend on sample complexity and number of cells.
The index for ADT alignment must be specific to the antibodies used. It can be built with the kb-python tool, which is available through conda.
The list of gene names in the gene.names parameter of the Adding_ADT step must be in the same order as the protein names in the index, to keep the correspondence. The same applies to the Int_Adding_ADT and Grp_Adding_ADT steps.
The Droplets_QC_GE, Filtering_GE, Norm_DimRed_Eval_GE, Clust_Markers_Annot_GE, Adding_ADT, Adding_TCR, Adding_BCR and Cerebro steps are processed in this order. The input/output parameters of a step are automatically determined from the step before it. For example, if there is no Adding_ADT step, the parameters of the Adding_TCR step are determined from the Clust_Markers_Annot_GE step; if there are no Adding_ADT, Adding_TCR and Adding_BCR steps, the parameters of the Cerebro step are determined from the Clust_Markers_Annot_GE step. The same applies to Int_Norm_DimRed_Eval_GE, Int_Clust_Markers_Annot_GE, Int_Adding_ADT, Int_Adding_TCR, Int_Adding_BCR and Cerebro, and to Grp_Norm_DimRed_Eval_GE, Grp_Clust_Markers_Annot_GE, Grp_Adding_ADT, Grp_Adding_TCR, Grp_Adding_BCR and Cerebro.
You can comment out parameters in your file with # if you want to keep them in the file without using them.
For integration by Seurat, for integration by Liger, or for grouped analysis with keep.norm set to TRUE, norm.method must be set to NULL, because in these 3 cases the individual normalizations are kept. If you set a value other than NULL, the script will automatically reset it to NULL.
For integration by Seurat, batch.vtr must be set to NULL, because this option specifies the batch effect(s) to correct only for integrations by Liger, scbfa or Harmony. If you set a value other than NULL, the script will automatically reset it to NULL.
For integration by Harmony, the authors of the method advise a common normalization by SCTransform and a dimension reduction by pca, but you can test other normalization and dimension reduction methods if you wish.
For integration by Liger, the dimension reduction parameter must be set to Liger; otherwise it will automatically be corrected to Liger. The same applies to scbfa.
The lists of sample names and files in the Adding_ADT, Adding_TCR and Adding_BCR steps must be in the same order as the samples in the GE steps, to keep the correspondence. The same applies to the Int_Adding_TCR, Int_Adding_BCR, Grp_Adding_TCR and Grp_Adding_BCR steps with respect to the order of the _GE samples.
The Seurat objects used as input for an integrated/grouped analysis can be those generated at the end of the Norm_DimRed_Eval_GE step (for integration by Seurat, for integration by Liger, or for grouped analysis with keep.norm set to TRUE), or those generated at the end of the Filtering_GE step in the other cases (where the individual normalization does not need to be kept).
It is possible to make a single parameter file which combines an integrated analysis and a grouped analysis. The Cerebro step will automatically identify the output seurat (.rda) objects to use for both cases.
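
Two of the notes above (chaining of steps and commenting with #) can be sketched together in one fragment. It assumes that the Grp_ step names are accepted as values of `steps`, uses placeholder paths, and leaves out parameters that are not relevant to the point, so it is not a complete parameter file.

```yaml
# Assumption: the Grp_ step names can be listed in "steps".
steps: ["Grp_Clust_Markers_Annot_GE", "Grp_Adding_TCR"]

Grp_Clust_Markers_Annot_GE:
  name.grp : ["samples1_and_2_grp_keep"]
  input.rda.grp : ["/path/to/samples1_and_2_grp_keep_NORMKEPT_pca.rda"]  # placeholder path
  output.dir.grp : ["/path/to/output/"]                                   # placeholder path
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
  keep.dims : 25
  keep.res : 0.5

# The ADT parameters are kept in the file for later but commented out,
# so they are not used in this run.
# Grp_Adding_ADT:
#   samples.name.adt : ["sample1_ADT,sample2_ADT"]
#   gene.names : "CD3G,CD4,CTLA4"

Grp_Adding_TCR:
  # input.rda is omitted: with no ADT step in "steps", it is expected to be
  # determined from the output of Grp_Clust_Markers_Annot_GE.
  vdj.input.files.tcr : ["/path/to/sample1_TCR/filtered_contig_annotations.csv,/path/to/sample2_TCR/filtered_contig_annotations.csv"]
  author.name : "marine_aglave"
  author.mail : "monmail@gustaveroussy.fr"
```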

Example of main parameters file:

steps: ["Alignment_countTable_GE"]

Alignment_countTable_GE:
  sample.name.ge : ["0732M_GE"]
  input.dir.ge : '/mnt/beegfs/userdata/m_aglave/fastq/'
  output.dir.ge : '/mnt/beegfs/userdata/m_aglave/pipeline/'
  sctech : '10xv2'
  kindex.ge : '/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir.kidx'
  tr2g.file.ge : '/mnt/beegfs/database/bioinfo/single-cell/INDEX/KB-python_KALLISTO/0.24.4_0.46.2/homo_sapiens/GRCh38/Ensembl/r99/cDNA_LINCs_MIRs/GRCH38_r99_cDNA_linc_mir_tr2gs.txt'
  reference.txt: 'Ensembl reference transcriptome v99 corresponding to the homo sapiens GRCH38 build'
