A repository of pipelines for single-cell data in Nextflow DSL2. Here updated to make SCENIC single- and multi-run functional with modern v2 feather files


HowieJM/vsn-pipelines

VSN-Pipelines

## 2024-09-03

# Fork Notes:

The upstream repository has been archived, and its most recent version does not run with the most up-to-date motif files. I therefore produced this fork, which can run the SCENIC module of the VSN pipeline in both single-run and multi-run modes. To do this, I borrowed two fixes from the ccasar/vsn-pipelines fork; these allow the pipeline to run in single-run mode when skipReports = true is set in the config. One further tweak of my own allows the multi-run aggregation to function. These are small changes, but they can be tricky to identify, so hopefully they will save time for people who want to use SCENIC multi-run mode with aggregation. Key setup options are noted below.

---

# Run Notes:

To run, create an environment and install the following:

  • Singularity: 3.8.6
  • Nextflow: 21.04.03 (crucial)
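
As a quick sanity check of the versions listed above, a small helper along these lines could be used. This is only a sketch: `report_tool` is a hypothetical helper name, and it simply prints whatever each tool reports so you can compare it against the expected version by eye.

```shell
# Print the installed version of each tool, or a note if it is not on PATH.
# report_tool is a hypothetical helper, not part of the pipeline.
report_tool() {
    # $1 = tool name, $2 = version flag, $3 = version expected by this fork
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found (expected $3):"
        "$1" "$2" 2>&1 | sed 's/^/    /'
    else
        echo "$1 not found on PATH (expected $3)"
    fi
}

report_tool singularity --version 3.8.6
report_tool nextflow -version 21.04.03
```

The helper does not compare version strings itself, because the exact output format differs between tools.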

Then export these variables, checking before and after:

locale
export LANG="C"
export LC_ALL="C"
locale
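
The before-and-after check can also be done programmatically. The following is a minimal sketch, assuming you want a non-interactive confirmation that both variables are exactly "C"; `check_locale` is a hypothetical helper name.

```shell
export LANG="C"
export LC_ALL="C"

# check_locale is a hypothetical helper: it succeeds only if both
# LANG and LC_ALL are set to exactly "C".
check_locale() {
    [ "${LANG:-}" = "C" ] && [ "${LC_ALL:-}" = "C" ]
}

if check_locale; then
    echo "locale OK: LANG=$LANG LC_ALL=$LC_ALL"
else
    echo "locale not set as expected: LANG=${LANG:-unset} LC_ALL=${LC_ALL:-unset}" >&2
fi
```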

After this, pull the fork:

nextflow pull HowieJM/vsn-pipelines -r master
ls -l ~/.nextflow/assets/HowieJM/vsn-pipelines

Make the Config file:

nextflow config HowieJM/vsn-pipelines \
   -profile scenic,scenic_multiruns,scenic_use_cistarget_motifs,scenic_use_cistarget_tracks,hg38,singularity > nf_CPUopt-Real-MultiRun.config

Then edit the config:

container = 'aertslab/pyscenic_scanpy:0.12.0_1.9.1'  // crucial: 0.12.1_1.9.1 also runs, but on Linux it can lead to low multicore usage; 0.12.0 allows full use of the cores
skipReports = true   // crucial for up-to-date feather files for the motifs/tracks
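
Before launching, it may be worth confirming that both edits actually made it into the config. The sketch below greps the generated file for the two settings. `check_scenic_config` is a hypothetical helper, and the check is deliberately naive: it would also match a commented-out line, so treat it as a reminder rather than a validator.

```shell
# check_scenic_config is a hypothetical helper, not part of the pipeline.
# Returns 0 if both settings are found, 1 if one is missing, 2 if the file is absent.
check_scenic_config() {
    cfg="$1"
    status=0
    if [ ! -f "$cfg" ]; then
        echo "config not found: $cfg" >&2
        return 2
    fi
    # skipReports must be true (whitespace around '=' is tolerated)
    if ! grep -Eq 'skipReports[[:space:]]*=[[:space:]]*true' "$cfg"; then
        echo "skipReports is not set to true in $cfg" >&2
        status=1
    fi
    # the pySCENIC container tag recommended above
    if ! grep -Fq 'aertslab/pyscenic_scanpy:0.12.0_1.9.1' "$cfg"; then
        echo "container is not aertslab/pyscenic_scanpy:0.12.0_1.9.1 in $cfg" >&2
        status=1
    fi
    return "$status"
}

check_scenic_config nf_CPUopt-Real-MultiRun.config || echo "fix the config before running"
```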

Run the pipeline:

nextflow -C nf_CPUopt-Real-MultiRun.config run HowieJM/vsn-pipelines -entry scenic -r master
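
Long SCENIC runs are sometimes interrupted, so a thin wrapper around the command above can be convenient. This is a sketch under stated assumptions: `run_scenic` and the `DRY_RUN` variable are hypothetical, and the `-resume` option (a standard Nextflow run option that reuses cached results) is an addition that has not been tested with this fork.

```shell
CONFIG="nf_CPUopt-Real-MultiRun.config"

# run_scenic is a hypothetical wrapper. With DRY_RUN=1 (the default here)
# it only prints the command; set DRY_RUN=0 to actually launch the pipeline.
run_scenic() {
    cfg="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo nextflow -C "$cfg" run HowieJM/vsn-pipelines -entry scenic -r master -resume
    else
        nextflow -C "$cfg" run HowieJM/vsn-pipelines -entry scenic -r master -resume
    fi
}

run_scenic "$CONFIG"
```

Printing the command first makes it easy to confirm the config name and entry point before committing to a long run.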

You can run the VSN-pipeline implementation of pySCENIC in single-run mode or in multiple-run with aggregation.

# Further Notes:

If skipReports = false, the run will fail. Re-introducing the reports would require, at a minimum, edits to vsn-pipelines/src/scenic/bin/reports/scenic_report.ipynb. I have not looked into this, but if you are interested, see the ccasar/vsn-pipelines fork for one attempt at a fix. Here, we simply set skipReports = true as a hot fix.

JMH, 3 Sept 2024 [+edits 30 Sept 2024]

---

VSN-Pipelines has now been archived

2023-04-19 - Unfortunately due to lack of developers, VSN-pipelines is no longer being worked on and has been archived. The repo will remain in read-only mode from this point on.

A repository of pipelines for single-cell data analysis in Nextflow DSL2.

Full documentation is available on Read the Docs, or take a look at the Quick Start guide.

This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into subfolders within the src/ directory. The VIB-Singlecell-NF organization contains this main repo along with a collection of example runs (VSN-Pipelines-examples). Currently available workflows are listed below.

If VSN-Pipelines is useful for your research, consider citing:

Raw Data Processing Workflows

These are set up to run Cell Ranger and DropSeq pipelines.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| cellranger | Process 10x Chromium data | cellranger |
| demuxlet_freemuxlet | Demultiplexing | demuxlet_freemuxlet |
| nemesh | Process Drop-seq data | nemesh |

Single Sample Workflows

The Single Sample Workflows perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| single_sample | Independent samples | Single-sample Pipeline |
| single_sample_scenic | Ind. samples + SCENIC | Single-sample SCENIC Pipeline |
| scenic | SCENIC GRN inference | SCENIC Pipeline |
| scenic_multiruns | SCENIC run multiple times | SCENIC Multi-runs Pipeline |
| single_sample_scenic_multiruns | Ind. samples + multi-SCENIC | Single-sample SCENIC Multi-runs Pipeline |
| single_sample_scrublet | Ind. samples + Scrublet | Single-sample Scrublet Pipeline |
| decontx | DecontX | DecontX Pipeline |
| single_sample_decontx | Ind. samples + DecontX | Single-sample DecontX Pipeline |
| single_sample_decontx_scrublet | Ind. samples + DecontX + Scrublet | Single-sample DecontX Scrublet Pipeline |

Sample Aggregation Workflows

The Sample Aggregation Workflows perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include BBKNN, mnnCorrect, and Harmony.

| Pipeline / Entrypoint | Purpose | Documentation |
| --- | --- | --- |
| bbknn | Sample aggregation + BBKNN | BBKNN Pipeline |
| bbknn_scenic | BBKNN + SCENIC | BBKNN SCENIC Pipeline |
| harmony | Sample aggregation + Harmony | Harmony Pipeline |
| harmony_scenic | Harmony + SCENIC | Harmony SCENIC Pipeline |
| mnncorrect | Sample aggregation + mnnCorrect | MNN-correct Pipeline |

In addition, the pySCENIC implementation of the SCENIC workflow is integrated here and can be run in conjunction with any of the above workflows. The output of each of the main workflows is a loom-format file, which is ready for import into the interactive single-cell web visualization tool SCope. In addition, data is also output in h5ad format, and reports are generated for the major pipeline steps.

scATAC-seq workflows

Single-cell ATAC-seq processing steps are now included in VSN-Pipelines. Currently, a preprocessing workflow is available, which takes fastq inputs, applies barcode correction, read trimming, and bwa mapping, and outputs bam and fragments files for further downstream analysis. See the full documentation for details.
