GitHub - SFGLab/nf-hichip: Pipeline for processing HiChIP data.

nf-HiChIP Pipeline

Introduction

We have developed nf-HiChIP, a pipeline that combines the ChIP-seq analytical approach (mapping, filtering, peak calling, and coverage track calculation) with HiChIP-specific analysis (the MAPS pipeline; Juric et al.). It lets users analyze multiple HiChIP datasets simultaneously, thoroughly and efficiently, without requiring additional ChIP-seq experiments. The workflow is based on the reference implementation of the method designed by Zofia Tojek. The original version is available here.

Working with nf-HiChIP pipeline

Step 0. (optional)

You can familiarize yourself with the Nextflow options.
The -resume flag lets you restart the pipeline from the last successful step.
For more details, see the Nextflow documentation.

Step 1.

A Docker image is available:

https://hub.docker.com/repository/docker/mateuszchilinski/hichip-nf-pipeline/general

Command to run the Docker image (use -v to bind the folder with your data):

docker run -v /path_to_your_data/:/data_in_container/ -it mateuszchilinski/hichip-nf-pipeline:latest bash

Step 2.

Required Files for Reference Folder (Total 6 files) -

1. Reference FASTA file -
    > Homo_sapiens_assembly38.fasta

2. BWA Reference Index files -
    > Homo_sapiens_assembly38.fasta.amb
    > Homo_sapiens_assembly38.fasta.ann
    > Homo_sapiens_assembly38.fasta.bwt
    > Homo_sapiens_assembly38.fasta.pac
    > Homo_sapiens_assembly38.fasta.sa
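
Before starting a run, it can help to confirm that all six files listed above sit next to each other. The following is a minimal sketch; `check_reference` is a hypothetical helper that is not part of nf-HiChIP, and the path in the usage comment is a placeholder.

```shell
# check_reference (hypothetical helper): verify that the FASTA file and its
# five BWA index files (.amb, .ann, .bwt, .pac, .sa) all exist.
# Prints each missing file to stderr and returns non-zero if any is absent.
check_reference() {
    fasta=$1
    missing=0
    for suffix in "" .amb .ann .bwt .pac .sa; do
        if [ ! -f "${fasta}${suffix}" ]; then
            echo "missing: ${fasta}${suffix}" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Usage (placeholder path):
# check_reference /path_to_reference/Homo_sapiens_assembly38.fasta && echo "reference complete"
```

The function only tests for the presence of the files; it does not verify that the index was built from the same FASTA.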

Step 3.

Example 1 for design.csv file

If you do not have raw and processed results (narrow peaks) from the ChIP-Seq experiment

sample	fastq_1	fastq_2	replicate	chipseq
S1	/data/SAMPLE1_1_R1.fastq.gz	/data/SAMPLE1_1_R2.fastq.gz	1	None
S1	/data/SAMPLE1_2_R1.fastq.gz	/data/SAMPLE1_2_R2.fastq.gz	2	None
S2	/data/SAMPLE2_1_R1.fastq.gz	/data/SAMPLE2_1_R2.fastq.gz	1	None
S2	/data/SAMPLE2_2_R1.fastq.gz	/data/SAMPLE2_2_R2.fastq.gz	2	None

Note -

Enter "None" (with a capital N) in the last column.
In this case, pseudo-ChIP-Seq data will be generated from the HiChIP data.
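
Because the capitalization matters, a quick check of the last column can catch variants such as "none" or "NONE" before the run. This is a sketch only: `find_bad_none` is a hypothetical helper, and it assumes the design file separates fields with commas or tabs and that the chipseq column is the last one, as in the example above.

```shell
# find_bad_none (hypothetical helper): print every data row whose last column
# is a miscapitalized variant of "None". Returns non-zero if any is found.
# Assumption: fields are separated by commas or tabs.
find_bad_none() {
    awk -v FS='[,\t]' '
        NR > 1 && tolower($NF) == "none" && $NF != "None" {
            printf "line %d: use \"None\", found \"%s\"\n", NR, $NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Usage:
# find_bad_none design.csv || echo "fix the rows listed above"
```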

Example 2 for design.csv file

If you have processed ChIP-Seq experiment results (in the form of narrow peaks)

sample	fastq_1	fastq_2	replicate	chipseq
S1	/data/SAMPLE1_1_R1.fastq.gz	/data/SAMPLE1_1_R2.fastq.gz	1	/data/SAMPLE1.narrowPeak
S1	/data/SAMPLE1_2_R1.fastq.gz	/data/SAMPLE1_2_R2.fastq.gz	2	/data/SAMPLE1.narrowPeak
S2	/data/SAMPLE2_1_R1.fastq.gz	/data/SAMPLE2_1_R2.fastq.gz	1	/data/SAMPLE2.narrowPeak
S2	/data/SAMPLE2_2_R1.fastq.gz	/data/SAMPLE2_2_R2.fastq.gz	2	/data/SAMPLE2.narrowPeak

Note -

Remember, the pipeline requires chr-prefixed chromosome names (e.g., chr1, chr14, chr21) in the narrowPeak file.
Ensure that peak files follow this naming convention and the BED6+4 format.
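
Both requirements can be checked with a short awk filter. The sketch below is not part of the pipeline: `check_narrowpeak` is a hypothetical helper, and it assumes whitespace-separated columns with no header or track lines. BED6+4 means 10 columns in total.

```shell
# check_narrowpeak (hypothetical helper): report lines that do not have
# 10 columns (BED6+4) or whose chromosome name lacks the "chr" prefix.
# Returns non-zero if any problem is found.
check_narrowpeak() {
    awk '
        NF != 10     { printf "line %d: expected 10 columns, found %d\n", NR, NF; bad = 1; next }
        $1 !~ /^chr/ { printf "line %d: chromosome \"%s\" lacks the chr prefix\n", NR, $1; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Usage (path taken from the design file example):
# check_narrowpeak /data/SAMPLE1.narrowPeak && echo "peak file looks fine"
```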

Example 3 for design.csv file

If you have raw ChIP-Seq data but the peaks have not been called yet

sample	fastq_1	fastq_2	input_1	input_2	replicate
S1	/data/SAMPLE1_1_R1.fastq.gz	/data/SAMPLE1_1_R2.fastq.gz	/data/SAMPLE1_INPUT_R1.fastq.gz	/data/SAMPLE1_INPUT_R2.fastq.gz	1
S1	/data/SAMPLE1_2_R1.fastq.gz	/data/SAMPLE1_2_R2.fastq.gz	/data/SAMPLE1_INPUT_R1.fastq.gz	/data/SAMPLE1_INPUT_R2.fastq.gz	2
S2	/data/SAMPLE2_1_R1.fastq.gz	/data/SAMPLE2_1_R2.fastq.gz	/data/SAMPLE2_INPUT_R1.fastq.gz	/data/SAMPLE2_INPUT_R2.fastq.gz	1
S2	/data/SAMPLE2_2_R1.fastq.gz	/data/SAMPLE2_2_R2.fastq.gz	/data/SAMPLE2_INPUT_R1.fastq.gz	/data/SAMPLE2_INPUT_R2.fastq.gz	2

Step 4.

To run the pipeline for design file examples 1 and 2, use main.nf with the --design parameter (run the command inside the container):

/opt/nextflow run main.nf --design design.csv

To run it for design file example 3, use main_chipseq.nf with the --design parameter (run the command inside the container):

/opt/nextflow run main_chipseq.nf --design design.csv

Example

/opt/nextflow run \
        /mnt/sfglab/nf-hichip/nf-hichip/main.nf \
        --ref /mnt/sfglab/Data/References/Genome/hg38/Homo_sapiens_assembly38/Homo_sapiens_assembly38.fasta \
        --chrom_sizes /mnt/sfglab/Data/References/Genome/hg38/Homo_sapiens_assembly38/hg38.sizes \
        --outdir /mnt/sfglab/workspaces/output/HiChIP_HG00731 \
        --design /mnt/sfglab/workspaces/design/design_HiChIP_HG00731.csv \
        --threads 4 \
        --mem 10 \
        --mapq 30 \
        --peak_quality 0.01
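
Since the choice between main.nf and main_chipseq.nf follows from the design file layout, it can be derived from the header line. The following is a sketch under stated assumptions: `pick_entry_script` is a hypothetical helper, and it assumes the column names are separated by commas or tabs and match the three design file examples in Step 3.

```shell
# pick_entry_script (hypothetical helper): read the header of a design file
# and print the workflow script that matches its layout.
#   header with an input_1 column  -> main_chipseq.nf (design example 3)
#   header with a chipseq column   -> main.nf         (design examples 1 and 2)
pick_entry_script() {
    # Normalize separators (commas, tabs) to spaces and drop any carriage return.
    header=$(head -n 1 "$1" | tr -d '\r' | tr ',\t' '  ')
    case " $header " in
        *" input_1 "*) echo "main_chipseq.nf" ;;
        *" chipseq "*) echo "main.nf" ;;
        *) echo "unrecognised design header" >&2; return 1 ;;
    esac
}

# Usage (inside the container):
# /opt/nextflow run "$(pick_entry_script design.csv)" --design design.csv
```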

Step 5.

The parameters of the pipeline can be found in the following table. All of them are optional:

Parameter	Description	Default
--ref	Reference genome for the analysis.	/workspaces/hichip-nf-pipeline/ref/Homo_sapiens_assembly38.fasta
--outdir	Folder with the final results.	results
--design	.csv file containing information about samples and replicates.	/workspaces/hichip-nf-pipeline/design_high.csv
--chrom_sizes	Sizes of chromosomes for the specific reference genome.	/workspaces/hichip-nf-pipeline/hg38.chrom.sizes
--threads	Number of threads to use in each task.	4
--mem	Memory (in GB) for all samtools tasks, allocated per thread and per sample (e.g., 4 samples with 4 threads and 4 GB would consume 64 GB of memory).	16
--mapq	MAPQ for MAPS.	30
--peak_quality	Quality parameter (q-value (minimum FDR) cutoff) for MACS3.	0.05
--genome_size	Genome size string for MACS3.	hs
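
The memory example in the --mem description can be reproduced with simple arithmetic, which is useful when sizing a machine. This sketch reads that example as samples x threads x GB; `estimate_samtools_mem` is a hypothetical helper, not a pipeline option.

```shell
# estimate_samtools_mem (hypothetical helper): total memory in GB consumed by
# the samtools tasks, computed as samples x threads x GB, following the
# example given for --mem.
estimate_samtools_mem() {
    samples=$1
    threads=$2
    mem_gb=$3
    echo $((samples * threads * mem_gb))
}

estimate_samtools_mem 4 4 4    # the example from the table: prints 64
```

With the default values (--threads 4, --mem 16), the same formula gives 64 GB per sample.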

Step 6.

For post-processing and figure recreation, please follow the scripts in the post_processing folder.

Citation

If you use nf-HiChIP in your research (the idea, the algorithm, the analysis scripts, or the supplemental data), please give us a star on the GitHub repo page and cite our paper as follows:

Preprint bioRxiv:

Jodkowska, K., Parteka-Tojek, Z., Agarwal, A., Denkiewicz, M., Korsak, S., Chiliński, M., Banecki, K., & Plewczynski, D. (2024). Improved cohesin HiChIP protocol and bioinformatic analysis for robust detection of chromatin loops and stripes. In bioRxiv (p. 2024.05.16.594268). https://doi.org/10.1101/2024.05.16.594268
