pipeline_example

A repo to keep tests for drmaa, ruffus and cgat-core.

Scripts and examples are taken from the respective packages. The drmaa examples cover PBSPro only.

You can see further examples of pipelines and some installation instructions in cgat-core and project_quickstart.

Requirements

conda install cgatcore

This should be all you need for Ruffus, cgat-core and python-drmaa, but check the installation instructions for each tool. For DRMAA you will very likely need to liaise with your system administrator.

Note that conda-forge currently provides "drmaa" v0.7.9 and "python-drmaa" v0.7.6.

The folder Docker_and_config_file_examples contains examples of PBSPro user and system-wide settings. Note that the Dockerfiles will not work as they are, but they contain installation instructions for older versions of the requirements.

An updated version with a pipeline example is available; see the cgat-core docs.

Note that once the system DRMAA library is installed, you'll need to set an environment variable:

export DRMAA_LIBRARY_PATH=/<full-path>/libdrmaa.so
# such as:
export DRMAA_LIBRARY_PATH=/usr/local/lib/libdrmaa.so.1
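Before running the drmaa tests, it can help to confirm that the variable is set and points to a real file. The following is a minimal sketch using only the Python standard library; the function name `check_drmaa_library` is hypothetical and is not part of python-drmaa or cgat-core. It only checks that the path exists, not that the library loads correctly.

```python
import os
from pathlib import Path


def check_drmaa_library(env=None):
    """Return (ok, message) describing whether DRMAA_LIBRARY_PATH looks usable.

    Hypothetical helper: it only checks that the variable is set and that
    the file exists; it does not try to load the library.
    """
    env = os.environ if env is None else env
    value = env.get("DRMAA_LIBRARY_PATH")
    if not value:
        return False, "DRMAA_LIBRARY_PATH is not set"
    if not Path(value).is_file():
        return False, f"DRMAA_LIBRARY_PATH points to a missing file: {value}"
    return True, f"DRMAA library found at {value}"


if __name__ == "__main__":
    ok, message = check_drmaa_library()
    print(message)
```

If the check fails, export the variable as shown above (or ask your system administrator for the location of libdrmaa.so) before running the drmaa scripts.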

Installation and usage

Create a testing directory and clone from GitHub:

mkdir test_cgat_drmaa
cd test_cgat_drmaa
git clone https://github.com/AntonioJBT/pipeline_example.git

Test whether programs are running as expected for ruffus, ruffus with drmaa, and cgatcore:

# Check ruffus:
python pipeline_example/pipeline_example/ruffus_and_drmaa_tests/ruffus_C1_intro.py

# Check drmaa:
python pipeline_example/pipeline_example/ruffus_and_drmaa_tests/drmaa_status.py
python pipeline_example/pipeline_example/ruffus_and_drmaa_tests/drmaa_example1.py

# Ruffus and drmaa:
#python pipeline_example/pipeline_example/ruffus_and_drmaa_tests/chapter14_ruffus_drmaa.py

# A standard PBSPro qsub script (your system may be different):
qsub pipeline_example/pipeline_example/ruffus_and_drmaa_tests/standard_PBS_qsub.sh
qstat
# Then check the standard out and error files

# Check a cgat-core pipeline:
python pipeline_example/pipeline_example/pipeline_example_minimal.py --help
ln -s pipeline_example/pipeline_example/pipeline.yml .
# (the previous command would usually use the cgat-core config option)
python pipeline_example/pipeline_example/pipeline_example_minimal.py show full
python pipeline_example/pipeline_example/pipeline_example_minimal.py printconfig

# Run locally:
python pipeline_example/pipeline_example/pipeline_example_minimal.py make full --local
# Check the outputs, eg:
cat pipeline.counts
sqlite3 csvdb
# within sqlite3 do eg:
sqlite> .tables # which should print 'pipeline_counts'
sqlite> SELECT * FROM pipeline_counts;
sqlite> .exit

# On the cluster (you need to set up the appropriate configuration for your cluster)
# Clean up previous test:
rm -rf pipeline.log pipeline_example_minimal_counts.load csvdb pipeline_example_minimal.counts
# Run on the cluster; the scripts are short, but you may still want to use nohup:
nohup python pipeline_example/pipeline_example/pipeline_example_minimal.py make full &
tail -f nohup.out
# Check the outputs
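The interactive sqlite3 check from the local run above can also be scripted. The sketch below uses Python's standard sqlite3 module and assumes the database file is called csvdb and the table pipeline_counts, as in the commands above; the function name `read_counts` is hypothetical.

```python
import os
import sqlite3
from contextlib import closing


def read_counts(db_path="csvdb", table="pipeline_counts"):
    """Return (table_names, rows) for `table` in the SQLite file `db_path`.

    Hypothetical helper mirroring the `.tables` and `SELECT *` steps.
    """
    # sqlite3.connect() would silently create an empty database,
    # so fail early if the pipeline has not produced the file.
    if not os.path.isfile(db_path):
        raise FileNotFoundError(f"database not found: {db_path}")
    with closing(sqlite3.connect(db_path)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        if table not in tables:
            raise LookupError(f"table {table!r} not found; tables: {tables}")
        # The table name is checked against sqlite_master above before use.
        rows = conn.execute(f'SELECT * FROM "{table}"').fetchall()
    return tables, rows
```

Calling `read_counts()` from the directory where the pipeline ran should return the table names and the rows of pipeline_counts, or raise an error if the database or table is missing.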

If submitting to a cluster, consider using a ~/.cgat.yml file (see an example) for configuration.

Further references and example data for a CGAT pipeline

First see the tutorials for cgat-flow (here) and the cgat-core pipeline example.

Some data locations can be found here; for more, see cgat-flow's documentation.

Example data

# 201PH are ChIP-seq files from:

ftp://ftp.broad.mit.edu/pub/papers/chipseq/Ku2008/raw/

# Other ChIP-seq files are from here (see this link):

  • SRR446027_1.fastq.gz
  • SRR446027_2.fastq.gz

Further references

cgat-flow, a set of ruffus-based pipelines.

Chapter 14: Multiprocessing, drmaa and Computation Clusters — ruffus 2.6.3 documentation

Connecting to a Cluster — Galaxy Project 19.05.dev documentation

DRMAA Wikipedia page

Contribute

Please raise any issues or pull requests in the issue tracker.