author: Kevin Thornton date: Week 5 autosize: true
- Keeping version numbers constant for project A...
- ...while project B to use different versions of stuff...
- ...while letting a third project, C, use the latest version of anything.
In general, this is a very difficult problem.
Broadly-speaking, there are two classes of solutions:
- Package managers
- Containers
conda is the only one that satisfies our requirements by allowing "environments" or "envs".
(Also, homebrew-science is deprecated due to poor quality control.)
- Conda is a package manager
- Aanconda is a distribution of conda.
- Conda cannot be used to replicate environments between OS X and Linux. That is not the intended use.
- Get a distribution from Anaconda here
- The "miniconda3" version is fine. Get the 64-bit version
- (Avoid "miniconda2"--that is Python 2.)
- Follow the instructions
- DO NOT install into your user's home directory.
- Doing so can easily fill your space quota.
- Also, your $HOME is not backed up and you aren't supposed to keep anything important there.
- Install it as a module instead:
# This is the installation for my lab
module load krthornt/anaconda
- The go in
/data/modulefiles/user_contributed_software/$USER
. - They have the following file name format:
/data/modulefiles/user_contributed_software/$USER/program/version
- For example:
#"3" is a file name
/data/modulefiles/user_contributed_software/krthornt/anaconda/3
#%Module1.0
module-whatis "Thornton lab's anaconda3 installation"
exec /bin/logger -p local6.notice -t module-hpc $env(USER) "krthornt/anaconda/3"
set ROOT /data/apps/user_contributed_software/krthornt/anaconda3
prepend-path PATH $ROOT/bin
## for dev/lib installs
prepend-path -d " " LDFLAGS "-L${ROOT}/lib"
prepend-path -d " " CPPFLAGS "-I$ROOT/include"
prepend-path INCLUDE "$ROOT/include"
/data/apps/user_contributed_software/$USER/program
This location is pointed to via the set ROOT
command in the module file.
When installing conda
, you'll get to a prompt like this:
Miniconda2 will now be installed into this location:
/home/molpopgen/miniconda2
- Press ENTER to confirm the location
- Press CTRL-C to abort the installation
- Or specify a different location below
Simply enter the place to install:
/data/apps/user_contributed_software/$USER/anaconda3
- Different "channels" provide different software
- You'll probably want to know about BioConda
Use:
# Order matters!
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda install bwa
To list your channels:
conda config --get channels
If you use BioConda, cite it! It is maintained by volunteers (like me), who get no credit for doing this unless you cite the paper:
Grüning, Björn, Ryan Dale, Andreas Sjödin, Brad A. Chapman, Jillian Rowe, Christopher H. Tomkins-Tinch, Renan Valieris, the Bioconda Team, and Johannes Köster. 2018. “Bioconda: Sustainable and Comprehensive Software Distribution for the Life Sciences”. Nature Methods, 2018 doi:10.1038/s41592-018-0046-7.
Same goes for R, tidyverse, pandas, numpy, etc.--cite what you use!
- This is where the magic happens!
- Environment have names, and they can contain different versions of software.
- Let's look at the docs for "envs".
- They are like Python "virtualenvs", but they handle C/C++/Java/whatnot packages, too, as well as stuff installed via
pip
.
conda create --name env_name
source activate env_name
# do work
source deactivate
NOTE channels are persistent across environments!
Base it on specific packages/versions:
conda create -n py36_rstudio python==3.6 rstudio
Clone an environment:
conda create --name myclone --clone myenv
List your envs:
conda info --envs
Dump to a YAML
file:
conda env export -n bioconda > bioconda.yml
Create from a YAML
file:
conda env create -n env_name -f file.yml
Use "explicit method":
conda list --explicit -n bioconda > bioconda_explicit.txt
Create from that file:
conda create -n env_name --file bioconda_explict.txt
- We are talking about Linux only here!
- A very different method for development/distribution/packaging of software.
- Start with a light OS image. For example, Ubuntu Linux 18.10.
- Install your software + all its dependencies onto that image.
- Provide a client/server/cloud model for making your images available
- More info here
- Docker is the lower-level infrastructure
- Singularity is a more convenient approach, intended for use on HPC.
- All bioconda packages are turned into Docker containers. (I have not played with this yet.)