Simulation program for evolutionary biologists to study speciation with a complex genotype-phenotype map.
- A C++20 compiler (e.g. GCC or Clang)
- (optional) CMake version 3.16 or higher
- (optional) R and speciomer to read and analyze the data
Here are instructions to build with CMake, but you can compile the source code with the tools of your choice.
(Click here to build as developer.)
(Click here to build on the Peregrine cluster.)
git clone git@github.com:rscherrer/speciome.git
cd speciome
cp CMakeLists_user.txt CMakeLists.txt # user configuration
mkdir build && cd build
cmake ..
cmake --build .
The executable speciome is built in ../bin/.
git clone git@github.com:rscherrer/speciome.git
cd speciome
rem user configuration
copy CMakeLists_user.txt CMakeLists.txt
mkdir build
cd build
cmake ..
cmake --build . --config Release
The executable speciome.exe is built in ../bin/.
Many IDEs support CMake out of the box. "Open folder" should do the trick... You can use CMake to generate the input files for your favorite IDE too:
git clone git@github.com:rscherrer/speciome.git
cd speciome
cp CMakeLists_user.txt CMakeLists.txt # user configuration
mkdir build
cd build
# Generate Visual Studio project files
cmake -G "Visual Studio 17 2022" -A x64 ..
# Generate Xcode project files (Xcode must be installed)
cmake -G Xcode ..
This will place the project files in ../build.
Run a simulation with default parameters with:
./speciome
Or provide a parameter file with non-default parameter values:
./speciome parameters.txt
The parameter file must contain parameter names followed by their values, for example:
hsymmetry 1
ecosel 0.6
allfreq 0.2
nvertices 30 30 30
Parameters that are not provided in the parameter file will take default values. Beware that some parameters take multiple values. Click here for a list of all parameters. To allow for replication, if parsave is set to 1, the parameters used in the simulation (including any automatically generated seed) will be saved into a file named paramlog.txt within the working directory.
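The parameter file format described above (a name followed by one or more values) can be written and read back with a few lines of Python. This is only a minimal sketch, not part of speciome: the helper names `format_parameters` and `parse_parameters` are hypothetical, and it assumes one parameter per line, as in the example above.

```python
from pathlib import Path


def format_parameters(params):
    """Render a dict as 'name value [value ...]' lines, one parameter per line."""
    lines = []
    for name, value in params.items():
        # Multi-valued parameters (e.g. nvertices) are written space-separated
        values = value if isinstance(value, (list, tuple)) else [value]
        lines.append(" ".join([name, *map(str, values)]))
    return "\n".join(lines) + "\n"


def parse_parameters(text):
    """Parse 'name value [value ...]' lines into a dict of name -> list of strings.

    Assumption: one parameter per line; values are kept as strings.
    """
    params = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        params[tokens[0]] = tokens[1:]
    return params


if __name__ == "__main__":
    text = format_parameters(
        {"hsymmetry": 1, "ecosel": 0.6, "allfreq": 0.2, "nvertices": [30, 30, 30]}
    )
    Path("parameters.txt").write_text(text)  # file to pass to ./speciome
    print(parse_parameters(text))
```

Generating the file from a script like this can be convenient when launching many simulations that differ in only a few parameter values.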
This program runs an individual-based simulation where agents live, reproduce and die over thousands of generations. Individuals have traits that are genetically encoded and can evolve. The model is built in such a way that, under the right conditions, the population may split into two phenotypically distinct and reproductively isolated clusters, or species (i.e. speciation has happened). Importantly, the genotype codes for the phenotype in a nonlinear way, allowing not only additive but also dominance, epistatic and environmental effects. See the accompanying manuscript for more information.
The genetic architecture refers to the constant features of the genotype-phenotype map. Those are features that do not change through time and cannot evolve, including e.g. the number of chromosomes, the numbers and positions of loci, and the topologies and interaction weights of the gene networks. If archload is set to 0, a new architecture is generated at the beginning of the simulation. Otherwise, the program will read an architecture from a file architecture.txt that must be present in the working directory. Click here to see what a genetic architecture file should look like. If archsave is set to 1, the architecture that was used in the simulation (whether generated or provided) will be saved into architecture.txt.
Set datsave to 1 to allow data recording. The data are saved every tsave generations into binary *.dat files. Click here for a list of the variables that can be saved.
Each variable is saved as a vector of values (64-bit double-precision floating-point numbers). By default, the program will save all variables. Set choosewhattosave to 1 to decide which variables to save instead. The program will then expect a file whattosave.txt in the working directory. This file should be a list of names of variables to save. For example:
time
EI
SI
RI
locus_Fst
will save time, speciation metrics EI, SI and RI at each time point, and summary statistic Fst for each locus at each time point.
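Since each variable is stored as a plain vector of 64-bit doubles, a saved file can in principle be inspected without R. The sketch below is not the speciomer implementation; it assumes the file holds nothing but consecutive doubles (no header) in little-endian byte order, which is typical on x86 machines but is not stated in this README. The function names are hypothetical.

```python
import struct
from pathlib import Path


def decode_doubles(raw, byteorder="<"):
    """Decode raw bytes as a flat list of 64-bit floats.

    Assumptions: no header, consecutive 8-byte doubles, little-endian ("<").
    """
    if len(raw) % 8 != 0:
        raise ValueError("byte count is not a multiple of 8")
    count = len(raw) // 8
    return list(struct.unpack(f"{byteorder}{count}d", raw))


def read_doubles(path, byteorder="<"):
    """Read a binary *.dat file and decode it as 64-bit floats."""
    return decode_doubles(Path(path).read_bytes(), byteorder)


if __name__ == "__main__":
    # Example: the recorded time points, if time was saved
    print(read_doubles("time.dat"))
```

For actual analyses, the speciomer package remains the recommended route, because it also assembles the vectors into tables.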
Some variables need other variables to be saved in order to be interpreted down the line. For example, time must be saved in order for the recorded time points to be appended to the resulting data tables. Similarly, the population_sizes at each recorded time point must be known for each individual to be assigned a time point in individual_* variables and in individual whole genomes.
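To illustrate why population_sizes is needed, here is a minimal sketch of how individual-level values could be matched to time points. It assumes that individual-level vectors are simply the individuals of each recorded time point concatenated in chronological order; that layout is an assumption made for illustration, and `assign_time_points` is a hypothetical helper, not a speciomer function.

```python
def assign_time_points(times, population_sizes):
    """Repeat each recorded time point once per individual present at that time.

    Assumption: individual-level data are concatenated time point by time point,
    so the result lines up element by element with an individual_* vector.
    """
    if len(times) != len(population_sizes):
        raise ValueError("times and population_sizes must have the same length")
    labels = []
    for t, n in zip(times, population_sizes):
        # Sizes are stored as doubles, so convert to int before repeating
        labels.extend([t] * int(n))
    return labels


if __name__ == "__main__":
    times = [0.0, 100.0]
    sizes = [2.0, 3.0]
    print(assign_time_points(times, sizes))
```

Without the population sizes, there would be no way to tell where one time point ends and the next begins in such a vector.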
The data are saved in binary to speed up the writing (and the reading) process. In addition, different users will need to combine the data in many different ways depending on the question they are asking. To read and assemble the data into analyzable datasets, use our R package speciomer. Note that some functions in this package will expect certain files (e.g. paramlog.txt, architecture.txt or time.dat) to be present. In general we advise the following:
- save the generative parameter values (parsave 1)
- have the genetic architecture at hand (e.g. archsave 1) to interpret the genetic data you might save (locus_*, edge_* and whole individual genomes)
- save time, as it is useful information for any of the other variables
- save population_sizes whenever individual_* variables or whole individual genomes are saved
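The last two recommendations can be checked mechanically before launching a long run. The following is a small sketch under stated assumptions: `check_whattosave` is a hypothetical helper, it only looks at the names listed in whattosave.txt, and `individual_x` in the usage example is a placeholder rather than a real variable name.

```python
def check_whattosave(names):
    """Return a list of warnings for a list of variable names to save."""
    warnings = []
    if "time" not in names:
        warnings.append("time is not saved; time points cannot be added to tables")
    needs_sizes = any(name.startswith("individual_") for name in names)
    if needs_sizes and "population_sizes" not in names:
        warnings.append("individual_* variables are saved without population_sizes")
    return warnings


if __name__ == "__main__":
    # Names as they would appear, one per line, in whattosave.txt
    for message in check_whattosave(["EI", "individual_x"]):
        print("warning:", message)
```

This check does not cover whole individual genomes, which are controlled separately from the other output variables.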
Saving the whole genomes of all individuals through time takes a lot of space. For this reason, this output is controlled separately from the other output variables. Click here for details.
This program is a descendant of ExplicitGenomeSpeciation. Disclaimer: this simulation program was used to get insights into the effect of the genetic architecture on the process of speciation. It was not designed as a statistical inference package or a data processing tool, although its simulations could in theory be used for training machine learning algorithms to recognize various evolutionary scenarios. This code comes with no guarantee whatsoever.
- speciomer: read the simulated data in R
- speciomx: deterministic approximation based on adaptive dynamics theory
- speciome-analyses: analysis scripts and results of the study
- speciome-ms: manuscript
- speciome-private: material that need not be made public
Copyright (c) Raphael Scherrer and G. Sander van Doorn, 2023 (open source license will be added upon publication).