Environmental and genealogical effects on DNA methylation in a widespread apomictic dandelion lineage
This repository reproduces the results presented in "Environmental and genealogical effects on DNA methylation in a widespread apomictic dandelion lineage" by V.N. Ibañez, M. van Antro, C. Peña Ponton, S. Ivanovic, C.A.M. Wagemaker, F. Gawehns, K.J.F. Verhoeven.
DNA methylation that occurs in CG sequence context shows transgenerational stability and a high epimutation rate, and can thus provide genealogical information at short time scales. Here, the epiGBS2 protocol is used to compare DNA methylation between accessions from a geographically widespread, apomictic common dandelion (Taraxacum officinale) lineage grown experimentally under different light conditions.
We show that the light treatment induced differentially methylated cytosines (DMCs) in all sequence contexts, with a bias toward transposable elements. Accession differences were associated with DMCs in CG context. Hierarchical clustering of samples based on total mCG profiles revealed a perfect clustering of samples by accession identity, irrespective of light conditions.
Using microsatellite information as a benchmark of genetic divergence within the clonal lineage, we show that genetic divergence between accessions correlates strongly with overall mCG profiles. However, our results suggest that environmental effects that do occur in CG context might produce a heritable signal that partly dilutes the genealogical signal.
Our methodology can be used as a tool for reconstructing micro-evolutionary genealogy, particularly for systems lacking genetic variation, such as clonal and vegetatively propagated plants.
A total of 80 samples were multiplexed together in the same sequencing library. Each half was digested with NsiI in combination with either AseI or Csp6I. In this manuscript we present results from the Csp6I-NsiI digested epiGBS samples, as these yielded higher sequencing output than the AseI-NsiI digested samples. However, the following scripts can handle both sets of data.
In order to proceed, you will need the following files:
- raw multiplexed read sequences
- Csp6-NsiI_barcode.tsv
- Ase6-NsiI_barcode.tsv
- methylation.bed
- consensus_cluster.renamed_csp6.fa
- consensus_cluster.renamed_aseI.fa
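
Before starting, it can help to confirm that the listed inputs are in place. The following is a minimal, hypothetical pre-flight check written in Python (the repository's own scripts are in R). The `missing_inputs` helper and the assumption that all files sit in a single directory are illustrative and not part of the repository. Raw reads are not checked because their file names are not given here.

```python
from pathlib import Path

# File names exactly as listed above; keeping them in one directory is an assumption.
REQUIRED_FILES = [
    "Csp6-NsiI_barcode.tsv",
    "Ase6-NsiI_barcode.tsv",
    "methylation.bed",
    "consensus_cluster.renamed_csp6.fa",
    "consensus_cluster.renamed_aseI.fa",
]


def missing_inputs(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from data_dir."""
    base = Path(data_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs(".")
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All listed input files are present.")
```

Run it from the directory that holds the input files, or pass another path to `missing_inputs`.
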
The configuration files used with the epiGBS pipeline are also provided.
Raw read data should be processed with the epiGBS pipeline, following the steps in "Preparation to run the pipeline".
For comparison, the reports obtained after running the epiGBS pipeline are also provided.
The notebooks below demonstrate how the data were processed in the article, using a subset of the sequenced cytosines.
# | Script | Description | Notebook |
---|---|---|---|
1 | 01_epiTree.R | Generate a filtered methylation file | |
2 | 02_epiTree.R | Characterize overall methylation levels | |
3 | 03_epiTree.R | Generate distance matrices | |
4 | 04_epiTree.R | Identify differentially methylated cytosines (DMCs) with DSS | |
5 | 05_epiTree.R | Obtain Manhattan plots | |
6 | 06_epiTree.R | Obtain dendrograms | |
7 | 07_epiTree.R | Characterize DMCs | |
8 | 08_epiTree.R | Run Mantel tests | |
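
To illustrate the idea behind steps 3 and 8, the sketch below builds a pairwise methylation distance matrix and compares it with a genetic distance matrix using a permutation-based Mantel test. It is a plain-Python illustration under stated assumptions, not the code in the R scripts: the distance measure (mean absolute difference in methylation level over cytosines covered in both samples), the function names, and the toy data are all hypothetical.

```python
import math
import random


def methylation_distance(a, b):
    """Mean absolute difference over cytosines covered in both samples (assumed metric)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("samples share no covered cytosines")
    return sum(abs(x - y) for x, y in shared) / len(shared)


def distance_matrix(profiles):
    """Symmetric matrix of pairwise methylation distances."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = methylation_distance(profiles[i], profiles[j])
    return dist


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: returns (r, permutation p-value)."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels of the second matrix
        if pearson(x, upper_triangle(d2, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy data: methylation fraction per cytosine, None = not covered.
    profiles = [
        [0.9, 0.1, 0.8, None, 0.2],
        [0.8, 0.2, 0.7, 0.5, 0.2],
        [0.1, 0.9, 0.3, 0.6, 0.7],
        [0.2, 0.8, None, 0.4, 0.9],
    ]
    # Toy genetic distances, e.g. derived from microsatellite differences.
    genetic = [[0, 1, 4, 5], [1, 0, 3, 4], [4, 3, 0, 1], [5, 4, 1, 0]]
    r, p = mantel(distance_matrix(profiles), genetic)
    print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Permuting sample labels rather than individual distances keeps the dependence structure of each matrix intact, which is the reason a Mantel test is used instead of an ordinary correlation test on the pairwise values.
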