This R-package contains two functions, one for the chi-squared test and the other for the Cochran-Mantel-Haenszel (CMH) test adapted to drift and pool sequencing variance. The original tests indeed are prone to overdispersion and this is the reason why a correction of the variance is needed when studying data including drift and pool sequencing. The two functions give the possibility to apply each test to the following cases. Either without drift and without pool sequencing noise (classical form of the tests) or with only one of them or with both of them. The output of the function is a vector which contains one p-value (and possibly one value of the test statistic) for each SNP provided in the input data.
Please cite the related publication: K Spitzer, M Pelizzola, A Futschik, Modifying the Chi-square and the CMH test for population genetic inference: adapting to over-dispersion, (2020) https://projecteuclid.org/euclid.aoas/1587002671
After downloading the package, it can be installed using the following command directly in R: install.packages("/Path/To/ACER.tar.gz", repos=NULL, type="source").
NOTE: in order to run some of the examples in the manual the R-package poolSeq and its dependencies are needed. It is possible to find poolSeq and the instruction to install it here
p_values <- adapted.chisq.test(freq=afMat, coverage=covMat, Ne=300, gen=c(0,10), poolSize=rep(1000, ncol(afMat)))
Here the p-values for the case with both pool sequencing noise and drift are computed, when the number of generations ("gen") is a vector of length 1 or a vector with the same value repeated several times then there is no drift and the value of the effective population size ("Ne") is ignored. In order to consider the case without pool sequencing noise set "poolSize = NULL".
For further information and examples refer to the manual typing ?adapted.chisq.test in R.
p_values <- adapted.cmh.test(freq=afMat, coverage=covMat, Ne=rep(300, 3), gen=c(0,10), repl=1:3, poolSize=rep(1000, ncol(afMat)))
The same observations made for the chi-squared test can be done here. The "repl" parameter occurring here refers to the number of replicate populations which can be accounted for by the CMH test.
For further information and examples refer to the manual typing ?adapted.cmh.test in R.
- Marta Pelizzola (marta@math.au.dk)
- Kerstin Gärtner