This repository contains the files for replicating the experiments described in the papers:
- Córdoba I., Varando G., Bielza C., Larrañaga P. A partial orthogonalization method for simulating covariance and concentration graph matrices. Proceedings of Machine Learning Research (PGM 2018), vol 72, pp. 61-72, 2018.
- Córdoba I., Varando G., Bielza C., Larrañaga P. Generating random Gaussian graphical models, arXiv:1909.01062, 2019.
The experiments analyse four methods for sampling partial correlation matrices, possibly constrained by an undirected graph:
- The traditional diagonal dominance method, implemented in many software packages, and also in `gmat::diagdom()`.
- Partial orthogonalization (Córdoba et al. 2018), implemented in `gmat::port()`.
- Uniform sampling (Córdoba et al. 2019), implemented in `gmat::chol_mh()`.
- Uniform sampling combined with partial orthogonalization (Córdoba et al. 2019), implemented in `gmat::port_chol()`.
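As a minimal sketch of how the four samplers listed above might be called side by side: the argument names `N` (number of matrices) and `p` (matrix dimension) are assumptions taken from the `gmat` documentation and should be checked against the installed version, for example with `?gmat::port`. The block is guarded so that it does nothing when `gmat` is not installed.

```r
# Sketch only: draw a few matrices with each of the four methods.
# N and p are assumed argument names; verify them with ?gmat::port.
if (requireNamespace("gmat", quietly = TRUE)) {
  N <- 10  # number of matrices to sample
  p <- 5   # dimension of each matrix

  samples <- list(
    diagdom   = gmat::diagdom(N = N, p = p),
    port      = gmat::port(N = N, p = p),
    chol_mh   = gmat::chol_mh(N = N, p = p),
    port_chol = gmat::port_chol(N = N, p = p)
  )

  # Inspect the shape of what each sampler returns
  print(lapply(samples, dim))
}
```

The experiment scripts in this repository remain the reference for the exact settings used in the papers.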
- `experiment_kramer.R` and `plot_kramer.R`: execute the experiment of N. Krämer, J. Schäfer, and A.-L. Boulesteix. Regularized estimation of large-scale gene association networks using graphical Gaussian models. BMC Bioinformatics, 10(1):384, 2009, whose results are included for comparison in Córdoba et al. (2018, 2019), and generate the corresponding figures.
- `experiment_pgm.R` and `plot_pgm.R`: execute the experiments and generate the figures in Córdoba et al. (2018), except for the Krämer et al. (2009) experiment.
- `plot_ext.R`: generate the figures in Córdoba et al. (2019).
The CRAN packages `gmat` and `ggplot2` are required for all the experiments and plots, respectively. The generated plots are stored in a directory `plot_[experiment-name]`, where `experiment-name` may be `pgm`, `ext` or `kramer`, and which is created if it does not already exist.
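The directory convention above can be sketched in base R as follows. The helper `ensure_plot_dir()` and its `root` argument are hypothetical names used only for illustration; the plot scripts may create the directory differently.

```r
# Hypothetical helper: build the plot_[experiment-name] path and create
# the directory only when it does not exist yet.
ensure_plot_dir <- function(experiment_name, root = ".") {
  plot_dir <- file.path(root, paste0("plot_", experiment_name))
  if (!dir.exists(plot_dir)) {
    dir.create(plot_dir)
  }
  plot_dir
}

# Example: use a temporary root so the working directory is not touched
root <- tempfile("plots_root_")
dir.create(root)
pgm_dir <- ensure_plot_dir("pgm", root = root)
```

Calling the helper a second time with the same name simply returns the existing path.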
First source the file `experiment_pgm.R` and then `plot_pgm.R`. This experiment is computationally intensive, and requires the `dplyr` R package for generating the plots.
Note that, because `gmat::port()` and `gmat::diagdom()` have been modified since the publication of Córdoba et al. (2018), some of the original figures in that paper have been affected. In particular:
- The results for the average off-diagonal/diagonal ratio statistic `R` have changed: matrices obtained with the partial orthogonalization method are better conditioned, but their behaviour regarding `R` is more similar to that of diagonally dominant matrices, although somewhat mitigated.
- The condition numbers and execution time for `gmat::port()` are now lower.
- The results for the Kramer experiment with diagonally dominant matrices are slightly different, since the independent and identically distributed original random entries are now generated from a Gaussian instead of a uniform distribution.
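To illustrate the last point, the following base R sketch builds a diagonally dominant matrix from i.i.d. entries and lets the entry distribution be swapped between Gaussian and uniform. It is a textbook version of the diagonal dominance idea written for this note, not the implementation in `gmat::diagdom()`; the function name `diagdom_sketch()` and the `+ 1` margin on the diagonal are assumptions.

```r
# Textbook sketch (not gmat's code): fill the off-diagonal entries with
# i.i.d. draws from rfun, symmetrize, then make the diagonal strictly
# dominate the absolute row sums so the matrix is positive definite.
diagdom_sketch <- function(p, rfun = stats::rnorm) {
  M <- matrix(0, nrow = p, ncol = p)
  M[upper.tri(M)] <- rfun(p * (p - 1) / 2)
  M <- M + t(M)
  diag(M) <- rowSums(abs(M)) + 1
  M
}

set.seed(1)
M_gauss <- diagdom_sketch(5, rfun = stats::rnorm)  # Gaussian entries
M_unif  <- diagdom_sketch(5, rfun = stats::runif)  # uniform entries

# 2-norm condition numbers, one of the quantities compared in the papers
cond_gauss <- kappa(M_gauss, exact = TRUE)
cond_unif  <- kappa(M_unif, exact = TRUE)
```

Changing `rfun` changes the distribution of the off-diagonal entries and therefore the resulting matrices, which is why results produced before and after the change are not expected to coincide exactly.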
Source the file `experiment_kramer.R` and then `plot_kramer.R`. This experiment is computationally intensive, and requires additional R packages to be executed: `doParallel`, `foreach`, `parcor`, `corpcor`, `MASS` and `reshape2`.
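Before launching this experiment, it may help to check which of the required packages are available. The snippet below is a small convenience sketch and is not part of the repository scripts; the variable names are arbitrary.

```r
# Packages named in this README for the Kramer experiment and its plots
required <- c("gmat", "ggplot2", "doParallel", "foreach",
              "parcor", "corpcor", "MASS", "reshape2")

# TRUE for each package whose namespace can be loaded
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```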
The performance statistics are calculated by the function in `performance.pcor.R`, which is a modification of `parcor::performance.pcor`:
- It fixes a bug by calling `GeneNet::network.test.edges()` instead of `GeneNet::ggm.test.edges()`, which does not exist in the newest version of `GeneNet`.
- Variables `ppv` and `tpr` are correctly initialized to `1` instead of `-Inf`.
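The effect of the initialization can be seen in a simplified sketch. The helper `edge_rates()` below is hypothetical and is not the code in `performance.pcor.R`; it assumes the usual definitions of positive predictive value (true positives over selected edges) and true positive rate (true positives over true edges), and keeps the initial value `1` whenever the corresponding denominator is zero.

```r
# Hypothetical helper, not the repository's function: compare a true and
# an estimated adjacency matrix over the upper triangle.
edge_rates <- function(true_adj, est_adj) {
  upper <- upper.tri(true_adj)
  truth <- true_adj[upper] != 0
  found <- est_adj[upper] != 0
  tp <- sum(truth & found)

  # Initialize to 1 so that an empty denominator yields a finite value
  ppv <- 1
  tpr <- 1
  if (sum(found) > 0) ppv <- tp / sum(found)
  if (sum(truth) > 0) tpr <- tp / sum(truth)
  c(ppv = ppv, tpr = tpr)
}

# True graph with edges 1-2 and 2-3; estimate with edges 1-2 and 1-3
true_adj <- matrix(0, 3, 3)
true_adj[1, 2] <- true_adj[2, 1] <- 1
true_adj[2, 3] <- true_adj[3, 2] <- 1
est_adj <- matrix(0, 3, 3)
est_adj[1, 2] <- est_adj[2, 1] <- 1
est_adj[1, 3] <- est_adj[3, 1] <- 1

rates <- edge_rates(true_adj, est_adj)
rates_empty <- edge_rates(true_adj, matrix(0, 3, 3))  # no edges selected
```

With an empty estimate no edge is selected, so `ppv` keeps its initial value instead of becoming `-Inf` and distorting averages over repetitions.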