This repository contains the source code for MIIC (Multivariate Information-based Inductive Causation), a causal discovery method based on information theory that learns a large class of causal or non-causal graphical models from purely observational data while accounting for the effects of unobserved latent variables. Starting from a complete graph, the method iteratively removes dispensable edges by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of the available data. The remaining edges are then oriented based on the signature of causality in observational data.

A more recent and more interpretable extension, iMIIC, further distinguishes genuine causes from putative and latent causal effects, while scaling to very large datasets (hundreds of thousands of samples). Since version 2.0, MIIC also includes a temporal mode (tMIIC) to learn temporal causal graphs from stationary time series data.

MIIC has been applied to a wide range of biological and biomedical data, such as single-cell gene expression data, genomic alterations in tumors, live-cell time-lapse imaging data (CausalXtract), and medical records of patients. MIIC brings unique insights based on causal interpretation and could be used in a broad range of other data science domains (technology, climatology, economics, ...).
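As a rough illustration of the edge-removal idea described above, the base R sketch below simulates a chain X -> Z -> Y and shows that the dependence between X and Y vanishes once Z is conditioned on, which is the kind of indirect path that makes an edge dispensable. This is not MIIC's implementation: the function names `mutual_info` and `cond_mutual_info`, the simulated data and the fixed threshold are assumptions made for this example only, whereas MIIC relies on its own information estimates and on confidences assessed from data randomization.

```r
# Toy illustration only: NOT the MIIC implementation.

# Empirical mutual information I(X;Y), in nats, for two discrete vectors.
mutual_info <- function(x, y) {
  p_xy <- unclass(table(x, y)) / length(x)  # joint frequencies
  p_x <- rowSums(p_xy)                      # marginal of x
  p_y <- colSums(p_xy)                      # marginal of y
  nz <- p_xy > 0                            # skip empty cells (0 * log 0 = 0)
  sum(p_xy[nz] * log(p_xy[nz] / outer(p_x, p_y)[nz]))
}

# Empirical conditional mutual information I(X;Y|Z): the mutual information
# within each stratum of z, weighted by the frequency of that stratum.
cond_mutual_info <- function(x, y, z) {
  total <- 0
  for (level in unique(z)) {
    idx <- z == level
    total <- total + mean(idx) * mutual_info(x[idx], y[idx])
  }
  total
}

# Simulated chain X -> Z -> Y: X and Y are dependent only through Z.
set.seed(1)
n <- 5000
x <- rbinom(n, 1, 0.5)
z <- ifelse(runif(n) < 0.9, x, 1 - x)  # Z copies X with 10% flips
y <- ifelse(runif(n) < 0.9, z, 1 - z)  # Y copies Z with 10% flips

i_xy <- mutual_info(x, y)
i_xy_given_z <- cond_mutual_info(x, y, z)

# Hypothetical fixed threshold, chosen for this toy example only.
threshold <- 0.01
remove_edge_xy <- i_xy_given_z < threshold

cat("I(X;Y)   =", i_xy, "\n")
cat("I(X;Y|Z) =", i_xy_given_z, "\n")
cat("Remove edge X-Y:", remove_edge_xy, "\n")
```

In this toy setting the marginal information between X and Y is clearly positive, while the information remaining after conditioning on Z is close to zero, so the direct X-Y edge would be considered dispensable.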
Simon F., Comes M. C., Tocci T., Dupuis L., Cabeli V., Lagrange N., Mencattini A., Parrini M. C., Martinelli E., Isambert H., CausalXtract: a flexible pipeline to extract causal effects from live-cell time-lapse imaging data, eLife 2024.
Ribeiro-Dantas M. D. C., Li H., Cabeli V., Dupuis L., Simon F., Hettal L., Hamy A. S., Isambert H., Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients, iScience, 2024.
Cabeli V., Li H., Ribeiro-Dantas M., Simon F., Isambert H., Reliable causal discovery based on mutual information supremum principle for finite dataset, Why21 at NeurIPS 2021.
Cabeli V., Verny L., Sella N., Uguzzoni G., Verny M., Isambert H., Learning clinical networks from medical records based on information estimates in mixed-type data, PLoS Comput. Biol. 2020.
Li H., Cabeli V., Sella N., Isambert H., Constraint-based causal structure learning with consistent separating sets, Advances in Neural Information Processing Systems 2019.
Verny L., Sella N., Affeldt S., Singh P. P., Isambert H., Learning causal networks with latent variables from multivariate information in genomic data, PLoS Comput. Biol. 2017.
Affeldt S., Isambert H., Robust Reconstruction of Causal Graphical Models based on Conditional 2-point and 3-point Information, UAI 2015.
MIIC contains R and C++ sources.
- Compiling from source requires a compiler that supports C++14 language features.
- MIIC imports the following R packages: ppcor, scales, stats, Rcpp.
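The following base R sketch checks whether the imported packages listed above can be loaded in the current R library. The variable names are illustrative; note that installing MIIC with install.packages() normally pulls in missing imports automatically, so this check is mainly useful when building from source.

```r
# Packages listed above as MIIC imports.
required_pkgs <- c("ppcor", "scales", "stats", "Rcpp")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
is_available <- vapply(
  required_pkgs,
  function(pkg) requireNamespace(pkg, quietly = TRUE),
  logical(1)
)

missing_pkgs <- required_pkgs[!is_available]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All imported packages are available.")
}
```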
From CRAN (release):
install.packages("miic")
Or from GitHub (development):
# install.packages("remotes")
remotes::install_github("miicTeam/miic_R_package")
MIIC allows you to create a graph object from a dataset of observations
of both discrete and continuous variables, potentially with missing values
and taking into account unobserved latent variables.
You can find this example, along with others, in the documentation of the main function by calling ?miic from R.
library(miic)
# EXAMPLE HEMATOPOIESIS
data(hematoData)
# execute MIIC (reconstruct graph)
miic_obj <- miic(
input_data = hematoData, latent = "yes",
n_shuffles = 10, conf_threshold = 0.001
)
# plot graph with igraph
if(require(igraph)) {
plot(miic_obj, method="igraph")
}
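Before plotting, the returned object can be explored with generic base R tools. The sketch below reruns the example above and lists the top-level components of the result. It only assumes that the result behaves like a named list; the component names themselves are not shown here because they may differ between MIIC versions. The call is guarded so that the snippet does nothing when the package is not installed.

```r
# Run only when the miic package is installed.
if (requireNamespace("miic", quietly = TRUE)) {
  library(miic)
  data(hematoData)

  # Same call as in the example above.
  miic_obj <- miic(
    input_data = hematoData, latent = "yes",
    n_shuffles = 10, conf_threshold = 0.001
  )

  # Generic inspection: class, component names, and a one-level overview.
  print(class(miic_obj))
  print(names(miic_obj))
  str(miic_obj, max.level = 1)
}
```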
You can find the documentation pages in the "man" folder, in the auto-generated PDF, or through the R functions help() and ?.
- Tiziana Tocci
- Nikita Lagrange
- Orianne Debeaupuis
- Louise Dupuis
- Franck Simon
- Vincent Cabeli
- Honghao Li
- Marcel Ribeiro Dantas
- Louis Verny
- Nadir Sella
- Séverine Affeldt
- Hervé Isambert
GPL-2 | GPL-3