causal-compare

A command-line interface (CLI) for running the algorithm comparison tool on simulated data.

Running the program:

Prerequisites - You must have the following installed:

Please follow the Java installation guide for Java SE Development Kit 11.

Execution

Make sure you are using Java 11. You can check by typing the following: java -version

  1. If you are not building from source, you can download the jar file here.
  2. Download the sample configuration file sample_configuration.xml to the same directory as the jar file.
  3. To run the program, open a terminal from the directory in which the jar file is located and type: java -jar causal-compare-x.x.x-jar-with-dependencies.jar --config sample_configuration.xml

Replace x.x.x with the version number, for example causal-compare-0.2.0-jar-with-dependencies.jar.

Command-line Options

  --config
    indicate the location of an XML configuration file
    
  --out
    indicate the location to write files to

  --prefix
    set the prefix name of output files
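
When the tool is launched from a script, the options above can be assembled programmatically. The following is a minimal Python sketch, not part of causal-compare itself. The function name build_command, the output location "results" and the prefix "run1" are hypothetical; only the flags --config, --out and --prefix come from the list above.

```python
def build_command(jar_path, config_path, out_dir=None, prefix=None):
    """Assemble the java command line for causal-compare as a list of arguments."""
    command = ["java", "-jar", jar_path, "--config", config_path]
    # --out and --prefix are added only when a value is supplied.
    if out_dir is not None:
        command += ["--out", out_dir]
    if prefix is not None:
        command += ["--prefix", prefix]
    return command


if __name__ == "__main__":
    cmd = build_command(
        "causal-compare-0.2.0-jar-with-dependencies.jar",
        "sample_configuration.xml",
        out_dir="results",   # hypothetical output location
        prefix="run1",       # hypothetical file-name prefix
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run from the directory that contains the jar file.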

XML Configuration File

There are two ways of using the comparison tool to compare search algorithms.

The first way is to use Tetrad to generate simulated datasets and run search algorithms on those datasets. You will need to specify the following in the XML configuration file:

File structure for a comparison that runs search algorithms on simulated data:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<comparison>
    <compareBy>
        <!--compare by search algorithms-->
        <search>

            <!--list of data simulations-->
            <simulations>
            
                <!--run simulation to generate data-->
                <simulation source="generate">
                    <graphtype>...</graphtype>
                    <modeltype>...</modeltype>
                </simulation>
            </simulations>

            <!--list of search algorithms-->
            <algorithms>
                <algorithm name="...">...</algorithm>
                <algorithm name="...">...</algorithm>
            </algorithms>

            <!--list of search algorithm parameters-->
            <parameters>
                <parameter name="...">...</parameter>
                <parameter name="...">...</parameter>
            </parameters>
        </search>
    </compareBy>

    <!--list of comparison statistics-->
    <statistics>
        <statistic>...</statistic>
        <statistic>...</statistic>
    </statistics>

    <!--list of comparison tool properties-->
    <properties>
        <property name="...">true</property>
        <property name="...">true</property>
    </properties>
</comparison>

Alternatively, you can use Tetrad to generate simulated datasets ahead of time and use them later in the comparison tool. In that case you only need to specify, in the XML file, the path where the simulated datasets are saved:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<comparison>
   <compareBy>
       <!--compare by search algorithms-->
       <search>

           <!--list of data simulations-->
           <simulations>
               
               <!--path to where the simulated data are saved-->
               <simulation source="directory">
                   <path>...</path>
               </simulation>
           </simulations>
       </search>
   </compareBy>
   
   ...
   
</comparison>

The second way is to compare from result graphs. You need to do the following:

  1. Run Tetrad to simulate some datasets and save them to a folder. Click here for tutorial.
  2. Run search algorithms on other platforms on those datasets.
  3. Save the result graphs.

Note that the true graphs and result graphs must be in the Tetrad graph format. You will need to specify the following in the XML configuration file:

File structure for a comparison that uses result graphs obtained from other search algorithms run on Tetrad-simulated data:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<comparison>
    <compareBy>
        <!--compare by graphs obtained from search algorithm-->
        <graph>
            <!--path to true graph file-->
            <trueGraph>...</trueGraph>

            <!--path to where the simulated data are saved-->
            <simulationPath>...</simulationPath>

            <!--list of result graphs-->
            <resultGraphs>
                <graph>
                    <!--description for the result graph-->
                    <description>...</description>

                    <!--the time it takes for the search algorithm to finish-->
                    <elapseTime>...</elapseTime>

                    <!--path to the result graph file-->
                    <graphFile>...</graphFile>
                </graph>
                <graph>
                    <description>...</description>
                    <elapseTime>...</elapseTime>
                    <graphFile>...</graphFile>
                </graph>
            </resultGraphs>
        </graph>
    </compareBy>

    <!--list of comparison statistics-->
    <statistics>
        <statistic>...</statistic>
        <statistic>...</statistic>
    </statistics>

    <!--list of comparison tool properties-->
    <properties>
        <property name="...">...</property>
        <property name="...">...</property>
    </properties>
</comparison>
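
Both layouts share the same comparison/compareBy skeleton and differ only in whether compareBy contains a search element or a graph element. As a quick sanity check before running the tool, a configuration can be inspected with a short script. The sketch below uses only Python's standard library and the element names shown above. The function name describe_config is hypothetical, and the check is much looser than whatever validation the tool itself performs.

```python
import xml.etree.ElementTree as ET


def describe_config(xml_text):
    """Report which comparison mode a configuration uses and what it lists."""
    root = ET.fromstring(xml_text)
    if root.tag != "comparison":
        raise ValueError("root element must be <comparison>")

    compare_by = root.find("compareBy")
    if compare_by is None:
        raise ValueError("missing <compareBy> element")

    # <search> means algorithms are run on simulated data;
    # <graph> means previously saved result graphs are compared.
    if compare_by.find("search") is not None:
        mode = "search"
    elif compare_by.find("graph") is not None:
        mode = "graph"
    else:
        raise ValueError("<compareBy> must contain <search> or <graph>")

    statistics = [s.text for s in root.findall("statistics/statistic")]
    result_graphs = [
        g.findtext("graphFile")
        for g in compare_by.findall("graph/resultGraphs/graph")
    ]
    return {"mode": mode, "statistics": statistics, "result_graphs": result_graphs}
```

For a search-style configuration the result_graphs list is empty. For a graph-style configuration it holds the contents of each graphFile element in document order.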

Building the software

If you prefer to compile the code yourself, please follow the instructions below.

Prerequisites - You must have the following installed:

Please follow the Git installation guide, the Java installation guide for Java SE Development Kit 11, and the Maven installation guide.

Compiling the code from command-line:

Download the source code: $ git clone https://github.com/bd2kccd/causal-compare.git

Go to the project directory: $ cd causal-compare

Build the jar file from the source code: [causal-compare]$ mvn clean package

The jar file, causal-compare-x.x.x-jar-with-dependencies.jar, is in the target directory.

Configuration Options

Comparison Statistics

| Statistic | Description |
|-----------|-------------|
| AR | Adjacency Recall |
| AHP | Arrowhead precision |
| AHR | Arrowhead recall |
| AHPC | Arrowhead precision (common edges) |
| AHRC | Arrowhead recall (common edges) |
| ATN | Adjacency True Negatives |
| ATP | Adjacency True Positives |
| ATPR | Adjacency True Positive Rate |
| AFN | Adjacency False Negatives |
| AFP | Adjacency False Positives |
| AHTN | Arrowhead True Negatives |
| AHTP | Arrowhead True Positives |
| F1Adj | F1 statistic for adjacencies |
| F1All | F1 statistic for adjacencies and orientations combined |
| F1Arrow | F1 statistic for arrows |
| McAdj | Matthews correlation coefficient for adjacencies |
| McArrow | Matthews correlation coefficient for arrowheads |
| SHD | Structural Hamming Distance |
| NICP | Node in cycle precision |
| NICR | Node in cycle recall |
| AMB | Number of Ambiguous Triples |
| %AMB | Percent Ambiguous Triples |
| BID | Percent Bidirected Edges |
| EdgesEst | Number of Edges in the Estimated Graph |
| EdgesT | Number of Edges in the True Graph |
| TP | Tail precision |
| TR | Tail recall |
| 2CP | 2-cycle precision |
| 2CR | 2-cycle recall |
| E | Elapsed Time |
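
To make the adjacency statistics concrete, the sketch below counts adjacency true positives, false positives and false negatives between a true graph and an estimated graph, and derives recall as TP / (TP + FN). It follows the textbook definitions and ignores edge orientation. It is an illustration only, not the tool's implementation, and the function names are hypothetical.

```python
def adjacency_counts(true_edges, est_edges):
    """Return (TP, FP, FN) for adjacencies, ignoring edge direction."""
    # An adjacency is an unordered pair of node names.
    true_adj = {frozenset(pair) for pair in true_edges}
    est_adj = {frozenset(pair) for pair in est_edges}
    tp = len(true_adj & est_adj)   # adjacent in both graphs
    fp = len(est_adj - true_adj)   # adjacent only in the estimated graph
    fn = len(true_adj - est_adj)   # adjacent only in the true graph
    return tp, fp, fn


def adjacency_recall(true_edges, est_edges):
    """Fraction of true adjacencies that the estimated graph recovers."""
    tp, _, fn = adjacency_counts(true_edges, est_edges)
    return tp / (tp + fn) if (tp + fn) else float("nan")


if __name__ == "__main__":
    true_edges = [("x", "y"), ("z", "x")]
    est_edges = [("y", "x"), ("z", "y")]
    print(adjacency_counts(true_edges, est_edges))   # (1, 1, 1)
    print(adjacency_recall(true_edges, est_edges))   # 0.5
```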

Comparison Properties

| Property Name | Description |
|---------------|-------------|
| setShowSimulation | True if simulation indices should be shown in the comparison table, false if not. |
| setShowAlgorithmIndices | True if algorithm indices should be shown in the comparison table, false if not. |
| setShowUtilities | True if utilities should be shown in the comparison table, false if not. |
| setSortByUtility | True if results should be sorted high to low by utility, false if not. |
| setSavePatterns | True if patterns of DAGs should be saved out with the results. |
| setSavePags | True if PAGs (partial ancestral graphs) should be saved out with the results. |
| setTabDelimitedTables | True if tables should be output in tab-delimited form, false if they should be printed in space-delimited form with aligned columns. |
| setComparisonGraph | Sets the type of graph that results are compared to. The options are: true DAG, pattern of the true DAG, PAG of the true DAG. |

Tetrad Graph Format

A Tetrad graph.txt file contains the following:

  • A list of variables.
  • A list of edges.

Below is an example of a Tetrad graph containing variables x, y and z:

Graph Nodes:
x;y;z

Graph Edges:
1. x --> y
2. z --> x
3. z --> y
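
When result graphs come from another platform, it can help to read a file back and confirm it has this shape. The following Python sketch parses only the layout shown in the example above: a "Graph Nodes:" line followed by semicolon-separated names, then a "Graph Edges:" line followed by numbered edges. The function name parse_tetrad_graph is hypothetical, and real Tetrad files may contain variations that this sketch does not handle.

```python
import re

# Matches a numbered edge line such as "1. x --> y".
EDGE_LINE = re.compile(r"^\d+\.\s+(\S+)\s+(\S+)\s+(\S+)$")


def parse_tetrad_graph(text):
    """Return (nodes, edges); each edge is a (node1, edge_symbol, node2) tuple."""
    nodes, edges = [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Graph Nodes:":
            section = "nodes"
        elif line == "Graph Edges:":
            section = "edges"
        elif section == "nodes":
            # Node names are separated by semicolons in the example above.
            nodes.extend(n.strip() for n in line.split(";") if n.strip())
        elif section == "edges":
            match = EDGE_LINE.match(line)
            if match:
                edges.append(match.groups())
    return nodes, edges


if __name__ == "__main__":
    example = """Graph Nodes:
x;y;z

Graph Edges:
1. x --> y
2. z --> x
3. z --> y
"""
    nodes, edges = parse_tetrad_graph(example)
    print(nodes)  # ['x', 'y', 'z']
    print(edges)  # [('x', '-->', 'y'), ('z', '-->', 'x'), ('z', '-->', 'y')]
```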