This repo contains the materials for the paper "Beller*, Bennett* & Gerstenberg (2020) The language of causation". For any questions about the repo, feel free to contact Ari Beller at abeller@stanford.edu.
```
.
├── code
│   ├── R
│   ├── bash
│   ├── experiments
│   └── python
├── data
├── figures
│   ├── paper_plots
│   └── trial_schematics
└── videos
```
### code/python

Contains code for running the model and searching across parameter settings for the optimal model.
- `model.py` includes the underlying physics engine and code to run counterfactual tests.
- `compute_aspect_rep.py` runs the model to produce causal representations of the experiment trials. It takes an uncertainty noise value and a sample number as command-line arguments.
- `rsa.py` contains code for the semantics and pragmatics components of the model. The `meaning` function computes the semantic representation of an utterance, and the `l0`, `s1`, `l1`, and `s2` functions compute the successive levels of recursive pragmatic reasoning. The `lesion_model` function computes the no-pragmatics model representation.
- `grid_search.py` contains tools for running grid searches across parameter settings for given models. It also contains tools for running cross-validation and for saving models to file for analysis in R.
- `model_predictions.py` is a script for easy reproduction of the model statistics reported in the paper.
Note that `grid_search.py` and `model_predictions.py` only produce model predictions and cross-validation results for the full model and the no-pragmatics model. Ordinal regression training, prediction, and parameter selection are computed in R.
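The `l0`/`s1`/`l1` chain in `rsa.py` follows the standard rational speech acts (RSA) recursion. The following is a minimal self-contained sketch of that pattern, not the repo's actual implementation: the utterances, worlds, and 0/1 semantic values below are invented toys, whereas the real model derives meanings from the causal aspect representations and continues to an `s2` level.

```python
import math

# Toy domain (hypothetical, for illustration only).
UTTERANCES = ["caused", "enabled"]
WORLDS = ["w1", "w2"]

# meaning(u, w) as a lookup table of 0/1 truth values.
MEANING = {
    ("caused", "w1"): 1, ("caused", "w2"): 0,
    ("enabled", "w1"): 1, ("enabled", "w2"): 1,
}

def l0(u):
    """Literal listener: P(w | u) proportional to meaning(u, w)."""
    scores = {w: MEANING[(u, w)] for w in WORLDS}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

def s1(w, alpha=1.0):
    """Pragmatic speaker: P(u | w) proportional to exp(alpha * log l0(w | u))."""
    scores = {u: math.exp(alpha * math.log(l0(u)[w])) if l0(u)[w] > 0 else 0.0
              for u in UTTERANCES}
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}

def l1(u):
    """Pragmatic listener: P(w | u) proportional to s1(u | w) * prior(w)."""
    scores = {w: s1(w)[u] for w in WORLDS}  # uniform prior over worlds
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

# "caused" is true only in w1, so the pragmatic listener concentrates on w1.
print(l1("caused"))
```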
### code/R

Contains code for model analysis, as well as training, prediction, and cross-validation for the ordinal regression.
- `forced_choice_expt_analysis.Rmd` is the analysis script. It can be knitted to reproduce the model predictions.
- `forced_choice_expt_analysis.md` is a pre-knitted markdown file in which you can view the analysis code.
- `crossv_ordreg.R` is a script that produces model predictions for ordinal regression cross-validation. It requires a command-line argument specifying the split number for which to compute regression models. Computing the regressions takes considerable time, as well as considerable disk space, since the regressions are saved to file. We performed ordinal regression cross-validation on Sherlock, Stanford's high-performance computing cluster.
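Prediction in an ordinal (cumulative-logit) regression turns one linear predictor plus a set of ordered cut-points into a probability for each response category. A minimal sketch of that prediction step, with made-up cut-points and predictor value rather than the coefficients actually fit in R:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ordinal_probs(eta, cutpoints):
    """Cumulative-logit model: P(Y <= k) = sigmoid(c_k - eta).
    Per-category probabilities are differences of adjacent cumulative probs."""
    cum = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical values: 4 ordered response options -> 3 cut-points,
# and a linear predictor eta (e.g. a weighted sum of features).
probs = ordinal_probs(eta=0.8, cutpoints=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])  # one probability per response category
```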
### code/bash

- `combine_frames.sh` takes a set of frames saved in `code/python/figures/frames` and produces a video clip saved in `code/python/video`. It requires an experiment name and trial number as command-line arguments, e.g.

  ```
  ./combine_frames.sh exp_name 5
  ```

  It also requires the ffmpeg multimedia framework.
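Frame-to-video stitching with ffmpeg generally takes an image-sequence input pattern and a frame rate. The sketch below only constructs a plausible command of that shape; the file-name pattern, frame rate, and output path are assumptions, so check `combine_frames.sh` for the actual invocation.

```python
def ffmpeg_command(exp_name, trial, framerate=30):
    """Build an ffmpeg invocation that stitches numbered PNG frames into an
    mp4. The %03d frame-numbering pattern and paths are hypothetical."""
    frames = f"figures/frames/{exp_name}_{trial}_%03d.png"
    out = f"video/{exp_name}_{trial}.mp4"
    return ["ffmpeg", "-framerate", str(framerate), "-i", frames,
            "-pix_fmt", "yuv420p", out]

# Print the command instead of running it, so this sketch needs no ffmpeg.
print(" ".join(ffmpeg_command("exp_name", 5)))
```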
Frames for video processing can be produced using the physics simulator. The following code demonstrates how to produce frames for an arbitrary trial.
```
aribeller$ cd code/python/
aribeller$ python
Python 3.7.2 (default, Dec 29 2018, 00:00:04)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import model as m
pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
Loading chipmunk for Darwin (64bit) [/Users/aribeller/miniconda3/envs/testenv/lib/python3.7/site-packages/pymunk/libchipmunk.dylib]
>>> trials = m.load_trials("trialinfo/experiment_trials.json")
>>> test_trial = trials[12]
>>> m.run_trial(test_trial, animate=True, save=True)
{'collisions': [{'objects': {'B', 'A'}, 'step': 135}], 'wall_bounces': [], 'button_presses': [], 'outcome': 1, 'outcome_fine': Vec2d(-1060.2315946785948, 300.0)}
```
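The record returned by `run_trial` can be post-processed directly. Here is a standalone snippet over the result printed above (with the `Vec2d` replaced by a plain tuple so it runs outside the repo):

```python
# Result dict copied from the interpreter session above;
# 'outcome_fine' is a plain tuple here instead of a pymunk Vec2d.
result = {
    "collisions": [{"objects": {"B", "A"}, "step": 135}],
    "wall_bounces": [],
    "button_presses": [],
    "outcome": 1,
    "outcome_fine": (-1060.2315946785948, 300.0),
}

# Did A and B collide, and at which simulation step(s)?
ab_hits = [c["step"] for c in result["collisions"] if c["objects"] == {"A", "B"}]
print(ab_hits)            # -> [135]
print(result["outcome"])  # -> 1
```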
### code/experiments

Contains the code for our experiment in the folder `experiment_forced_choice`. For info on how to run the experiment, refer to the psiTurk documentation.
### data

Contains the raw data file `full_database_anonymized.db`.
### figures/paper_plots

Plots presented in the paper.
### figures/trial_schematics

Diagrams of the trial clips.
### videos

Video clips presented to participants in the experiment.
### Steps to reproduce

- Install dependencies:
  - R
  - RStudio
  - python
    - conda:
      - numpy
      - pandas
      - scipy
    - pip:
      - pygame==2.0.0.dev6
      - pymunk==5.7.0
- Compute the aspect representation. This runs the whether, how, sufficiency, and moving tests across samples of counterfactual simulations. Output will be saved to `code/python/aspects/`. Note that the generated aspects from the paper are already included in this repo in `code/python/aspects_paper`. Downstream model components read from `aspects_paper`, but the paths can be modified.

  ```
  cd code/python/
  python compute_aspect_rep.py <uncertainty_noise> <num_samples>
  ```
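The counterfactual tests share one idea: re-simulate the clip many times under noise with some intervention applied, and record how often the outcome changes. The toy sketch below illustrates that sampling scheme for a whether-style test; the one-line "world" function is invented for the sketch, whereas the real tests in `model.py` run the pymunk physics engine, and the `uncertainty_noise`/`num_samples` parameters mirror the command-line arguments above.

```python
import random

def outcome(ball_b_present, noise, threshold=0.5):
    """Toy stand-in for a physics rollout: ball E goes through the gate
    (returns 1) unless B deflects it; noise perturbs the deflection."""
    deflection = (1.0 if ball_b_present else 0.0) + noise
    return 0 if deflection > threshold else 1

def whether_test(num_samples=1000, uncertainty_noise=0.3, seed=0):
    """Estimate P(outcome would have differed had B been removed) as the
    fraction of noisy counterfactual samples that flip the actual outcome."""
    rng = random.Random(seed)
    actual = outcome(True, 0.0)  # what actually happened, noise-free
    flips = sum(
        outcome(False, rng.gauss(0.0, uncertainty_noise)) != actual
        for _ in range(num_samples)
    )
    return flips / num_samples

# B almost always made the difference in this toy world, so the
# whether-cause probability comes out close to 1.
print(whether_test())
```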
- To compute and save the models for the statistics reported in the paper, use the `top_models.py` script. `model_predictions.py` produces 4 csv files in the `useful_csvs` folder. The first, `top_models.csv`, contains model predictions for the top full model and the no-pragmatics model (as well as comparisons with and without combined cause, respectively). The second and third, `cross_validation_full_model.csv` and `cross_validation_lesion_model.csv`, contain model predictions for the cross-validation models trained and tested on the splits specified in `crossv_splits.csv`. Lastly, `enabled_comparison.csv` contains predictions for models with and without the "not how" semantics for the "enabled" causal expression. These files are already computed and saved in `useful_csvs`, but they can be regenerated with the following code:

  ```
  cd code/python/
  python model_predictions.py
  ```
- In RStudio, install packages as needed and then knit `forced_choice_expt_analysis.Rmd` to remake all plots and compute the reported statistics. The compiled file `forced_choice_expt_analysis.md` contains all findings reported in the paper.