Coherence Modeling with TDA

Beyond Words: A Topological Exploration of Coherence in Text Documents

The Second Tiny Papers Track at ICLR 2024

Setup

conda env create -f environment.yml
conda activate tda-modeling-env

Usage

Computing TDA features for a dataset:

python feature_gen.py --cuda 0 --data_name clinton_train --input_dir GCDC_Dataset/ --output_dir gcdc_tda_features --batch_size 100
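Features are generated one `--data_name` at a time, so covering several splits means repeating the command. The snippet below is a minimal sketch of how those invocations could be assembled. It reuses only the flags shown above. The helper `feature_gen_command` is hypothetical, and `clinton_test` is an assumed split name, since only `clinton_train` appears in this README.

```python
import shlex


def feature_gen_command(data_name, input_dir="GCDC_Dataset/",
                        output_dir="gcdc_tda_features", cuda=0, batch_size=100):
    """Build the argv list for one feature_gen.py run (flags copied from the command above)."""
    return [
        "python", "feature_gen.py",
        "--cuda", str(cuda),
        "--data_name", data_name,
        "--input_dir", input_dir,
        "--output_dir", output_dir,
        "--batch_size", str(batch_size),
    ]


# Assumption: "clinton_test" is a valid --data_name; only "clinton_train" is documented here.
data_names = ["clinton_train", "clinton_test"]
commands = [feature_gen_command(name) for name in data_names]

for cmd in commands:
    # Dry run: print each command. Replace with subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the split names against the files in `GCDC_Dataset/` before starting a GPU run.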

Training and testing an MLP on the generated TDA features:

python predict_tda.py --input_dir GCDC_Dataset/ --feat_dir gcdc_tda_features/ --domain clinton

Data

Acknowledgements

We thank the authors of Artificial Text Detection via Examining the Topology of Attention Maps (EMNLP 2021) for publishing their code.

Citing

If you find this code useful for your research, please cite the following paper:

@inproceedings{singhal2024beyond,
  title={Beyond Words: A Topological Exploration of Coherence in Text Documents},
  author={Singhal, Rishi and Jain, Samyak and Krishna, Sriram and Singla, Yaman K and Shah, Rajiv Ratn},
  booktitle={The Second Tiny Papers Track at ICLR 2024},
  year={2024}
}