Main paper to be cited (Goo et al., 2018)
@inproceedings{goo2018slot,
title={Slot-Gated Modeling for Joint Slot Filling and Intent Prediction},
author={Chih-Wen Goo and Guang Gao and Yun-Kai Hsu and Chih-Li Huo and Tsung-Chieh Chen and Keng-Wei Hsu and Yun-Nung Chen},
booktitle={Proceedings of The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
year={2018}
}
Enter --dataset=atis or --dataset=snips to use the ATIS or Snips (Coucke et al., 2018) dataset.
To use your own dataset, put it under ./data/ and pass --dataset=foldername.
For example, if your dataset is in ./data/mydata, enter --dataset=mydata
Your dataset should be separated into three folders - train, test, and valid - which are named 'train', 'test', and 'valid' by default in train.py.
Each of these folders contains three files - word sequence, slot label, and intent label - which are named 'seq.in', 'seq.out', and 'label' by default in train.py.
For example, the full path to train/slot_label_file is './data/mydata/train/seq.out'.
Each line represents one example, and slot labels should use the IOB format.
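To make the slot tagging scheme concrete, here is a minimal sketch that groups B-/I-/O tags back into slot spans. The function name `iob_to_spans` and the sample sentence are hypothetical and not part of this repository. The sketch assumes whitespace-separated tokens and silently drops an I- tag that does not continue an open span.

```python
def iob_to_spans(words, tags):
    """Group B-/I-/O tags into (slot_name, text) spans.

    Hypothetical helper for illustration; not part of train.py or utils.py.
    """
    spans = []
    current_type, current_words = None, []

    def close_span():
        # Emit the span collected so far, if any.
        if current_type is not None:
            spans.append((current_type, " ".join(current_words)))

    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            close_span()
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag continues the open span of the same slot type.
            current_words.append(word)
        else:
            # "O", or an I- tag with no matching open span (dropped here).
            close_span()
            current_type, current_words = None, []
    close_span()
    return spans


# Illustrative ATIS-style example (sentence made up for this sketch).
words = "flights from new york to denver".split()
tags = "O O B-fromloc.city_name I-fromloc.city_name O B-toloc.city_name".split()
spans = iob_to_spans(words, tags)
print(spans)
```

Each word in 'seq.in' must line up with exactly one tag in 'seq.out', which is why the two lists are zipped together.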
Vocabulary files will be generated automatically by utils.createVocabulary().
You may see ./data/atis for more detail.
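The folder layout described above can be sketched as follows. This is an illustrative script, not part of the repository: `write_split` and `read_split` are hypothetical helpers, the toy example is made up, and tokens are assumed to be whitespace-separated. It writes a tiny dataset into a temporary directory using the default file names and reads it back, checking that the three files stay aligned line by line.

```python
import os
import tempfile

# Default file names mentioned above: words, slot labels, intent label.
FILES = ("seq.in", "seq.out", "label")


def write_split(root, split, examples):
    """Write one split ('train', 'test' or 'valid') under root/split/."""
    split_dir = os.path.join(root, split)
    os.makedirs(split_dir, exist_ok=True)
    columns = (
        [" ".join(words) for words, _, _ in examples],
        [" ".join(slots) for _, slots, _ in examples],
        [intent for _, _, intent in examples],
    )
    for name, lines in zip(FILES, columns):
        with open(os.path.join(split_dir, name), "w") as f:
            f.write("\n".join(lines) + "\n")


def read_lines(path):
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def read_split(root, split):
    """Read one split and verify that the three files are aligned."""
    split_dir = os.path.join(root, split)
    seq_in, seq_out, labels = (
        read_lines(os.path.join(split_dir, name)) for name in FILES
    )
    if not (len(seq_in) == len(seq_out) == len(labels)):
        raise ValueError(
            "files in {} have different line counts".format(split_dir))
    examples = []
    for line_no, (w, s, intent) in enumerate(zip(seq_in, seq_out, labels), 1):
        words, slots = w.split(), s.split()
        if len(words) != len(slots):
            raise ValueError("{} line {}: {} words but {} slot labels".format(
                split_dir, line_no, len(words), len(slots)))
        examples.append((words, slots, intent.strip()))
    return examples


# One made-up example: (words, slot labels, intent label).
toy = [("flights from boston to denver".split(),
        "O O B-fromloc.city_name O B-toloc.city_name".split(),
        "atis_flight")]

with tempfile.TemporaryDirectory() as data_dir:
    root = os.path.join(data_dir, "mydata")  # stands in for ./data/mydata
    for split in ("train", "test", "valid"):
        write_split(root, split, toy)
    loaded = read_split(root, "train")

print(loaded)
```

A check like this can be run on a new dataset before training, since a word/label mismatch on a single line is otherwise easy to miss.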
TensorFlow 1.4
Python 3.5
Some sample usage:
- Run with 32 units, the ATIS dataset, and no patience for early stopping:
  python3 train.py --num_units=32 --dataset=atis --patience=0
- Disable early stopping, use the Snips dataset, and use the intent attention version:
  python3 train.py --no_early_stop --dataset=snips --model_type=intent_only
- Use "python3 train.py -h" to see all available parameter settings.
- Note: you must always pass --dataset. If you don't want to use this flag, type --dataset='' instead.