An implementation of the model proposed in the paper "Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing".
Python 3+ with pip
To install the python modules required to train and run the model
pip install -r requirements.txt
To download and pre-process the multi-domain belief tracking dialogues and pre-trained word embeddings
python preprocess.py
To train the model
python main.py train [--args=value]
Some of these args include:
--num_hid: the size of the hidden layers - default is 50 (the paper uses 100)
--bidir/--no-bidir: bidirectional vs forward only - default is bidirectional (only valid for RNNs)
--net_type: the type of the feature encoders, gru for GRU, cnn for CNN, lstm for LSTM - default is lstm (see the paper)
--batch_size: the batch size - default is 64
--dev: device used to train the model (cpu or gpu) - default is gpu
--model_url: path to save the model or to resume training - default is models/model-1
--graph_url: path to save the tensorboard graph for evaluation metrics (cross-entropy, accuracy, etc.) - default is graphs/graph-1
For example, to train the CNN variant of the model with a batch size of 8 on the GPU
python main.py train --batch_size=8 --net_type=cnn --dev=gpu
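To make the documented options and their defaults easier to see at a glance, here is a minimal sketch of a command-line parser that accepts the same flags. It is an illustration only: the function name build_parser and the use of argparse are assumptions, and the actual main.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented main.py options (not the real parser)"
    )
    # First positional argument selects the mode, as in `python main.py train`.
    parser.add_argument("mode", choices=["train", "test"])
    # Size of the hidden layers (README default: 50; the paper uses 100).
    parser.add_argument("--num_hid", type=int, default=50)
    # Paired on/off flags writing to the same destination; bidirectional by default.
    parser.add_argument("--bidir", dest="bidir", action="store_true")
    parser.add_argument("--no-bidir", dest="bidir", action="store_false")
    parser.set_defaults(bidir=True)
    # Type of the feature encoders.
    parser.add_argument("--net_type", choices=["gru", "cnn", "lstm"], default="lstm")
    parser.add_argument("--batch_size", type=int, default=64)
    # Device used for training.
    parser.add_argument("--dev", choices=["cpu", "gpu"], default="gpu")
    # Where the model and the tensorboard graph are saved.
    parser.add_argument("--model_url", default="models/model-1")
    parser.add_argument("--graph_url", default="graphs/graph-1")
    return parser


if __name__ == "__main__":
    # Same arguments as the training example above.
    args = build_parser().parse_args(
        ["train", "--batch_size=8", "--net_type=cnn", "--dev=gpu"]
    )
    print(args)
```

Any option that is not passed keeps its default, so the example command above would still use 50 hidden units and a bidirectional setting under this sketch.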
To evaluate the model on the test dataset
python main.py test [--args=value]
This uses the same arguments as above, except --dev is not included as cpu is used for testing. It generates a log file at results/log-1.txt, which includes the original dialogues with the true labels and the model predictions.
There is currently a bug that shuffles the model predictions across dialogues. To work around it, test with a batch size of 1, i.e. --batch_size=1.