More precisely, this implementation is a line-by-line translation of the DyNet implementation, which can be found here. The techniques behind the parser are described in the paper Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations.
- Python 2.7 interpreter (for a Python 3 implementation, please check out the pytorch_python3 branch; thanks to Zhiqiang Xie)
- PyTorch library
The software requires training.conll and development.conll files formatted according to the CoNLL data format, or training.conllu and development.conllu files formatted according to the CoNLL-U data format.
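For readers unfamiliar with the CoNLL layout, the sketch below shows one way to read such a file. It is not the reader used by this repository. It assumes the common tab-separated layout with one token per line, blank lines between sentences, and the columns ID, FORM, the coarse POS tag, HEAD and DEPREL at positions 1, 2, 4, 7 and 8. The function name `read_conll` and the `SAMPLE` sentence are hypothetical and exist only for illustration.

```python
def read_conll(lines):
    """Yield sentences as lists of token dicts from CoNLL / CoNLL-U lines.

    Assumption: tab-separated columns, blank line between sentences.
    """
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # CoNLL-U comment line (e.g. "# text = ...").
            continue
        cols = line.split("\t")
        if len(cols) < 8:
            raise ValueError("expected at least 8 columns, got %d" % len(cols))
        if not cols[0].isdigit():
            # Skip CoNLL-U multiword ranges ("1-2") and empty nodes ("1.1").
            continue
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "pos": cols[3],
            # HEAD may be "_" in unannotated test data.
            "head": int(cols[6]) if cols[6].isdigit() else None,
            "deprel": cols[7],
        })
    if sentence:
        # The file may end without a trailing blank line.
        yield sentence


# Hypothetical three-token sentence, used only to illustrate the format.
SAMPLE = (
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\tVBZ\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = list(read_conll(SAMPLE.splitlines()))
for token in sentences[0]:
    print(token["id"], token["form"], token["head"], token["deprel"])
```

A head value of 0 marks the root of the sentence. Running a check like this over training.conll before training can catch malformed lines early.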
python src/parser.py --outdir [results directory] --train data/en-universal-train.conll --dev data/en-universal-dev.conll --epochs 30 --lstmdims 125 --bibi-lstm
The command for parsing a test.conll file formatted according to the CoNLL data format with a previously trained model is:
python src/parser.py --predict --outdir [results directory] --test data/en-universal-test.conll --model [trained model file] --params [param file generated during training]
The parser will store the resulting CoNLL file in the output directory (--outdir).