OpenNMT project for the course "Traduction Automatique et Assistée" (Machine and Computer-Assisted Translation)
Siman Chen simannnc
Yutian Shen ShenYT0
Xinlei Chen chenxinlei1
test_example.ipynb & test_example.yaml : test OpenNMT on the example dataset Europarl10k.
split_corpus.ipynb : split the corpus into train, dev, and test sets.
preprocess.ipynb & mose.sh : preprocess the data using mosesdecoder; the shell version is for Windows users who can't run Perl in Jupyter.
lemma.ipynb : lemmatize words using different lemmatizers (WordNetLemmatizer, FrenchLefffLemmatizer, and spaCy)
compare_Spacy.ipynb : compare NLTK and spaCy to find the better lemmatizer.
calculate_bert_score.py & train_and_compare.ipynb : train the models and calculate BLEU and BERTScore to compare them.
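
As an illustration of the splitting step listed for split_corpus.ipynb, here is a minimal sketch of how a parallel corpus might be divided into train, dev, and test sets while keeping source and target sentences aligned. It is not the notebook's actual code: the function name `split_parallel_corpus`, its parameters, and the fixed seed are assumptions made for this example.

```python
import random


def split_parallel_corpus(src_lines, tgt_lines, dev_size, test_size, seed=0):
    """Split aligned source/target sentences into train, dev, and test pairs.

    Hypothetical helper for illustration; sizes are numbers of sentence pairs.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    if dev_size + test_size >= len(src_lines):
        raise ValueError("dev and test sizes leave no data for training")

    # Pair the sentences first so that shuffling keeps them aligned.
    pairs = list(zip(src_lines, tgt_lines))
    random.Random(seed).shuffle(pairs)

    test = pairs[:test_size]
    dev = pairs[test_size:test_size + dev_size]
    train = pairs[test_size + dev_size:]
    return train, dev, test


# Toy data standing in for a real corpus such as Europarl.
src = [f"source sentence {i}" for i in range(10)]
tgt = [f"target sentence {i}" for i in range(10)]
train, dev, test = split_parallel_corpus(src, tgt, dev_size=2, test_size=2)
print(len(train), len(dev), len(test))
```

Shuffling pairs rather than the two files separately is what keeps each source sentence attached to its translation; the fixed seed makes the split reproducible across runs.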