# Grammatical Error Correction

Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text, such as spelling, grammatical, and word choice errors.

GEC is usually formulated as a sentence-to-sentence correction task. A GEC system takes a potentially erroneous sentence as input and is expected to transform it into its corrected version. See the example given below:

| Input (Erroneous) | Output (Corrected) |
| ------------------------- | ---------------------- |
| She see Tom is catched by policeman in park at last night. | She saw Tom caught by a policeman in the park last night. |
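
To make the sentence-to-sentence formulation concrete, the sketch below (a hypothetical illustration using Python's standard `difflib`, not code from any system listed on this page) aligns the erroneous input with its correction and prints the token-level edits:

```python
import difflib

# Hypothetical illustration: align an erroneous sentence with its correction
# to recover the individual edits a GEC system is expected to make.
source = "She see Tom is catched by policeman in park at last night .".split()
target = "She saw Tom caught by a policeman in the park last night .".split()

matcher = difflib.SequenceMatcher(a=source, b=target)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":  # keep only the spans that changed
        print(f"{tag:8s} {' '.join(source[i1:i2])!r} -> {' '.join(target[j1:j2])!r}")
```

Edit-level alignments like this are also the basis of GEC evaluation: a system is scored on how many of its edits match the annotators' edits.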

### CoNLL-2014 Shared Task

The [CoNLL-2014 shared task test set](https://www.comp.nus.edu.sg/~nlp/conll14st/conll14st-test-data.tar.gz) is the most widely used dataset for benchmarking GEC systems. The test set contains 1,312 English sentences with error annotations by 2 expert annotators. Models are evaluated with the MaxMatch scorer ([Dahlmeier and Ng, 2012](http://www.aclweb.org/anthology/N12-1067)), a phrase-level F<sub>β</sub> score with β = 0.5 that weights precision twice as much as recall.
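
As a worked illustration of the metric (a minimal sketch, not the MaxMatch scorer itself; the edit counts are made up), the snippet below computes F<sub>β</sub> with β = 0.5 from edit-level true positives, false positives, and false negatives:

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F_beta over proposed vs. gold edits; beta < 1 favours precision."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

# Made-up example: 40 correct edits out of 55 proposed, 80 gold edits in total.
print(round(f_beta(tp=40, fp=15, fn=40), 4))  # ~0.6667
```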

The shared task setting restricts systems to training only on publicly available datasets, so that comparisons are fairer. The best published results on the CoNLL-2014 test set are given below. A distinction is made between papers that report results in the restricted CoNLL-2014 shared task setting using publicly available training data (_Restricted_) and those that make use of large non-public datasets (_Unrestricted_).

| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| _**Restricted**_ |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 56.25 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945) | NA |
| Transformer (Junczys-Dowmunt et al., 2018) | 55.8 | [Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task](http://aclweb.org/anthology/N18-1055) | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 54.79 | [A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831) | [Official](https://github.com/nusnlp/mlconvgec2018) |
| _**Unrestricted**_ |
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270) | NA |

### CoNLL-2014 10 Annotations

[Bryant and Ng, 2015](https://pdfs.semanticscholar.org/f76f/fd242c3dc88e52d1f427cdd0f5dccd814937.pdf) released [8 additional annotations](http://www.comp.nus.edu.sg/~nlp/sw/10gec_annotations.zip) of the CoNLL-2014 shared task test set, on top of the 2 official annotations.

| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| _**Restricted**_ |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 72.04 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945) | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 70.14 (measured by Ge et al., 2018) | [A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831) | [Official](https://github.com/nusnlp/mlconvgec2018) |
| _**Unrestricted**_ |
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270) | NA |

### JFLEG

The [JFLEG corpus](https://github.com/keisks/jfleg) by [Napoles et al., 2017](https://arxiv.org/abs/1702.04066) consists of 1,511 English sentences with annotations. Models are evaluated with the [GLEU metric](https://arxiv.org/abs/1609.08144).
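
As a rough, heavily simplified sketch of GLEU-style scoring (a single-reference reading of the n-gram precision/recall definition in the paper linked above; an assumption-laden illustration, not the official JFLEG evaluation script, which is what the numbers below are based on):

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_gleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified single-reference GLEU: min of n-gram precision and recall."""
    hyp, ref = hypothesis.split(), reference.split()
    matches = total_hyp = total_ref = 0
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        matches += sum((hyp_ngrams & ref_ngrams).values())
        total_hyp += sum(hyp_ngrams.values())
        total_ref += sum(ref_ngrams.values())
    precision = matches / total_hyp if total_hyp else 0.0
    recall = matches / total_ref if total_ref else 0.0
    return min(precision, recall)

# A perfect correction scores 1.0 against an identical reference.
print(simple_gleu("She saw Tom caught by a policeman in the park last night .",
                  "She saw Tom caught by a policeman in the park last night ."))
```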

| Model | GLEU | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| _**Restricted**_ |
| SMT + BiGRU (Grundkiewicz et al., 2018) | 61.50 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945) | NA |
| Transformer (Junczys-Dowmunt et al., 2018) | 59.9 | [Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task](http://aclweb.org/anthology/N18-1055) | NA |
| CNN Seq2Seq (Chollampatt & Ng, 2018) | 57.47 | [A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831) | [Official](https://github.com/nusnlp/mlconvgec2018) |
| _**Unrestricted**_ |
| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270) | NA |