File: english/grammatical_error_correction.md
Lines changed: 20 additions & 12 deletions
@@ -12,18 +12,21 @@ GEC is usually formulated as a sentence-to-sentence correction task. A GEC syste

The CoNLL-2014 shared task test set (https://www.comp.nus.edu.sg/~nlp/conll14st/conll14st-test-data.tar.gz) is the most widely used dataset to benchmark GEC systems. The test set contains 1,312 English sentences with error annotations by 2 expert annotators. Models are evaluated with the MaxMatch scorer ([Dahlmeier and Ng, 2012](http://www.aclweb.org/anthology/N12-1067)), which computes a phrase-level F<sub>β</sub>-score with β=0.5, weighting precision twice as much as recall.
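
As a rough illustration of the F<sub>β</sub> formula only, the sketch below computes the score from counts of matched, spurious, and missed edits. It is not the MaxMatch scorer itself, which additionally searches for the system edit sequence that best matches the gold annotations. The function name `f_beta`, the example counts, and the choice to treat an empty denominator as 1.0 are assumptions made for this sketch.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F_beta from edit counts (hypothetical helper, not part of any scorer).

    tp: system edits that match a gold edit
    fp: system edits with no matching gold edit
    fn: gold edits the system missed
    """
    # Assumption for this sketch: an empty denominator yields 1.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up counts: precision = 0.75, recall = 0.50.
print(round(f_beta(30, 10, 30), 3))            # 0.682 (beta = 0.5 favours precision)
print(round(f_beta(30, 10, 30, beta=1.0), 3))  # 0.6   (balanced F1)
```

With β=0.5 the score sits closer to precision than the balanced F1 does, which is why a system that proposes fewer but more reliable corrections ranks higher under this metric.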

-The shared task setting restricts systems to publicly available training datasets so that comparisons are fairer. The best published results on the CoNLL-2014 test set are given below. A distinction is made between papers that report results in the restricted CoNLL-2014 shared task setting using publicly available training datasets (_Restricted_) and those that made use of large non-public datasets (_Unrestricted_).
+The shared task setting restricts systems to publicly available training datasets so that comparisons are fairer. The best published results on the CoNLL-2014 test set are given below. A distinction is made between papers that report results in the restricted CoNLL-2014 shared task setting using publicly available training datasets (_**Restricted**_) and those that made use of large non-public datasets (_**Unrestricted**_).
| CNN Seq2Seq (Chollampatt and Ng, 2018)| 54.79 |[A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)|[Official](https://github.com/nusnlp/mlconvgec2018)|
|_**Unrestricted**_|
-| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
+| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/pdf/1807.01270.pdf)| NA |
+
+_**Restricted**_: uses only publicly available datasets. _**Unrestricted**_: uses non-public datasets.
+

### CoNLL-2014 10 Annotations
@@ -32,22 +35,27 @@ The shared task setting restricts that systems use only publicly available datas
| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
|_**Restricted**_|
-| SMT + BiGRU (Grundkiewicz et al., 2018) | 72.04 |[Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945)| NA |
-| CNN Seq2Seq (Chollampatt & Ng, 2018)| 70.14 (measured by Ge et al., 2018) |[ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831)|[Official](https://github.com/nusnlp/mlconvgec2018)|
+| SMT + BiGRU (Grundkiewicz and Junczys-Dowmunt, 2018) | 72.04 |[Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](http://aclweb.org/anthology/N18-2046)| NA |
+| CNN Seq2Seq (Chollampatt and Ng, 2018)| 70.14 (measured by Ge et al., 2018) |[ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)|[Official](https://github.com/nusnlp/mlconvgec2018)|
|_**Unrestricted**_|
-| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
+| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/pdf/1807.01270.pdf)| NA |
+
+_**Restricted**_: uses only publicly available datasets. _**Unrestricted**_: uses non-public datasets.

### JFLEG

-[JFLEG corpus](https://github.com/keisks/jfleg) by [Napoles et al., 2017](https://arxiv.org/abs/1702.04066) consists of 1,511 English sentences with annotations. Models are evaluated with the [GLEU metric](https://arxiv.org/abs/1609.08144).
+[JFLEG test set](https://github.com/keisks/jfleg) released by [Napoles et al., 2017](http://aclweb.org/anthology/E17-2037) consists of 747 English sentences with 4 references for each sentence. Models are evaluated with the [GLEU](https://github.com/cnap/gec-ranking/) metric ([Napoles et al., 2016](https://arxiv.org/pdf/1605.02592.pdf)).

| Model | GLEU | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
|_**Restricted**_|
-| SMT + BiGRU (Grundkiewicz et al., 2018) | 61.50 |[Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945)| NA |
+| SMT + BiGRU (Grundkiewicz and Junczys-Dowmunt, 2018) | 61.50 |[Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](http://aclweb.org/anthology/N18-2046)| NA |
| Transformer (Junczys-Dowmunt et al., 2018) | 59.9 |[Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task](http://aclweb.org/anthology/N18-1055)| NA |
-| CNN Seq2Seq (Chollampatt & Ng, 2018)| 57.47 |[ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831)|[Official](https://github.com/nusnlp/mlconvgec2018)|
+| CNN Seq2Seq (Chollampatt and Ng, 2018)| 57.47 |[ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)|[Official](https://github.com/nusnlp/mlconvgec2018)|
|_**Unrestricted**_|
-| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
+| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 |[Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/pdf/1807.01270.pdf)| NA |
+
+_**Restricted**_: uses only publicly available datasets. _**Unrestricted**_: uses non-public datasets.