Commit f80ca6b

Update grammatical_error_correction.md
corrected the task description, added two settings for fairer comparison of state-of-the-art
1 parent fbe391d commit f80ca6b

1 file changed: +22 -11 lines changed

Lines changed: 22 additions & 11 deletions
@@ -1,41 +1,52 @@
11
# Grammatical Error Correction
22

3-
Grammatical Error Correction (GEC) is the task of correcting grammatical mistakes in a sentence.
3+
Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text, such as spelling, grammatical, and word choice errors.
44

5+
GEC is usually formulated as a sentence-to-sentence correction task. A GEC system takes a potentially erroneous sentence as input and is expected to transform it into its corrected version. See the example given below:
56

6-
| Error | Corrected |
7-
| ------------- | ------------- |
7+
| Input (Erroneous) | Output (Corrected) |
8+
| ------------------------- | ---------------------- |
89
|She see Tom is catched by policeman in park at last night. | She saw Tom caught by a policeman in the park last night.|
910

10-
### CoNLL-2014
11+
### CoNLL-2014 Shared Task
12+
13+
The CoNLL-2014 shared task test set (https://www.comp.nus.edu.sg/~nlp/conll14st/conll14st-test-data.tar.gz) is the most widely used dataset for benchmarking GEC systems. The test set contains 1,312 English sentences with error annotations by 2 expert annotators. Models are evaluated with the MaxMatch scorer ([Dahlmeier and Ng, 2012](http://www.aclweb.org/anthology/N12-1067)), which computes a phrase-level F<sub>β</sub>-score with β=0.5, weighting precision twice as much as recall.
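
To make the effect of β=0.5 concrete, here is a minimal sketch of the final F<sub>β</sub> arithmetic only. It is not the MaxMatch scorer itself, which also has to find the best phrase-level alignment between system edits and annotator edits. The function names `f_beta` and `f_beta_from_counts` are hypothetical, and treating an empty denominator as a score of 1.0 is an assumption of this sketch.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 favours precision; beta = 0.5 is the value used for CoNLL-2014.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_from_counts(matched: int, proposed: int, gold: int,
                       beta: float = 0.5) -> float:
    """Compute F_beta from edit counts.

    matched  = system edits that agree with a gold edit
    proposed = all edits the system made
    gold     = all edits in the reference annotation
    Assumption of this sketch: an empty denominator counts as 1.0.
    """
    precision = matched / proposed if proposed > 0 else 1.0
    recall = matched / gold if gold > 0 else 1.0
    return f_beta(precision, recall, beta)


if __name__ == "__main__":
    # A precise but cautious system scores higher than a less precise one
    # with the same precision/recall values swapped.
    print(f_beta(0.6, 0.3))   # precision-heavy system
    print(f_beta(0.3, 0.6))   # recall-heavy system
```

With β=0.5, a system with precision 0.6 and recall 0.3 scores higher than one with precision 0.3 and recall 0.6, which is why conservative systems that make fewer but more reliable corrections tend to do well on this benchmark.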
14+
15+
The shared task setting restricts systems to publicly available training datasets to ensure fairer comparisons between systems. The best published results on the CoNLL-2014 test set are given below. A distinction is made between papers that report results in the restricted CoNLL-2014 shared task setting, using only publicly available training datasets (_Restricted_), and those that made use of large non-public datasets (_Unrestricted_).
1116

12-
CoNLL-14 benchmark is done on the [test split](https://www.comp.nus.edu.sg/~nlp/conll14st/conll14st-test-data.tar.gz) of [NUS Corpus of Learner English/NUCLE](https://www.comp.nus.edu.sg/~nlp/corpora.html) dataset.
13-
CoNLL-2014 test set contains 1,312 english sentences with grammatical error correction annotations by 2 annotators. Models are evaluated with [F-score](https://en.wikipedia.org/wiki/F1_score) with β=0.5 which weighs precision twice as recall.
1417

1518
| Model | F0.5 | Paper / Source | Code |
1619
| ------------- | :-----:| --- | :-----: |
17-
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
20+
|_**Restricted**_ |
1821
| SMT + BiGRU (Grundkiewicz et al., 2018) | 56.25 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945)| NA |
1922
| Transformer (Junczys-Dowmunt et al., 2018) | 55.8 | [Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task](http://aclweb.org/anthology/N18-1055)| NA |
2023
| CNN Seq2Seq (Chollampatt & Ng, 2018)| 54.79 | [ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831)| [Official](https://github.com/nusnlp/mlconvgec2018) |
24+
|_**Unrestricted**_ |
25+
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 61.34 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
2126

22-
### CoNLL-2014 10 Annotators
27+
### CoNLL-2014 10 Annotations
2328

24-
[Bryant and Ng 2015](https://pdfs.semanticscholar.org/f76f/fd242c3dc88e52d1f427cdd0f5dccd814937.pdf) used 10 annotators to do grammatical error correction on CoNll-14's [1312 sentences](http://www.comp.nus.edu.sg/~nlp/sw/10gec_annotations.zip).
29+
[Bryant and Ng, 2015](https://pdfs.semanticscholar.org/f76f/fd242c3dc88e52d1f427cdd0f5dccd814937.pdf) released 8 additional annotations (in addition to the two official annotations) for the CoNLL-2014 shared task test set (http://www.comp.nus.edu.sg/~nlp/sw/10gec_annotations.zip).
2530

2631
| Model | F0.5 | Paper / Source | Code |
2732
| ------------- | :-----:| --- | :-----: |
28-
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
33+
|_**Restricted**_ |
2934
| SMT + BiGRU (Grundkiewicz et al., 2018) | 72.04 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945)| NA |
3035
| CNN Seq2Seq (Chollampatt & Ng, 2018)| 70.14 (measured by Ge et al., 2018) | [ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831)| [Official](https://github.com/nusnlp/mlconvgec2018) |
36+
|_**Unrestricted**_ |
37+
| CNN Seq2Seq + Fluency Boost (Ge et al., 2018) | 76.88 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
38+
3139

3240
### JFLEG
3341

3442
The [JFLEG corpus](https://github.com/keisks/jfleg) by [Napoles et al., 2017](https://arxiv.org/abs/1702.04066) consists of 1,511 English sentences with annotations. Models are evaluated with the [GLEU metric](https://arxiv.org/abs/1609.08144).
3543

3644
| Model | GLEU | Paper / Source | Code |
3745
| ------------- | :-----:| --- | :-----: |
38-
| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
46+
|_**Restricted**_ |
3947
| SMT + BiGRU (Grundkiewicz et al., 2018) | 61.50 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](https://arxiv.org/abs/1804.05945)| NA |
4048
| Transformer (Junczys-Dowmunt et al., 2018) | 59.9 | [Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task](http://aclweb.org/anthology/N18-1055)| NA |
4149
| CNN Seq2Seq (Chollampatt & Ng, 2018)| 57.47 | [ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://arxiv.org/abs/1801.08831)| [Official](https://github.com/nusnlp/mlconvgec2018) |
50+
|_**Unrestricted**_ |
51+
| CNN Seq2Seq + Fluency Boost and inference (Ge et al., 2018) | 62.37 | [Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study](https://arxiv.org/abs/1807.01270)| NA |
52+
