Merge pull request #28 from ryskina/master

Huang17 · web-flow · commit cddb958ad15b · 2020-07-27T18:01:38.000-07:00
Updated coreference resolution and word segmentation
diff --git a/docs/co-reference_resolution.md b/docs/co-reference_resolution.md
@@ -24,13 +24,12 @@ Average of F1-scores returned by these three precision/recall metrics:
 - B-cubed.  
 - Entity-based CEAF.  
 
-
 ## <span class="t">CoNLL 2012 Co-reference task</span>.
 
 CoNLL 2012 introduced a co-reference task in Chinese.
 - http://conll.cemantix.org/2012/introduction.html 
 
-Data for this evaluation is part of Ontonotes, distributed by the Linguistic Data Consortium (LDC).
+Data for this evaluation is part of OntoNotes, distributed by the Linguistic Data Consortium (LDC).
 - https://catalog.ldc.upenn.edu/LDC2013T19 
 
 |  Test set | # of co-referring mentions | Genre |
@@ -47,16 +46,44 @@ Scoring code: https://github.com/conll/reference-coreference-scorers
 
 |  System | Average F1 of MUC, B-cubed, CEAF |
 | --- | --- |
-|  [[Clark & Manning, 2016](https://nlp.stanford.edu/static/pubs/clark2016deep.pdf)] | 63.88 |
-|  [[Clark & Manning, 2016](https://nlp.stanford.edu/static/pubs/clark2016improving.pdf)] | 63.66 |
+|  [Kong & Jian (2019)](https://www.ijcai.org/Proceedings/2019/700) | 63.85 |
+|  [Clark & Manning (2016b)](https://nlp.stanford.edu/static/pubs/clark2016deep.pdf) | 63.88 |
+|  [Clark & Manning (2016a)](https://nlp.stanford.edu/static/pubs/clark2016improving.pdf) | 63.66 |
 
 ### Resources
 
-Data for this evaluation is part of Ontonotes, distributed by the Linguistic Data Consortium (LDC).
+Data for this evaluation is part of OntoNotes, distributed by the Linguistic Data Consortium (LDC).
 - https://catalog.ldc.upenn.edu/LDC2013T19 
 
 ---
 
+## <span class="t">Subtask: zero pronoun resolution (CoNLL 2012 / OntoNotes 5.0)</span>.
+
+### Metrics
+
+F1 score computed on resolution hits ([Zhao & Ng 2007](https://www.aclweb.org/anthology/D07-1057.pdf)).
+
+### Results
+
+|  System | Overall F1 (w/ gold syntactic info) | Overall F1 (w/o gold syntactic info) |
+| --- | --- | --- |
+|  [Aloraini & Poesio (2020)](https://www.aclweb.org/anthology/2020.lrec-1.11/) | 63.5 | |
+|  [Song et al. (2020)](https://www.aclweb.org/anthology/2020.acl-main.482/) | 58.5 | 26.1 |
+|  [Yang et al. (2019)](https://www.aclweb.org/anthology/W19-4108/) | 58.1 | |
+|  [Yin et al. (2018)](https://www.aclweb.org/anthology/C18-1002/) | 57.3 | |
+|  [Liu et al. (2017)](https://www.aclweb.org/anthology/P17-1010/) | 55.3 | |
+|  [Yin et al. (2017)](https://www.aclweb.org/anthology/D17-1135/) | 54.9 | 22.7 |
+
+### Resources
+
+Training and testing are performed on the train and dev splits of OntoNotes 5.0, respectively (statistics reported by [Yin et al. (2018)](https://www.aclweb.org/anthology/C18-1002/)).
+
+| Split | Documents | Sentences | Words | Anaphoric Zero Pronouns | 
+| --- | --- | --- | --- | --- |
+|  Train | 1,391 | 36,487 | 756K | 12,111 |
+|  Dev | 172 | 6,083 | 110K | 1,713 |
+
+
 **Suggestions? Changes? Please send email to [chinesenlp.xyz@gmail.com](mailto:chinesenlp.xyz@gmail.com)**
 
 
diff --git a/docs/word_segmentation.md b/docs/word_segmentation.md
@@ -52,11 +52,14 @@ F1 = 0.857
 
 |  Model | AS | CITYU | MSR | PKU |
 | --- | --- | --- | --- | --- |
-|  [Meng et al. (2019)](https://arxiv.org/pdf/1901.10125.pdf) | 96.7 | 97.9 | 98.3 | 96.7 |
-|[Huang et al. (2019)](https://arxiv.org/pdf/1903.04190.pdf)|96.6|97.6|97.9|96.6|
+|  [Tian, Song, Xia, Zhang, Wang (2020)](https://www.aclweb.org/anthology/2020.acl-main.734/) | 96.6 | 97.9 | 98.4 | 96.5 |
+|  [Meng et al. (2019)](https://arxiv.org/abs/1901.10125) | 96.7<sup>*</sup> | 97.9<sup>*</sup> | 98.3 | 96.7 |
+|  [Huang et al. (2019)](https://arxiv.org/abs/1903.04190)| 96.6 | 97.6 | 97.9 | 96.6 |
 |  [Ma et al. (2018)](http://aclweb.org/anthology/D18-1529) | 96.2 | 97.2 | 97.4 | 96.1 |
 |  [Yang et al. (2017)](http://aclweb.org/anthology/P17-1078) | 95.7 | 96.9 | 97.5 | 96.3 |
-|  [Zhou et al. (2017)](https://www.aclweb.org/anthology/D17-1079) |  |  | 97.8 | 96 |
+|  [Zhou et al. (2017)](https://www.aclweb.org/anthology/D17-1079) |  |  | 97.8 | 96.0 |
+
+<sup>*</sup> Unlike the other systems, [Meng et al. (2019)](https://arxiv.org/abs/1901.10125) do not report converting traditional Chinese to simplified Chinese.
 
 ### Resources
 
@@ -70,36 +73,40 @@ F1 = 0.857
 
 ## <span class="t">Chinese Penn Treebank</span>.
 
-* [Website](https://verbs.colorado.edu/chinese/ctb.html)
-* Includes 2 datasets:
+* [Website](https://www.cs.brandeis.edu/~clp/ctb/)
+* Includes 3 datasets:
   * [CTB6](https://catalog.ldc.upenn.edu/LDC2007T36): consisting of 780,000 words (over 1.28 million Chinese characters)
   * [CTB7](https://catalog.ldc.upenn.edu/LDC2010T07): consists of 2,448 text files, 51,447 sentences, 1,196,329 words and 1,931,381 hanzi (Chinese characters)
-
+  * [CTB9](https://catalog.ldc.upenn.edu/LDC2016T13): consists of 3,726 text files, 132,076 sentences, 2,084,387 words, 3,247,331 characters (hanzi or foreign)
 
 |Data set|Test set (Tokens)|
 | ---: | ---: |
-|CTB6|81,578|
-|CTB7|81,578|
+|CTB6|82K|
+|CTB7|245K|
+|CTB9|242K|
 
 ### Results
 
-|  Model | CTB6 | CBT7 |
-| --- | --- | --- |
-|[Huang et al. (2019)](https://arxiv.org/pdf/1903.04190.pdf)|97.6|96.6|
-| [Meng et al. (2019)](https://arxiv.org/pdf/1901.10125.pdf) | 96.6 |  |
-| [Ma et al. (2018)](http://aclweb.org/anthology/D18-1529) | 96.7 | 96.6 |
-| [Yang et al. (2017)](http://aclweb.org/anthology/P17-1078) | 96.2 |  |
-| [Zhou et al. (2017)](https://www.aclweb.org/anthology/D17-1079) | 96.2 |  |
-
+|  Model | CTB6 | CTB7 | CTB9 |
+| --- | --- | --- | --- |
+| [Tian, Song, Ao, Xia, Quan, Zhang, Wang (2020)](https://www.aclweb.org/anthology/2020.acl-main.735/) | 97.5 | 97.3 | 97.8 |
+| [Tian, Song, Xia, Zhang, Wang (2020)](https://www.aclweb.org/anthology/2020.acl-main.734/) | 97.3 | | |
+| [Yan et al. (2020)](https://transacl.org/ojs/index.php/tacl/article/view/1876) | | 97.1| 97.6 |
+| [Huang et al. (2019)](https://arxiv.org/abs/1903.04190)|97.6| | |
+| [Ma et al. (2018)](http://aclweb.org/anthology/D18-1529) | 96.7 | 96.6<sup>**</sup> | |
+| [Yang et al. (2017)](http://aclweb.org/anthology/P17-1078) | 96.2 |  | |
+| [Zhou et al. (2017)](https://www.aclweb.org/anthology/D17-1079) | 96.2 | | |
 
+<sup>**</sup> [Ma et al. (2018)](http://aclweb.org/anthology/D18-1529) report different statistics for their CTB7 split (950K/60K/82K), so the results might not be comparable.
 
 
 ### Resources
 
-|  Train set | Training Size(Words) |
+|  Train set | Training Size (Words) |
 | --- | ----: |
-|  CTB6 | 641,368 |
-|  CTB7 | 950,138 |
+|  CTB6 | 641K |
+|  CTB7 | 718K |
+|  CTB9 | 1,696K |
 
 
 ## <span class="t">Chinese Universal Treebank (UD)</span>.
@@ -114,7 +121,8 @@ F1 = 0.857
 
 |  Model | UD |
 | --- | --- | 
-| [Huang et al. (2019)](https://arxiv.org/pdf/1903.04190.pdf)|97.3 |
+| [Tian, Song, Ao, Xia, Quan, Zhang, Wang (2020)](https://www.aclweb.org/anthology/2020.acl-main.735/) | 98.3 |
+| [Huang et al. (2019)](https://arxiv.org/abs/1903.04190)|97.3 |
 | [Ma et al. (2018)](http://aclweb.org/anthology/D18-1529) | 96.9 |
 
 ### Resources
@@ -137,7 +145,6 @@ F1 = 0.857
 
 |  Model | Weibo |
 | --- | --- | 
-| [Meng et al. (2019)](https://arxiv.org/pdf/1901.10125.pdf) | 96.0 |  
 | [Yang et al. (2017)](http://aclweb.org/anthology/P17-1078) | 95.5 |