Commit 67a4537

Update multilingual.md
Correct Wikipedia size correlation comment.
1 parent 0fce551 commit 67a4537

1 file changed: +4 −6 lines changed


multilingual.md

Lines changed: 4 additions & 6 deletions
@@ -69,7 +69,7 @@ Note that the English result is worse than the 84.2 MultiNLI baseline because
 this training used Multilingual BERT rather than English-only BERT. This implies
 that for high-resource languages, the Multilingual model is somewhat worse than
 a single-language model. However, it is not feasible for us to train and
-maintain dozens of single-language model. Therefore, if your goal is to maximize
+maintain dozens of single-language models. Therefore, if your goal is to maximize
 performance with a language other than English or Chinese, you might find it
 beneficial to run pre-training for additional steps starting from our
 Multilingual model on data from your language of interest.
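
The advice in this hunk — running additional pre-training steps starting from the released Multilingual checkpoint — is carried out with this repo's run_pretraining.py script. The snippet below is a minimal sketch of such an invocation, wrapped in Python; the checkpoint directory, the TFRecord file name, and the batch/step settings are placeholders rather than values recommended by the authors.

```python
import subprocess

# Placeholder paths -- point these at the released Multilingual checkpoint
# and at pre-training data generated beforehand with create_pretraining_data.py.
MULTILINGUAL_DIR = "multilingual_L-12_H-768_A-12"  # hypothetical local path
PRETRAIN_DATA = "my_language_pretrain.tfrecord"    # hypothetical data file

# Run additional pre-training steps starting from the Multilingual
# checkpoint instead of from a random initialization.
subprocess.run(
    [
        "python", "run_pretraining.py",
        "--input_file=" + PRETRAIN_DATA,
        "--output_dir=pretraining_output",
        "--do_train=True",
        "--bert_config_file=" + MULTILINGUAL_DIR + "/bert_config.json",
        "--init_checkpoint=" + MULTILINGUAL_DIR + "/bert_model.ckpt",
        "--train_batch_size=32",
        "--max_seq_length=128",
        "--max_predictions_per_seq=20",
        "--num_train_steps=100000",  # the "additional steps"; tune to your corpus
        "--learning_rate=2e-5",
    ],
    check=True,
)
```
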
@@ -152,11 +152,9 @@ taken as the training data for each language
 However, the size of the Wikipedia for a given language varies greatly, and
 therefore low-resource languages may be "under-represented" in terms of the
 neural network model (under the assumption that languages are "competing" for
-limited model capacity to some extent).
-
-However, the size of a Wikipedia also correlates with the number of speakers of
-a language, and we also don't want to overfit the model by performing thousands
-of epochs over a tiny Wikipedia for a particular language.
+limited model capacity to some extent). At the same time, we also don't want
+to overfit the model by performing thousands of epochs over a tiny Wikipedia
+for a particular language.
 
 To balance these two factors, we performed exponentially smoothed weighting of
 the data during pre-training data creation (and WordPiece vocab creation). In
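
The truncated last line of this hunk refers to the exponentially smoothed weighting scheme described further down in multilingual.md: each language's sampling probability is taken proportional to its Wikipedia size, raised to an exponent S < 1, and re-normalized, so that high-resource languages are under-sampled and low-resource languages are over-sampled. The sketch below is a minimal illustration of that idea; the corpus sizes are made up, and the default exponent of 0.7 is the value mentioned elsewhere in multilingual.md, not something specific to this commit.

```python
def smoothed_sampling_probs(wiki_sizes, s=0.7):
    """Exponentially smoothed sampling probabilities over languages.

    wiki_sizes: dict mapping language -> corpus size (e.g. number of
    Wikipedia articles or tokens). With s < 1 the distribution is
    flattened: high-resource languages are under-sampled relative to
    their raw share, low-resource languages are over-sampled.
    """
    total = sum(wiki_sizes.values())
    # Raw probabilities proportional to corpus size.
    raw = {lang: size / total for lang, size in wiki_sizes.items()}
    # Exponentiate by s, then re-normalize so probabilities sum to 1.
    powered = {lang: p ** s for lang, p in raw.items()}
    z = sum(powered.values())
    return {lang: p / z for lang, p in powered.items()}


# Toy example: one high-resource and one low-resource language.
probs = smoothed_sampling_probs({"en": 1_000_000, "is": 10_000})
print(probs)  # "en" still dominates, but by roughly 25:1 instead of 100:1
```
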

0 commit comments
