Commit e56c1ae ("add:models")
1 parent ea08eac

3 files changed: +31, -15 lines changed

README.md

Lines changed: 31 additions & 15 deletions
@@ -53,7 +53,7 @@
 
 ## Timeline of LLMs
 
-![LLMs_timeline](assets/LLMs-0412.png)
+![LLMs_timeline](assets/LLMs-0418-final.png)
 
 ## List of LLMs
 
@@ -71,7 +71,7 @@
 </thead>
 <tbody>
 <tr>
-<td class="tg-nrix" align="center" rowspan="22">Publicly <br>Accessbile</td>
+<td class="tg-nrix" align="center" rowspan="25">Publicly <br>Accessbile</td>
 <td class="tg-baqh" align="center">T5</td>
 <td class="tg-0lax" align="center">2019/10</td>
 <td class="tg-baqh" align="center">11</td>
@@ -131,6 +131,12 @@
 <td class="tg-baqh" align="center">175</td>
 <td class="tg-0lax" align="center"><a href="https://arxiv.org/abs/2205.01068">Paper</a></td>
 </tr>
+<tr>
+<td class="tg-baqh" align="center">YaLM</td>
+<td class="tg-0lax" align="center">2022/06</td>
+<td class="tg-baqh" align="center">100</td>
+<td class="tg-0lax" align="center"><a href="https://github.com/yandex/YaLM-100B">Github</a></td>
+</tr>
 <tr>
 <td class="tg-baqh" align="center">NLLB</td>
 <td class="tg-0lax" align="center">2022/07</td>
@@ -197,6 +203,18 @@
 <td class="tg-baqh" align="center">13</td>
 <td class="tg-0lax" align="center"><a href="https://vicuna.lmsys.org/">Blog</a></td>
 </tr>
+<tr>
+<td class="tg-baqh" align="center">ChatGLM</td>
+<td class="tg-0lax" align="center">2023/03</td>
+<td class="tg-baqh" align="center">6</td>
+<td class="tg-0lax" align="center"><a href="https://github.com/THUDM/ChatGLM-6B">Github</a></td>
+</tr>
+<tr>
+<td class="tg-baqh" align="center">CodeGeeX</td>
+<td class="tg-0lax" align="center">2023/03</td>
+<td class="tg-baqh" align="center">13</td>
+<td class="tg-0lax" align="center"><a href="https://arxiv.org/abs/2303.17568">Paper</a></td>
+</tr>
 <tr>
 <td class="tg-baqh" align="center">Koala</td>
 <td class="tg-0lax" align="center">2023/04</td>
@@ -323,12 +341,6 @@
 <td class="tg-baqh" align="center">54</td>
 <td class="tg-0lax" align="center"><a href="https://cohere.ai/">Homepage</a></td>
 </tr>
-<tr>
-<td class="tg-baqh" align="center">YaLM</td>
-<td class="tg-0lax" align="center">2022/06</td>
-<td class="tg-baqh" align="center">100</td>
-<td class="tg-0lax" align="center"><a href="https://github.com/yandex/YaLM-100B">Github</a></td>
-</tr>
 <tr>
 <td class="tg-baqh" align="center">AlexaTM</td>
 <td class="tg-0lax" align="center">2022/08</td>
@@ -395,6 +407,7 @@
 
 
 
+
 ## Resources of LLMs
 
 ### Publicly Available Models
@@ -412,12 +425,13 @@
 11. <u>NLLB</u>: **"No Language Left Behind: Scaling Human-Centered Machine Translation"**. *NLLB Team.* arXiv 2022. [[Paper](https://arxiv.org/abs/2207.04672)] [[Checkpoint](https://github.com/facebookresearch/fairseq/tree/nllb)]
 12. <u>BLOOM</u>: **"BLOOM: A 176B-Parameter Open-Access Multilingual Language Model"**. *BigScience Workshop*. arXiv 2022. [[Paper](https://arxiv.org/abs/2211.05100)] [[Checkpoint](https://huggingface.co/bigscience/bloom)]
 13. <u>GLM</u>: **"GLM-130B: An Open Bilingual Pre-trained Model"**. *Aohan Zeng et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2210.02414)] [[Checkpoint](https://github.com/THUDM/GLM-130B)]
-13. <u>Flan-T5</u>: **"Scaling Instruction-Finetuned Language Models"**. *Hyung Won Chung et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2210.11416)] [[Checkpoint](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)]
-14. <u>mT0 && BLOOMZ</u>: **"Crosslingual Generalization through Multitask Finetuning"**. *Niklas Muennighoff et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2211.01786)] [[Checkpoint](https://github.com/bigscience-workshop/xmtf)]
-15. <u>Galactica</u>: **"Galactica: A Large Language Model for Science"**. *Ross Taylor et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2211.09085)] [[Checkpoint](https://huggingface.co/facebook/galactica-120b)]
-16. <u>OPT-IML</u>: **"OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization"**. *Srinivasan et al.* . arXiv 2022. [[Paper](https://arxiv.org/abs/2212.12017)] [[Checkpoint](https://huggingface.co/facebook/opt-iml-30b)]
-17. <u>Pythia</u>: **"Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling"**. *Stella Biderman et al.* . arXiv 2023. [[Paper](https://arxiv.org/abs/2304.01373)] [[Checkpoint](https://github.com/EleutherAI/pythia)]
-17. <u>LLaMA</u>: **"LLaMA: Open and Efficient Foundation Language Models"**. *Hugo Touvron et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2302.13971v1)] [[Checkpoint](https://github.com/facebookresearch/llama)]
+14. <u>Flan-T5</u>: **"Scaling Instruction-Finetuned Language Models"**. *Hyung Won Chung et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2210.11416)] [[Checkpoint](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)]
+15. <u>mT0 && BLOOMZ</u>: **"Crosslingual Generalization through Multitask Finetuning"**. *Niklas Muennighoff et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2211.01786)] [[Checkpoint](https://github.com/bigscience-workshop/xmtf)]
+16. <u>Galactica</u>: **"Galactica: A Large Language Model for Science"**. *Ross Taylor et al.* arXiv 2022. [[Paper](https://arxiv.org/abs/2211.09085)] [[Checkpoint](https://huggingface.co/facebook/galactica-120b)]
+17. <u>OPT-IML</u>: **"OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization"**. *Srinivasan et al.* . arXiv 2022. [[Paper](https://arxiv.org/abs/2212.12017)] [[Checkpoint](https://huggingface.co/facebook/opt-iml-30b)]
+18. <u>CodeGeeX</u>: **"CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X"**. *Qinkai Zheng et al.* . arXiv 2023. [[Paper](https://arxiv.org/abs/2303.17568)] [[Checkpoint](https://github.com/THUDM/CodeGeeX)]
+19. <u>Pythia</u>: **"Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling"**. *Stella Biderman et al.* . arXiv 2023. [[Paper](https://arxiv.org/abs/2304.01373)] [[Checkpoint](https://github.com/EleutherAI/pythia)]
+20. <u>LLaMA</u>: **"LLaMA: Open and Efficient Foundation Language Models"**. *Hugo Touvron et al.* arXiv 2023. [[Paper](https://arxiv.org/abs/2302.13971v1)] [[Checkpoint](https://github.com/facebookresearch/llama)]
 
 ### Closed-source Models
 
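As a usage note for the checkpoint list diffed above: most of the publicly available models there can be loaded through the Hugging Face `transformers` library. The sketch below is a minimal illustration and is not part of this commit; the model ID `google/flan-t5-small` (a small variant of the Flan-T5 family linked above), the prompt, and the generation settings are assumptions chosen so the example can run quickly on a CPU.

```python
# Illustrative sketch only (not part of the diff above): loading one of the
# publicly available checkpoints from the "Publicly Available Models" list
# with the Hugging Face transformers library.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: a small Flan-T5 variant, chosen so the example runs locally;
# the larger checkpoints linked above follow the same loading pattern.
model_id = "google/flan-t5-small"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Flan-T5 is instruction-tuned, so a plain natural-language prompt works.
prompt = "Translate English to German: The survey lists many large language models."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Decoder-only checkpoints in the same list, such as OPT or BLOOM, follow the same pattern with `AutoModelForCausalLM` in place of the seq2seq class.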
@@ -786,5 +800,7 @@ The authors would like to thank Yankai Lin and Yutao Zhu for proofreading this p
 | V1 | 2023/03/31 | The initial version. |
 | V2 | 2023/04/09 | Add the affiliation information.<br/>Revise Figure 1 and Table 1 and clarify the <br/>corresponding selection criterion for LLMs.<br/>Improve the writing.<br/>Correct some minor errors. |
 | V3 | 2023/04/11 | Correct the errors for library resources. |
-| V4 | 2023/04/12 | Revise Figure 1 and Table 1, and clarify the release date of LLMs |
+| V4 | 2023/04/12 | Revise Figure 1 and Table 1 and clarify the release date of LLMs. |
+| V5 | 2023/04/16 | Add a new Section 2.2 about<br/>the technical evolution of GPT-series models. |
+| V6 | 2023/04/18 | Add some new models in<br/>Table 1 and Figure 1. |
 
assets/LLMs-0412.png

-541 KB
Binary file not shown.

assets/LLMs-0418-final.png

728 KB
