README.md: 13 additions & 7 deletions
@@ -98,7 +98,8 @@ Throughout the entire training process, we did not experience any irrecoverable
 </div>

-**NOTE: The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.**
+> [!NOTE]
+> The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.

 To ensure optimal performance and flexibility, we have partnered with open-source communities and hardware vendors to provide multiple ways to run the model locally. For step-by-step guidance, check out Section 6: [How to Run Locally](#6-how-to-run-locally).
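The hunk above rewrites a bold `**NOTE: ...**` paragraph as a GitHub-flavored Markdown `> [!NOTE]` alert. As a minimal sketch of scripting that conversion (the function name and regex are illustrative, not part of this PR):

```python
import re

def bold_note_to_alert(line: str) -> str:
    """Rewrite a '**NOTE: ...**' line as a GitHub '> [!NOTE]' alert."""
    m = re.fullmatch(r"\*\*NOTE:\s*(.+?)\*\*\s*", line)
    if m is None:
        return line  # not a bold NOTE line; leave it untouched
    return "> [!NOTE]\n> " + m.group(1)

print(bold_note_to_alert("**NOTE: The total size of DeepSeek-V3 models on Hugging Face is 685B.**"))
```

GitHub renders `[!NOTE]` (along with `[!TIP]`, `[!WARNING]`, and similar markers) at the start of a blockquote as a styled alert box, which is why the bold markup can be dropped.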
@@ -151,8 +152,9 @@ For developers looking to dive deeper, we recommend exploring [README_WEIGHTS.md
 </div>

-Note: Best results are shown in bold. Scores with a gap not exceeding 0.3 are considered to be at the same level. DeepSeek-V3 achieves the best performance on most benchmarks, especially on math and code tasks.
-For more evaluation details, please check our paper.
+> [!NOTE]
+> Best results are shown in bold. Scores with a gap not exceeding 0.3 are considered to be at the same level. DeepSeek-V3 achieves the best performance on most benchmarks, especially on math and code tasks.
+> For more evaluation details, please check our paper.

 #### Context Window
 <p align="center">
@@ -193,10 +195,11 @@ Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V3 pe
-Note: All models are evaluated in a configuration that limits the output length to 8K. Benchmarks containing fewer than 1000 samples are tested multiple times using varying temperature settings to derive robust final results. DeepSeek-V3 stands as the best-performing open-source model, and also exhibits competitive performance against frontier closed-source models.

 </div>

+> [!NOTE]
+> All models are evaluated in a configuration that limits the output length to 8K. Benchmarks containing fewer than 1000 samples are tested multiple times using varying temperature settings to derive robust final results. DeepSeek-V3 stands as the best-performing open-source model, and also exhibits competitive performance against frontier closed-source models.
+

 #### Open Ended Generation Evaluation
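The benchmark note above treats scores whose gap does not exceed 0.3 as the same level, so more than one entry per benchmark can be bolded as "best". A minimal sketch of that tie rule (the helper name and sample scores are illustrative, not from the repository):

```python
def best_tier(scores: dict[str, float], gap: float = 0.3) -> set[str]:
    """Return every model within `gap` points of the top score (the bolded tier)."""
    top = max(scores.values())
    return {model for model, score in scores.items() if top - score <= gap}

# 85.5 and 85.2 differ by exactly 0.3, so both land in the best tier; 80.5 does not.
print(best_tier({"model-a": 85.5, "model-b": 85.2, "model-c": 80.5}))
```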
@@ -213,9 +216,11 @@ Note: All models are evaluated in a configuration that limits the output length
 | Claude-Sonnet-3.5-1022 | 85.2 | 52.0 |
 | DeepSeek-V3 | **85.5** | **70.0** |

-Note: English open-ended conversation evaluations. For AlpacaEval 2.0, we use the length-controlled win rate as the metric.
 </div>

+> [!NOTE]
+> English open-ended conversation evaluations. For AlpacaEval 2.0, we use the length-controlled win rate as the metric.
+

 ## 5. Chat Website & API Platform

 You can chat with DeepSeek-V3 on DeepSeek's official website: [chat.deepseek.com](https://chat.deepseek.com/sign_in)