
Commit 7aff172

update README
1 parent 94527a0 commit 7aff172

1 file changed: +13 -0 lines changed

torchtitan/models/deepseek_v3/README.md

Lines changed: 13 additions & 0 deletions
@@ -33,3 +33,16 @@ CONFIG_FILE="./torchtitan/models/deepseek_v3/train_configs/deepseek_v3_16b.toml"
 - Activation checkpointing
 - Tensor Parallel (TP)
 - Expert Parallel (EP)
+
+
+## To be added
+- Modeling
+    - Merge DeepSeek-V3 and Llama4 MoE common components
+- Parallelism
+    - Context Parallel support for DeepSeek-V3
+    - PP support for DeepSeek-V3
+- torch.compile
+- Quantization
+- Testing
+    - Performance and loss convergence tests
+    - CI integration
