Insights: pytorch/torchtitan
Overview
8 Pull requests merged by 7 people
-
Refactor Tokenizer -> BaseTokenizer
#1333 merged
Jul 10, 2025 -
Migrating flux checkpointing to hf api
#1377 merged
Jul 10, 2025 -
[SimpleFSDP] Add support for hsdp+tp
#1343 merged
Jul 10, 2025 -
basic validator implementation
#1362 merged
Jul 10, 2025 -
[float8 moe] Support TP
#1375 merged
Jul 9, 2025 -
call init_weights before generation
#1371 merged
Jul 8, 2025 -
fix: correct CONFIG_FILE path in multinode_trainer script
#1374 merged
Jul 8, 2025 -
dp2ep Expert Parallel
#1324 merged
Jul 8, 2025
4 Pull requests opened by 4 people
-
Add option to exclude low flop mms from every-other-mm sac policy
#1372 opened
Jul 8, 2025 -
[DSV3] Adding deepseek-v3 model into torchtitan
#1373 opened
Jul 8, 2025 -
[llama3] add configurations for Llama 3 1B and 3B models
#1376 opened
Jul 9, 2025 -
add float8 support
#1378 opened
Jul 10, 2025
2 Issues closed by 1 person
-
[Evaluation] Minimal support for downstream tasks
#883 closed
Jul 10, 2025 -
[Bug] Potential bugs in "_grouped_mm" in Llama4 MoE codes
#1237 closed
Jul 8, 2025
3 Issues opened by 3 people
-
Llama4 conversion scripts error on aux-loss-free params
#1379 opened
Jul 10, 2025 -
Puzzling collectives in TP ( SP to be exact)
#1369 opened
Jul 7, 2025 -
Support more models such as Qwen
#1368 opened
Jul 4, 2025
14 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[DSV3] Add PP support for DSV3
#1345 commented on
Jul 10, 2025 • 8 new comments -
Add AMD GPU node for integration test
#1241 commented on
Jul 10, 2025 • 7 new comments -
OOM recovery under multi-node FSDP/HSDP
#1329 commented on
Jul 7, 2025 • 0 new comments -
DeepSeek V3 Support
#760 commented on
Jul 8, 2025 • 0 new comments -
Llama 4 issue tracking
#1118 commented on
Jul 8, 2025 • 0 new comments -
Data loader's state_dict is being lost between being saved and loaded when the dataset loops
#1357 commented on
Jul 9, 2025 • 0 new comments -
How to adapt HuggingFace or other models for TorchTitan
#1322 commented on
Jul 10, 2025 • 0 new comments -
Issue reproducing Float8 performance benchmark
#1344 commented on
Jul 10, 2025 • 0 new comments -
[RFC] validation and evaluation in torchtitan
#1210 commented on
Jul 10, 2025 • 0 new comments -
compile: turn off fullgraph=True to support llama4
#1182 commented on
Jul 8, 2025 • 0 new comments -
Enable ROCm CI support.
#1260 commented on
Jul 9, 2025 • 0 new comments -
Add support for saving HF format tensors with DCP
#1351 commented on
Jul 11, 2025 • 0 new comments -
[benchmark] add h200 bench
#1361 commented on
Jul 7, 2025 • 0 new comments -
[WIP] Compile for dp2ep
#1365 commented on
Jul 10, 2025 • 0 new comments