Pull requests: NVIDIA/Megatron-LM
Fix bare except clause in timers.py
bug
Something isn't working
#1857
opened Oct 12, 2025 by
vediyappanm
7 tasks done
Update model_parallel_config.py
bug
Something isn't working
#1832
opened Sep 28, 2025 by
skirdey-inflection
remove comments that is not correct anymore
module: documentation
module: transformer engine
#1812
opened Sep 18, 2025 by
cyr0930
Fix _set_wandb_writer serialization issues
bug
Something isn't working
module: debugging
#1806
opened Sep 11, 2025 by
gakkiri
5 of 8 tasks
Add support for packing-with-padding in gpt-dataset
enhancement
New feature or request
module: data pipeline
#1788
opened Sep 3, 2025 by
terminator123
Add falcon h1 2
enhancement
New feature or request
#1785
opened Sep 2, 2025 by
dhiaEddineRhaiem
bugfix: raise error if eos_token is not set in tokenizer
bug
Something isn't working
module: data pipeline
#1774
opened Aug 27, 2025 by
imomayiz
Fix Context Parallel NaN Loss
bug
Something isn't working
#1765
opened Aug 21, 2025 by
leoleoasd
perf(MoE): Use TE quant/dequant for SwiGLU fp8 input store to improve performance and stability
enhancement
New feature or request
module: transformer engine
#1753
opened Aug 19, 2025 by
xiaoxi-wangfj
[main][feature][under updating]zero-overhead activation offload
enhancement
New feature or request
#1752
opened Aug 18, 2025 by
GeYuhong
fix: Initialize master_weight with params_dtype directly
bug
Something isn't working
#1748
opened Aug 15, 2025 by
Mirza-Samad-Ahmed-Baig