Previously worked at ColossalAI, ByteDance, and Temu. Master's degree from NUS, bachelor's degree from SYSU.
Focused on AI infrastructure.
colossalai
- Singapore
ColossalAI Public
Forked from hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
-
Pytorch-profile Public
Uses the PyTorch profiler API to analyze training in more detail, such as heap and stack information and time consumption.
-
BandWidth_Test Public
Tests the GPU bandwidth of collective primitives such as all-reduce, all-gather, broadcast, and all-to-all on single-node multi-GPU (2, 4, 8 cards) and multi-node multi-GPU (16 cards) setups,…
-
Finetune_llama2_Megatron Public
Uses Megatron-style tensor parallelism (TP) for training.
-
Finetune_llama2 Public
Build a Llama fine-tuning script from scratch using the PyTorch and transformers APIs. It needs to support four optional features: gradient checkpointing, mixed precision, data parallelism, tensor parallel…



