
I am Weihao Zeng, an incoming PhD student supervised by Prof. Junxian He at the Hong Kong University of Science and Technology starting in the fall of 2025.

My research focuses on the post-training of LLMs, specifically:

  • Improving model reasoning capabilities using reinforcement learning (RL) / self-evolution techniques (SimpleRL, B-STaR)
  • Exploring efficient data engineering methods for post-training (Deita, Auto Evol-Instruct)
  • The application of LLMs in task-oriented dialogue systems (FutureTOD, Seen2UnSeen)

Feel free to email me about any form of academic collaboration: [email protected]

🔥 News

  • 2025-03: We introduce SimpleRL-Zoo, a deep investigation of zero RL training across diverse model families and sizes! SimpleRL-Zoo Twitter
  • 2025-01: Announcing our latest effort on O1/R1-style models and scalable reinforcement learning for LLM reasoning! SimpleRL Twitter
  • 2025-01: 🎉🎉 Our B-STaR has been accepted to ICLR 2025!
  • 2024-09: 🎉🎉 Our Auto Evol-Instruct has been accepted to EMNLP 2024!
  • 2024-01: 🎉🎉 Our Deita has been accepted to ICLR 2024!
  • 2023-05: 🎉🎉 Two papers have been accepted to ACL 2023!

📝 Publications

  1. SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

    Weihao Zeng*, Yuzhen Huang*, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He

    Preprint SimpleRL-Zoo Github

  2. 7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient

    Weihao Zeng*, Yuzhen Huang*, Wei Liu, Keqing He, Qian Liu, Zejun Ma, Junxian He

    Preprint SimpleRL Twitter Github

  3. B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

    Weihao Zeng*, Yuzhen Huang*, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He

    ICLR 2025 paper

  4. FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue

    Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu

    ACL 2023 Main Conference paper

  5. Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation

    Weihao Zeng, Lulu Zhao, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu

    ACL 2023 Main Conference paper

  6. What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

    Wei Liu*, Weihao Zeng*, Keqing He, Yong Jiang, Junxian He

    ICLR 2024 paper

  7. Automatic Instruction Evolving for Large Language Models

    Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen

    EMNLP 2024 paper

Full Publications on Google Scholar

🎖 Competitions and Awards

  • National Scholarship in China (2019, 2023)
  • 2022-09: 🏆🏆 Won 1st place in Track 2 of the SereTOD Challenge 2022 at EMNLP 2022!
  • 2021-09: 🏆🏆 Won 8th place in the CCIR 2021 Intelligent NLU Challenge!
  • 2021-08: 🏆🏆 Won 4th place in the SMP 2021 Conversational AI Challenge!

📌 Pinned Repositories

  1. hkust-nlp/simpleRL-reason: Simple RL training for reasoning (Python, ★3.6k)

  2. hkust-nlp/deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024] (Python, ★554)

  3. hkust-nlp/B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners (Python, ★82)

  4. Prompt-Tuning: Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" (Python, ★57)