Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Portfolio
Publications
FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue
Published in ACL 2023 Main Conference, 2023
A method for teaching future knowledge to pre-trained language models for task-oriented dialogue.
Recommended citation: Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu. (2023). "FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue." ACL 2023 Main Conference.
Download Paper
Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
Published in ACL 2023 Main Conference, 2023
Exploring compositional generalization of multi-attribute controllable dialogue generation.
Recommended citation: Weihao Zeng, Lulu Zhao, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu. (2023). "Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation." ACL 2023 Main Conference.
Download Paper
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Published in ICLR 2024, 2024
A comprehensive study of automatic data selection in instruction tuning, introducing the Deita framework.
Recommended citation: Wei Liu*, Weihao Zeng*, Keqing He, Yong Jiang, Junxian He. (2024). "What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning." ICLR 2024.
Download Paper
Automatic Instruction Evolving for Large Language Models
Published in EMNLP 2024, 2024
A method for automatically evolving instructions for large language models.
Recommended citation: Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen. (2024). "Automatic Instruction Evolving for Large Language Models." EMNLP 2024.
Download Paper
7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient
Published in Preprint, 2025
Demonstrating that emerging reasoning with reinforcement learning is both effective and efficient using a 7B model and 8K examples.
Recommended citation: Weihao Zeng*, Yuzhen Huang*, Wei Liu, Keqing He, Qian Liu, Zejun Ma, Junxian He. (2025). "7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient." Preprint.
Download Paper
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Published in ICLR 2025, 2025
A method for monitoring and balancing exploration and exploitation in self-taught reasoners.
Recommended citation: Weihao Zeng*, Yuzhen Huang*, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He. (2025). "B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners." ICLR 2025.
Download Paper
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Published in Preprint, 2025
A deep investigation of zero RL training across diverse model families and sizes.
Recommended citation: Weihao Zeng*, Yuzhen Huang*, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He. (2025). "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild." Preprint.
Download Paper
Talks
SimpleRL: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient
Invited talk on SimpleRL: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Invited talk on SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild.
SimpleRL-Zoo and B-STaR: Improving reasoning performance and efficiency through reinforcement learning
Invited talk on SimpleRL-Zoo and B-STaR: Improving reasoning performance and efficiency through reinforcement learning.
