Hi, welcome to my page! I am a Research Scientist at Meta. Before that, I was a postdoctoral associate in the Department of Electrical and Computer Engineering at the University of Minnesota, mentored by Prof. Mingyi Hong and Prof. Shuzhong Zhang. I obtained my Ph.D. in Applied Mathematics from the Department of Mathematics, UC Davis, advised by Prof. Krishna Balasubramanian and Prof. Shiqian Ma, and received my B.S. in Mathematics from Zhejiang University. My CV is here.
My research develops optimization principles and algorithms for training modern machine learning systems, with a particular focus on foundation models. I aim to understand and improve large-scale pretraining and alignment under practical constraints such as computation, memory, communication, and data quality. A central theme of my work is to bridge rigorous optimization theory with the algorithmic and systems challenges that arise in modern AI.
News
- May 2026: Two papers were accepted at ICML 2026. Congratulations to all my collaborators!
- A minimalist optimizer design for LLM pretraining
- A Tale of Two Problems: Multi-Task Bilevel Learning Meets Equality Constrained Multi-Objective Optimization
- January 2026: Paper Muon Outperforms Adam in Tail-End Associative Memory Learning was accepted at ICLR 2026. Congratulations to all my collaborators!
- August 2025: Starting my new job as a Research Scientist at Meta!
- February 2025: Paper Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization was accepted by Pacific Journal of Optimization.
- January 2025: Paper Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback was accepted at ICLR 2025 (Spotlight). Congratulations to all my collaborators!
- January 2025: Paper Riemannian Bilevel Optimization was accepted by the Journal of Machine Learning Research!
- November 2024: Paper A Riemannian ADMM was accepted by Mathematics of Operations Research!
- September 2024: Two papers were accepted at NeurIPS 2024. Congratulations to all my collaborators!
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining
- August 2024: I'm very happy to have received the INFORMS Computing Society Prize!
- August 2024: Paper Zeroth-order Riemannian Averaging Stochastic Approximation Algorithms was accepted by the SIAM Journal on Optimization!
- July 2024: A new grant, Bi-Level Optimization for Hierarchical Machine Learning Problems: Models, Algorithms and Applications, was awarded by NSF. I'm excited to be co-PI on this project with Prof. Hong!
- May 2024: Paper Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark was accepted at ICML 2024. Congratulations to all my collaborators!
