Yihua Zhang


Room 3210

428 S. Shaw Lane

East Lansing, Michigan

United States of America

Yihua Zhang (张逸骅) is a fourth-year Ph.D. student in the OPTML Group at Michigan State University, under the supervision of Prof. Sijia Liu. His research centers on trustworthy and scalable machine learning (ML) algorithms for large language models (LLMs), multimodal large language models (MLLMs), and diffusion models (DMs), with a focus on bridging theoretical foundations and real-world applications. In recognition of his contributions, Yihua was honored with the IBM PhD Fellowship 2024, the CPAL 2025 Rising Star Award hosted at Stanford Data Science, and the MLCommons Rising Star Award 2024 hosted by NVIDIA. He has gained industry experience through internships at leading technology companies including ByteDance Seed, Meta AI, Amazon AWS AI Lab, and Cisco Research. Yihua’s work is driven by the need to develop efficient, scalable, and robust ML algorithms that address modern challenges in these domains.

:heavy_check_mark: Industry: Multimodal Modeling and Large-Scale Pretraining for LLMs and VLMs
Yihua’s industrial research experience spans Meta AI and ByteDance Seed, where he contributed to developing next-generation large-scale multimodal foundation models.

  • At Meta, he designed and scaled the training of industry-level multimodal foundation models that combine vision, text, audio, tabular, time-series, and structured data within a unified framework. He built and deployed state-of-the-art fusion and alignment algorithms across 8–10 distinct modalities; these have been integrated into Meta’s internal production systems and delivered notable gains in multimodal ads ranking and unified modeling.
  • At ByteDance Seed, he focused on token-efficient VLM pretraining and improved modality alignment.
  • Across these roles, Yihua gained extensive large-scale distributed training experience on multi-node systems (32–128 nodes, 256–1,024 A100/H100 GPUs), leveraging frameworks such as Megatron and FSDP2, with a deep understanding of parallelism strategies including tensor (TP), pipeline (PP), sequence (SP), and expert (EP) parallelism; a minimal FSDP2 sketch follows this list. These experiences have strengthened his ability to bridge algorithmic innovation with production-grade deployment of multimodal foundation models at scale.
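To make the FSDP2 workflow concrete, here is a minimal sketch of sharded training with PyTorch’s `fully_shard` API. This is an illustrative toy example, not code from any of the production systems above; the model, sizes, and launch command are invented, and it assumes a recent PyTorch (2.6 or later) where `fully_shard` is exposed under `torch.distributed.fsdp`.

```python
# Minimal FSDP2 sketch: shard a toy model across GPUs and take one step.
# Launch with, e.g.: torchrun --nproc_per_node=8 fsdp2_sketch.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import fully_shard  # FSDP2 API (PyTorch >= 2.6)

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Sequential(
    nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)
).cuda()

# Shard parameter-holding submodules first, then the root module; each
# rank keeps only its shard and all-gathers weights during forward/backward.
for layer in model:
    if isinstance(layer, nn.Linear):
        fully_shard(layer)
fully_shard(model)

optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 1024, device="cuda")
loss = model(x).square().mean()  # dummy objective for the sketch
loss.backward()
optim.step()
dist.destroy_process_group()
```

The per-submodule `fully_shard` calls set the communication granularity: finer sharding lets all-gathers overlap with compute, which matters at the 32–128-node scale mentioned above.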

:heavy_check_mark: Theme 1: Trustworthy Foundation Models: Robustness, Fairness, and Unlearning: Yihua explores how to enhance the trustworthiness of foundation models, focusing on robustness against adversarial attacks, fairness in decision-making, and the emerging area of machine unlearning to ensure data privacy and compliance with deletion requests.

:heavy_check_mark: Theme 2: Scalable Foundation Models: Efficient Models, Data, and Algorithms: In this theme, Yihua’s work revolves around designing models that are not only powerful but also computationally efficient. His research includes advancements in model sparsification, memory-efficient fine-tuning techniques, and optimizing data usage for large-scale models.

:heavy_check_mark: Theme 3: Optimization in Modern ML: Bi-Level and Zeroth-Order Optimization: This research line focuses on the theoretical underpinnings of scalable machine learning algorithms, addressing real-world constraints through bi-level optimization (BLO) and zeroth-order (ZO) optimization; a minimal ZO sketch follows.
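To illustrate the zeroth-order idea, here is a minimal, generic sketch of a randomized finite-difference gradient estimator. It is a textbook-style example with invented settings (objective, smoothing radius, sample count), not a method from any specific paper; the point is that it approximates a gradient from function evaluations alone, which is useful when backpropagation is unavailable or too expensive.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=20):
    """Two-point randomized gradient estimate of f at x.

    For u ~ N(0, I), E[(f(x + mu*u) - f(x)) / mu * u] ≈ ∇f(x)
    when mu is small, so averaging a few samples yields a usable
    gradient surrogate using only queries to f (no autodiff).
    """
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = np.random.randn(*x.shape)
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / n_samples

# Toy usage: minimize f(x) = ||x||^2 with ZO "gradient" descent.
f = lambda x: float(np.sum(x ** 2))
x = np.ones(10)
for _ in range(200):
    x = x - 0.05 * zo_gradient(f, x)
print(f(x))  # should be close to 0
```

The estimator’s variance grows with dimension, which is precisely why scaling ZO methods to modern model sizes is a nontrivial research question.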

News

Jan 26, 2026 :tada: Three papers accepted at ICLR 2026!
Jan 21, 2026 :tada: One first-authored paper accepted at AISTATS 2026!
Sep 21, 2025 :tada: Two first-authored papers accepted at NeurIPS 2025, including one spotlight!
May 16, 2025 :tada: One first-authored paper, SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?, is accepted to the ACL 2025 main conference!
May 1, 2025 :tada: Two papers accepted at ICML’25!
Apr 16, 2025 :medal_sports: Honored to receive the First Place Award in the 2024–25 Fitch H. Beach Award competition — the highest distinction for graduate students at the MSU College of Engineering! I’ll be proudly representing the Computer Science department at the college-wide awards ceremony on April 30. 🎓💚
Feb 26, 2025 :tada: My co-first-authored paper Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing is accepted to CVPR 2025! Congratulations to my summer intern Hanhui!
Jan 22, 2025 :tada: Our paper When is Task Vector Provably Effective Model Editing? A Generalization Analysis of Nonlinear Transformers is accepted to ICLR 2025 as an Oral Presentation (only 1.8% acceptance rate)!
Jan 21, 2025 :tada: I was awarded the CPAL Rising Star Award 2025 and will give a presentation at Stanford in March 2025!
Jan 20, 2025 :tada: My new technical post From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning (千呼万唤始出来:DeepSeek-R1 如何通过强化学习实现复杂推理) is now online! Both English and Chinese versions are available!