My name is Yan Zhuang. I am a Research Engineer at Tencent, where I work on Large Multimodal Models (LMMs) for Healthcare Intelligence and Neural Search. Previously, I received my Ph.D. in Computer Science and Technology from the University of Electronic Science and Technology of China (UESTC), advised by Prof. Fuji Ren. Before that, I obtained my M.S. in Computer Technology from UESTC under the supervision of Prof. Yanru Zhang, and my B.S. in Information and Computing Science from Anhui Science and Technology University.

My research focuses on Large Multimodal Models (LMMs) and their real-world applications. My interests include Multimodal Affective Computing, Multimodal Mathematical Reasoning, and Healthcare Intelligence. More broadly, I am interested in developing reliable, efficient, and trustworthy multimodal intelligent systems capable of understanding, reasoning, and interacting in complex real-world environments.

    
Multimodal Affective Intelligence
    
Multimodal sentiment and emotion understanding

Robust multimodal learning under missing, noisy, and incomplete modalities

Representation learning, multimodal alignment, and efficient fusion

Multimodal Reasoning & Foundation Models
    
Large multimodal models (LMMs)

Multimodal mathematical reasoning and self-verifiable inference

Reasoning, planning, and reinforcement learning for multimodal agents

Healthcare Intelligence
    
Medical multimodal intelligence

Clinical decision support and healthcare applications with LMMs

Featured Publications

(* denotes joint first authors. These are representative publications on multimodal representation learning, robust multimodal intelligence, large multimodal models, and trustworthy reasoning. The full publication list is available on Google Scholar.)

ACL Main 2026

Beyond Explicit Refusals: Soft-Failure Attacks on Retrieval-Augmented Generation

Wentao Zhang, Yan Zhuang, Zhuhang Zheng, Mingfei Zhang, Jiawen Deng, Fuji Ren

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL Main 2026)

Our work investigates previously overlooked soft-failure behaviors in retrieval-augmented generation systems and introduces a new benchmark for evaluating trustworthy multimodal reasoning.

Paper

AAAI 2026

TMDC: A Two-Stage Modality Denoising and Complementation Framework for Multimodal Sentiment Analysis with Missing and Noisy Modalities

Yan Zhuang*, Minhao Liu*, Yanru Zhang, Jiawen Deng, Fuji Ren

In The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

Paper Code

NeurIPS 2025

Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Yan Zhuang*, Minhao Liu*, Wei Bai, Yanru Zhang, Wei Li, Jiawen Deng, Fuji Ren

In The 39th Conference on Neural Information Processing Systems (NeurIPS 2025)

Paper Code

ICCV 2025

CMAD: Correlation-Aware and Modalities-Aware Distillation for Multimodal Sentiment Analysis with Missing Modalities

Yan Zhuang, Minhao Liu, Wei Bai, Yanru Zhang, Xiaoyue Zhang, Jiawen Deng, Fuji Ren

In The IEEE/CVF International Conference on Computer Vision (ICCV 2025)

Paper Code

IEEE TMM 2025

Intra-sample and Intra-modal Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Yan Zhuang, Yanru Zhang, Jiawen Deng, Fuji Ren

IEEE Transactions on Multimedia (TMM 2025)

Paper Code

ICMR 2026 ReNoRD: Learning from Relations under Noisy Pseudo Labels via Relational Distillation for Multimodal Sentiment. Tiantai Zhai, Yan Zhuang, Fuji Ren, Jiawen Deng, Liang Luo.
Neurocomputing 2026 Decoupled Hypergraph Modeling for Multimodal Sentiment Analysis. Yanping Huang, Jiawen Deng, Yan Zhuang, Jiali You, Qian Liu, Fuji Ren.
ACM MM 2025 FAME: Fusion-Aware Multi-modal Ensemble for Social Media Popularity Prediction. Yan Zhuang, Wei Bai, Yanru Zhang, Minhao Liu, Jiawen Deng, Fuji Ren.
IEEE TAFFC 2025 Enhanced Emotion Recognition in Conversations through Hybrid Context Encoding and Latent Dependency Mining. Zheng Hu, Jiawen Deng, Satoshi Nakagawa, Yan Zhuang, Xiaoyue Zhang, Shimin Cai, Fuji Ren.
IEEE TMM 2025 Multi-Level Contrastive Learning for Multimodal Sentiment Analysis. Yan Zhuang, Wei Bai, Yanru Zhang, Jiawen Deng, Zheng Hu, Xiaoyue Zhang, Fuji Ren.
Research 2025 R3DG: Retrieve, Rank and Reconstruction with Different Granularities for Multimodal Sentiment Analysis. Yan Zhuang, Yanru Zhang, Jiawen Deng, Fuji Ren.
WWW 2025 ETS-MM: A Multi-Modal Social Bot Detection Model Based on Enhanced Textual Semantic Representation. Wei Li, Jiawen Deng, Jiali You, Yuanyuan He, Yan Zhuang, Fuji Ren.
ACM MM 2024 GLoMo: Global-local modal fusion for multimodal sentiment analysis. Yan Zhuang, Yanru Zhang, Zheng Hu, Xiaoyue Zhang, Jiawen Deng, Fuji Ren.
IEEE TKDE 2024 Hierarchical denoising for robust social recommendation. Zheng Hu, Satoshi Nakagawa, Yan Zhuang, Jiawen Deng, Shimin Cai, Tao Zhou, Fuji Ren.

Research Projects

My research has developed along three interconnected directions: multimodal representation learning, robust multimodal intelligence, and large multimodal models for reasoning and real-world applications.

Robust Multimodal Intelligence

Building robust multimodal learning frameworks capable of handling missing modalities, noisy observations, and incomplete multimodal information. This research line focuses on modality denoising, adaptive fusion, representation enhancement, and knowledge distillation.

Representative works: TMDC (AAAI 2026), HME (NeurIPS 2025), CMAD (ICCV 2025)

Multimodal Representation Learning

Developing effective multimodal representation learning methods through contrastive learning, cross-modal alignment, global-local interaction, and relational modeling to improve multimodal understanding.

Representative works: GLoMo (ACM MM 2024), MLCL (TMM 2025), IIE (TMM 2025), ReNoRD (ICMR 2026)

Large Multimodal Models

Exploring reasoning, verification, and practical deployment of large multimodal models, with applications to mathematical reasoning, healthcare intelligence, and trustworthy AI systems.

Current topics include: RLVR, Medical LMMs, Trustworthy Reasoning

Professional Experience

My research experience spans both academia and industry, with a primary focus on multimodal intelligence, large multimodal models, and real-world AI systems.

Jul. 2026 – Present Research Engineer: Tencent

Working on large multimodal models for healthcare intelligence and next-generation AI search systems. My current research focuses on multimodal reasoning, trustworthy AI, reinforcement learning for reasoning, and practical deployment of multimodal foundation models in real-world applications.

Jan. 2026 – Jun. 2026 Research Intern: Tencent

Conducted research on reinforcement learning for multimodal mathematical reasoning and trustworthy large multimodal models, with applications to healthcare intelligence.

Jan. 2022 – Jun. 2022 Research Intern: NetEase FUXI Laboratory

Conducted research on large language model pre-training and efficient language representation learning, laying the foundation for subsequent research on multimodal foundation models.

Education

Sep. 2022 – Jun. 2026 Ph.D. in Computer Science and Technology
University of Electronic Science and Technology of China (UESTC), Advisor: Prof. Fuji Ren
Sep. 2019 – Jun. 2022 M.S. in Computer Technology
University of Electronic Science and Technology of China (UESTC), Advisor: Prof. Yanru Zhang
Sep. 2015 – Jun. 2019 B.S. in Information and Computing Science
Anhui Science and Technology University

Honors and Awards

Academic Honors

UESTC Outstanding Graduate, 2026
UESTC Academic Newcomer Award, 2026
National Scholarship (Ph.D.), 2025
National Scholarship (B.S.), 2017

Competition Awards

Best Performance Award, ACM Multimedia 2025 Social Media Prediction Challenge (Image Track)
Silver Award, China International College Students’ “Internet+” Innovation and Entrepreneurship Competition, 2021
Second Prize, China Postgraduate Mathematical Contest in Modeling (Huawei Cup), 2020

Professional Activities

Journal Reviewing

IEEE Transactions on Multimedia (TMM 2025-2026)
IEEE Transactions on Affective Computing (TAFFC 2026)
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2026)
Transactions on Machine Learning Research (TMLR 2026)
Knowledge-Based Systems (KBS 2026)
IEEE Transactions on Vehicular Technology (TVT 2023-2024)

Conference Reviewing

CVPR 2026
ICML 2026 (Gold Reviewer Award, Top 25%)
NeurIPS 2026

Thank you for visiting my homepage. I am always happy to discuss research collaborations, academic exchanges, and opportunities related to multimodal intelligence, large multimodal models, and trustworthy AI.
