I am Pengpeng Zeng (曾鹏鹏), currently a Researcher in the School of Computer Science and Technology at Tongji University, China. I received my Ph.D. degree in 2023 from the School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), under the supervision of Prof. Jingkuan Song, Prof. Lianli Gao, and Prof. Heng Tao Shen.
My research interests include Machine Learning, Deep Learning, AI-Generated Content (AIGC), Computer Vision, and Reinforcement Learning.
If you are interested in related topics or potential collaborations, please feel free to get in touch: is.pengpengzeng@gmail.com.
🔥 News
- 2025.05: Two papers accepted by ACL 2025!
- 2025.05: One paper accepted by TIP!
- 2025.02: One paper accepted by TIP!
- 2024.07: One paper accepted by ACM MM 2024!
- 2024.07: One paper accepted by TCSVT!
- 2024.02: One paper accepted by CVPR 2024!
📝 Publications
Selected publications are listed below. For the full list, please see my Google Scholar.

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves
Shihan Wu, Ji Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper][Code]

Text-Video Retrieval with Global-Local Semantic Consistent Learning
Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Yihang Duan, Xinyu Lyu, Heng Tao Shen
IEEE Transactions on Image Processing (TIP), 2025
[Paper][Code]

Multi-Concept Learning for Scene Graph Generation
Xinyu Lyu, Lianli Gao, Junlin Xie, Pengpeng Zeng, Yulu Tian, Jie Shao, Heng Tao Shen
IEEE Transactions on Image Processing (TIP), 2025
[Paper][Code]

ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
Kaipeng Fang, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Zhi-Qi Cheng, Xiyao Li, Heng Tao Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper][Code]

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation
Xinyu Lyu, Lianli Gao, Pengpeng Zeng, Heng Tao Shen, Jingkuan Song
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
[Paper][Code]

Visual Commonsense-aware Representation Network for Video Captioning
Pengpeng Zeng, Haonan Zhang, Lianli Gao, Xiangpeng Li, Jin Qian, Heng Tao Shen
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
[Paper][Code]

Video Question Answering with Prior Knowledge and Object-sensitive Learning
Pengpeng Zeng, Haonan Zhang, Lianli Gao, Jingkuan Song, Heng Tao Shen
IEEE Transactions on Image Processing (TIP), 2022
[Paper][Code]

S2 Transformer for Image Captioning
Pengpeng Zeng, Haonan Zhang, Jingkuan Song, Lianli Gao
International Joint Conference on Artificial Intelligence (IJCAI), 2022
[Paper][Code]

- arXiv 2025: Towards Generalized and Training-Free Text-Guided Semantic Manipulation, Yu Hong, Xiao Cai, Pengpeng Zeng, Shuai Zhang, Jingkuan Song, Lianli Gao, Heng Tao Shen.
- arXiv 2025: CFReID: Continual Few-shot Person Re-Identification, Hao Ni, Lianli Gao, Pengpeng Zeng, Heng Tao Shen, Jingkuan Song. [Code]
- arXiv 2024: GT23D-Bench: A Comprehensive General Text-to-3D Generation Benchmark, Sitong Su, Xiao Cai, Lianli Gao, Pengpeng Zeng, Qinhong Du, Mengqi Li, Heng Tao Shen, Jingkuan Song.
- arXiv 2024: SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors, Xiao Cai, Pengpeng Zeng, Lianli Gao, Junchen Zhu, Jiaxin Zhang, Sitong Su, Heng Tao Shen, Jingkuan Song.
- ACL 2025: OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction, Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li. [Code]
- ACL 2025 (findings): MMEvol: Empowering multimodal large language models with evol-instruct, Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li. [Project][Code]
- ACM MM 2024: MPT: Multi-grained Prompt Tuning for Text-Video Retrieval, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen. [Code]
- TCSVT 2024: UMP: Unified Modality-aware Prompt Tuning for Text-Video Retrieval, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen. [Code]
- ACM MM 2024: Depth-aware sparse transformer for video-language learning, Haonan Zhang, Lianli Gao, Pengpeng Zeng, Alan Hanjalic, Heng Tao Shen. [Code]
- TCSVT 2023: SPT: Spatial pyramid transformer for image captioning, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Xinyu Lyu, Jingkuan Song, Heng Tao Shen. [Code]
- TMM 2023: Memory-based augmentation network for video captioning, Shuaiqi Jing, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen. [Code]
- NeurIPS 2022: A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval, Hao Li, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Haonan Zhang, Gongfu Li. [Code]
- TIP 2021: Hierarchical representation network with auxiliary tasks for video captioning and video question answering, Lianli Gao, Yu Lei, Pengpeng Zeng, Jingkuan Song, Meng Wang, Heng Tao Shen. [Code]
- ACM MM 2021 (oral): Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency for Image-Text Matching, Pengpeng Zeng, Lianli Gao, Xinyu Lyu, Shuaiqi Jing, Jingkuan Song.
🎖 Honors and Services
- Honors
  - 2022.12 Outstanding Student of UESTC.
  - 2022.10 National Scholarship.
- Academic Services
  - IEEE TPAMI, IEEE TIP, IEEE TMM, IEEE TNNLS, ICCV, CVPR, ECCV, AAAI, MM, etc.
- Grand Challenges
  - ICME 2024: 1st Place Winner on Attribute Recognition track of Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC)
  - ICME 2024: 1st Place Winner on Person Reidentification track of Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC)
  - Ingenuity Cup 2024: 2nd Place award in the Grand Final of the 1st “Ingenuity Cup” National Artificial Intelligence Innovation Application Competition.
  - Ingenuity Cup 2024: 1st Place award on the Multimodal Technology, Technology Innovation Track of the 1st “Ingenuity Cup” National Artificial Intelligence Innovation Application Competition.
  - ECCV 2022: DeeperAction Challenge 3rd place award on Track 4 Kinetics-TPS Challenge on Part-level Action Parsing.
  - OPPO 2021: Security Challenge 3rd Place award.
📖 Education
- 2019.09 - 2023.06, Ph.D., Computer Science and Technology, University of Electronic Science and Technology of China (UESTC).
- 2016.09 - 2019.06, M.S., Computer Technology, University of Electronic Science and Technology of China (UESTC).
- 2012.09 - 2016.06, B.S., Digital Media Technology, Xi’an University of Technology (XUT).
💻 Research Grants
- 2025.01 - 2027.12, Young Scientists Fund of the National Natural Science Foundation of China: “Research on Efficient Cross-Modal Retrieval Theory and Technology for Open Scenarios”, Lead PI
- 2025.01 - 2026.12, Young Scientists Fund of the Sichuan Provincial Department of Science and Technology: “Theoretical and Methodological Study on Efficient Adaptive Learning for Open-Scenario Multimodal Autonomous Intelligence”, Lead PI
- 2022.04 - 2023.11, Sichuan Science and Technology Program of the Sichuan Provincial Department of Science and Technology: “Research on Vision-Text Collaborative Learning for Semantic Understanding”, Lead PI