Shengeng Tang (唐申庚)

Master's Supervisor

Education: Doctoral graduate

Office: Room A904, Science and Education Building, Feicuihu Campus

Discipline: Computer Application Technology

Publications


For details, see the Google Scholar profile: https://scholar.google.com/citations?user=_JZcsnYAAAAJ

[2025]

[1] Shengeng Tang, Jiayi He, Lechao Cheng, Jingjing Wu, Dan Guo, Richang Hong. Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. (CCF A)

[2] Shuoyan Wei, Feng Li, Shengeng Tang, Yao Zhao, Huihui Bai. EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. (CCF A, Highlight Paper, Top 3.0%)

[3] Shengeng Tang, Jiayi He, Dan Guo, Yanyan Wei, Feng Li, Richang Hong. Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production. AAAI Conference on Artificial Intelligence (AAAI), 2025. (CCF A, Oral Paper, Top 4.6%) [Link][PDF]

[4] Ziheng Zhou, Jinxing Zhou, Wei Qian, Shengeng Tang, Xiaojun Chang, Dan Guo. Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration. AAAI Conference on Artificial Intelligence (AAAI), 2025. (CCF A) [Link][PDF]

[5] Wei Qian, Gaoji Su, Dan Guo, Jinxing Zhou, Xiaobai Li, Bin Hu, Shengeng Tang, Meng Wang. PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement. AAAI Conference on Artificial Intelligence (AAAI), 2025. (CCF A, Oral Paper, Top 4.6%) [Link][PDF]

[6] Zhangbin Li, Jinxing Zhou, Jing Zhang, Shengeng Tang, Kun Li, Dan Guo. Patch-level Sounding Object Tracking for Audio-Visual Question Answering. AAAI Conference on Artificial Intelligence (AAAI), 2025. (CCF A) [Link][PDF]

[7] Zhenqiang Zhang, Kun Li, Shengeng Tang, Yanyan Wei, Fei Wang, Jinxing Zhou, Dan Guo. Temporal Boundary Awareness Network for Repetitive Action Counting. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2024. (CCF B, SCI Zone 1, IF=5.2) [Link][PDF]

[8] Xu Wang, Shengeng Tang*, Peipei Song, Shuo Wang, Dan Guo, Richang Hong. Linguistics-Vision Monotonic Consistent Network for Sign Language Production. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025. (CCF B, top-tier international conference on the Tsinghua University list) [*Corresponding author] [Link][PDF]

[9] Jiaqi Zhao, Fei Wang, Kun Li, Yanyan Wei, Shengeng Tang, Shu Zhao, Xiao Sun. Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025. (CCF B) [Link][PDF]

[10] Kezhou Chen, Shuo Wang, Huixia Ben, Shengeng Tang, Yanbin Hao. Mixture of Multimodal Adapters for Sentiment Analysis. North American Chapter of the Association for Computational Linguistics (NAACL), 2025. (CCF B) 

[11] Jiayi He, Shengeng Tang*, Ao Liu, Lechao Cheng, Jingjing Wu, Yanyan Wei. Efficient Vision Language Model Fine-tuning for Text-based Person Anomaly Search. ACM Web Conference Workshop on Multimedia Object Re-ID (WWW-MORE), 2025. (CCF A Workshop) [*Corresponding author] [PDF]

[12] Chenglong Xu, Peipei Song, Shengeng Tang, Dan Guo, Xun Yang. Alleviating Confirmation Bias in Learning with Noisy Labels via Two-Network Collaboration. ACM Transactions on Intelligent Systems and Technology (TIST), 2025. (CAA A, SCI Zone 1, IF=7.2)


[2024]

[1] Peipei Song, Dan Guo, Xun Yang, Shengeng Tang, Meng Wang. Emotional Video Captioning with Vision-based Emotion Interpretation Network. IEEE Transactions on Image Processing (TIP), 2024, 33: 1122-1135. (CCF A, CAS Zone 1, IF=10.8) [Link][PDF]

[2] Shengeng Tang, Feng Xue, Jingjing Wu, Shuo Wang, Richang Hong. Gloss-driven Conditional Diffusion Models for Sign Language Production. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2024. (CCF B, SCI Zone 1, IF=5.2) [Link][PDF]

[3] Jingjing Wu, Richang Hong, Shengeng Tang. Intermediary-Generated Bridge Network for RGB-D Cross-modal Re-identification. ACM Transactions on Intelligent Systems and Technology (TIST), 2024, 15(6): 1-25. (CAA A, SCI Zone 1, IF=7.2) [Link][PDF]


[2023 and earlier]

[1] Shengeng Tang, Richang Hong, Dan Guo, Meng Wang. Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production. ACM International Conference on Multimedia (ACM MM), 2022: 5630-5638. (CCF A) [Link][PDF]

[2] Shengeng Tang, Dan Guo, Richang Hong, Meng Wang. Graph-Based Multimodal Sequential Embedding for Sign Language Translation. IEEE Transactions on Multimedia (TMM), 2022, 24: 4433-4445. (CAAI A, CCF B, CAS Zone 1, IF=8.4) [Link][PDF]

[3] Peipei Song, Dan Guo, Xun Yang, Shengeng Tang, Erkun Yang, Meng Wang. Emotion-Prior Awareness Network for Emotional Video Captioning. ACM International Conference on Multimedia (ACM MM), 2023: 589-600. (CCF A, Oral Paper, Top 5.4%) [Link][PDF]

[4] Dan Guo, Shengeng Tang, Meng Wang. Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling. International Joint Conference on Artificial Intelligence (IJCAI), 2019: 751-757. (CCF A) [Link][PDF]

[5] Dan Guo, Shengeng Tang, Richang Hong, Meng Wang. Sign Language Recognition. Multimedia for Accessible Human Computer Interfaces. Springer, Cham, 2021: 23-59. [Link][PDF]