Previous: Li J, Chen Y, Zhang X, et al. Multimodal feature extraction and fusion for emotional reaction intensity estimation and expression classification in videos with transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2023: 5837-5843.
Next: Tang Z, Hao Y, Li J, et al. FTCM: Frequency-Temporal Collaborative Module for Efficient 3D Human Pose Estimation in Video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023.