April 2026: The team's work has been accepted by the e-commerce journal Electronic Commerce Research.
Basic Information
Title: Automatic Tagging Services for Marketer-generated Contents: A Multimodal Neural Network
Authors: 钱洋, 汪思鹏, 徐旺, 姜元春*, 柴一栋, 凌海峰
Journal/Source: Electronic Commerce Research
Link:
Abstract
This study improves tag recommendation for marketer-generated contents (MGCs) by integrating multimodal information from both textual descriptions and visual information. To address this issue, we develop a behavior-driven multimodal neural network that jointly captures content semantics and marketer-specific tagging preferences. Specifically, we first introduce a text-guided attention mechanism to model the interaction between the textual contents and images in MGCs. Second, we incorporate internal attention modules to identify the impacts of specific words, image regions, and individual images on tag prediction. Third, we design a time-decaying attention mechanism to account for marketers’ historical tagging behaviors, thereby capturing the temporal dynamics and heterogeneity in their preferences. We evaluate our model using a large-scale real-world dataset collected from Taobao. Quantitative results demonstrate that our approach significantly outperforms state-of-the-art baselines in tag recommendation. Qualitative analyses further reveal interpretable insights into how marketers’ historical behaviors and multimodal cues contribute to their tagging decisions.
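The time-decaying attention idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything here is an assumption made for illustration: the function name `time_decay_attention`, the scaled dot-product relevance score, the exponential decay form, and the `decay_rate` parameter are all hypothetical and are not taken from the authors' model.

```python
import math


def time_decay_attention(query, history, ages, decay_rate=0.1):
    """Pool past tagging behaviors into one preference vector (illustrative only).

    query:      embedding of the current content, a list of d floats (hypothetical).
    history:    list of n past-behavior embeddings, each a list of d floats (hypothetical).
    ages:       list of n non-negative elapsed times since each behavior.
    decay_rate: assumed exponential decay strength; 0 disables the decay.
    """
    d = len(query)
    logits = []
    for vec, age in zip(history, ages):
        # Scaled dot-product relevance of a past behavior to the current content.
        score = sum(q * h for q, h in zip(query, vec)) / math.sqrt(d)
        # Subtracting decay_rate * age in log space scales the weight by exp(-decay_rate * age).
        logits.append(score - decay_rate * age)

    # Softmax with the maximum subtracted for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the past-behavior embeddings.
    pooled = [sum(w * vec[i] for w, vec in zip(weights, history)) for i in range(d)]
    return pooled, weights


# Toy example: behaviors 0 and 1 have identical content, but behavior 1 is much older.
history = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
query = [1.0, 0.0]
ages = [1.0, 30.0, 1.0]
pooled, weights = time_decay_attention(query, history, ages)
```

In this toy setup the two identical behaviors receive different weights only because of their ages, so the recent one dominates the pooled preference vector. Setting `decay_rate=0` reduces the sketch to ordinary softmax attention, in which the two would be weighted equally.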
