Theories and Methods of Cross-Media Analysis

2016-05-30 · Lu Hanqing (卢汉清), Liu Jing (刘静), Huang Xuanjing (黄萱菁)

Science and Technology Innovation Herald (科技创新导报), 2016, Issue 1

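Summary: This year's research centers on multi-granularity semantic analysis and correlation mining for multimedia objects. We examine the association matrix between media objects and semantic tags from both its row and column perspectives, and exploit the correlations among low-level features, the correlations among semantic features, and the consistency between media objects' similarity in the low-level feature space and their similarity in the semantic tag space. We also tie the work closely to its application settings so that the results are solid and practical. In line with the "project task plan and its adjustment plan for the subsequent three years," the project made breakthrough progress in the topological structure analysis of heterogeneous multimedia features and in the multi-granularity semantic parsing of media objects. Overall, the project is proceeding smoothly and has met all of the goals planned for this year. For hierarchical semantic analysis of media data, we focus on the important role of social tags in media understanding tasks and introduce multi-attribute information, such as the participants in tagging behavior (i.e., users) and geographic location, to improve the semantic understanding of multimedia objects on social media websites. We also carry out research on the fine-grained semantic parsing of multimedia content. For semantics-based media content retrieval and applications, we focus on the multimodal and multi-relational nature of media data. Building on the hierarchical semantic analysis results already obtained, we further address users' practical need for high-quality, precise, and accurate media retrieval, aim to make the retrieval of web media data both fast and accurate, and have developed prototype retrieval service systems for practical applications.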

Keywords: cross-media; multi-granularity semantic analysis; correlation mining

Abstract: Our work this year focuses on multi-granularity semantic analysis and correlation mining for multimedia information. We exploit the correlations within low-level features and within high-level features, together with the consistency of their similarities, to better understand multimedia objects. The project is progressing well and has reached its goals for the year. It produced 33 publications this year: 18 papers in international journals or transactions (SCI indexed) and 15 papers at international conferences (e.g., ACM Multimedia, ICCV, CIKM, and CVPR; EI indexed). In addition, one patent has been granted and two are pending. The work completed this year is described in detail below.
(1) Multimedia feature representation and correlation construction: We have proposed a set of effective methods for multi-modal feature fusion and selection on large-scale, noisy, high-dimensional multimedia datasets. These are a multi-view learning approach that accounts for the consistency and complementarity of different features, a robust feature selection method based on subspace learning, and a topological analysis of feature structure. The related work has been published in the journals TNNLS and CVIU and at top conferences such as ACM MM and WWW.
(2) Hierarchical semantic analysis of multimedia data: We aim to understand multimedia data (video and images) at several semantic levels, including low-level visual appearance, object parts, objects, and scenes. To this end, we use social tags to improve the performance of multimedia semantic understanding. Other attributes associated with social tags, namely the tagging users and geographic positions, are also taken into account. The related work has been published in the journals TMM, Pattern Recognition, and TASLP and at top conferences such as CVPR, ICCV, and ICME.
(3) Semantic retrieval and other applications: To integrate and verify the approaches proposed in the project, we have designed and developed prototype systems for multimedia retrieval. The systems are intended to meet users' real requirements in the retrieval process. The related work has been published in the journals TKDE, TOMCCAP, and TMM and at top conferences such as ACM MM, WWW, and ICIP.

Key Words: Cross-Media; Multi-granularity Semantic Analysis; Correlation Mining

Full-text link (real-name registration required): http://www.nstrs.cn/xiangxiBG.aspx?id=51008&flag=1