Current web technology has brought about a scenario in which information about a given topic is widely dispersed across data from different domains and modalities, such as texts and images from news and social media. Automatically extracting the most informative and important multimedia summary (e.g., a ranked list of inter-connected texts and images) from massive amounts of cross-media and cross-genre data can significantly reduce the time and effort users spend browsing. In this paper, we propose a novel method to address this new task based on automatically constructed Multi-media Information Networks (MiNets), incorporating cross-genre knowledge and inferring implicit similarity across texts and images. The facts from MiNets are exploited in a novel random walk-based algorithm that iteratively propagates ranking scores across multiple data modalities. Experimental results demonstrate the effectiveness of our MiNets-based approach and the power of cross-media, cross-genre inference.
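The random-walk propagation the abstract describes can be illustrated with a minimal sketch: scores spread iteratively over a graph whose nodes mix texts and images, so that a node's rank reflects the ranks of its neighbors across modalities. Everything below (the node names, edge weights, damping factor, and update rule in personalized-PageRank form) is an illustrative assumption, not the paper's actual MiNets construction or algorithm.

```python
# Hedged sketch of iterative random-walk score propagation on a small
# multimedia graph. All weights and parameters here are made-up examples.
import numpy as np

def propagate_scores(adj, prior, alpha=0.85, iters=100, tol=1e-9):
    """Iterate s <- alpha * P^T s + (1 - alpha) * prior until convergence,
    where P is the row-normalized adjacency (transition) matrix."""
    P = adj / adj.sum(axis=1, keepdims=True)  # rows become transition probabilities
    s = prior.copy()
    for _ in range(iters):
        s_new = alpha * P.T @ s + (1 - alpha) * prior
        if np.abs(s_new - s).sum() < tol:
            break
        s = s_new
    return s / s.sum()

# Toy heterogeneous graph: two text nodes, two image nodes, with
# cross-modal similarity edges (hypothetical weights).
nodes = ["text1", "text2", "img1", "img2"]
adj = np.array([
    [0.0, 1.0, 0.8, 0.1],
    [1.0, 0.0, 0.2, 0.9],
    [0.8, 0.2, 0.0, 0.5],
    [0.1, 0.9, 0.5, 0.0],
])
prior = np.full(4, 0.25)  # uniform prior importance
scores = propagate_scores(adj, prior)
ranking = [nodes[i] for i in np.argsort(-scores)]  # ranked summary candidates
```

In this toy setup, a text that is strongly similar to a highly ranked image gains score through the cross-modal edge, which is the intuition behind propagating ranks across modalities.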
|Published - 2014
|3rd Annual Meeting of the EPSRC Network on Vision and Language and 1st Technical Meeting of the European Network on Integrating Vision and Language, V and L Net 2014 - Dublin, Ireland
Duration: 23 Aug 2014 → …