TY - JOUR
T1 - Web-scale multimedia information networks
AU - Qi, Guo Jun
AU - Tsai, Min Hsuan
AU - Tsai, Shen Fu
AU - Cao, Liangliang
AU - Huang, Thomas S.
N1 - Funding Information:
Manuscript received June 1, 2011; revised February 25, 2012 and April 27, 2012; accepted May 9, 2012. Date of publication July 25, 2012; date of current version August 16, 2012. This work was supported by the Army Research Laboratory and was accomplished under Cooperative Agreement W911NF-09-2-0053. This work was also supported in part by a Beckman Institute (Illinois) Seed Grant, the National Science Foundation (NSF) under Grant IIS 1049332 EAGER, the HP Innovation Research Program, and the National Science Council, Taiwan under Contract NSC-095-SAF-I-564-035-TMS (to S.-F. Tsai). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. G.-J. Qi, M.-H. Tsai, S.-F. Tsai, and T. S. Huang are with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). L. Cao is with the IBM T. J. Watson Research Center, Hawthorne, NY 10598 USA (e-mail: [email protected]).
Funding Information:
Guo-Jun Qi received the B.S. degree in automation from the University of Science and Technology of China, Hefei, Anhui, China, in 2005. He has been with the Beckman Institute and the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, since 2009. His research interests include pattern recognition, machine learning, computer vision, and multimedia. Mr. Qi is the winner of the best paper award at the 15th ACM International Conference on Multimedia, Augsburg, Germany, 2007. In 2011, he was a recipient of the IBM Ph.D. fellowship award. He has served as a program committee member and reviewer in many academic conferences and journals in the fields of computer vision, pattern recognition, machine learning, and multimedia.
PY - 2012
Y1 - 2012
N2 - The abundance of multimedia data on the Web presents both challenges (how to annotate, search, and mine) and opportunities (crawling the Web to create large structured multimedia databases which can be used to do inference effectively). Because of the huge data volume, considering all semantic concepts as on the same (flat) level is not viable. In this paper, we introduce a unified STRUCTURED representation called multimedia information networks (MINets), which incorporates ontology and cross-media links, covering both content and context knowledge. Ontology and cross-media structures are constructed and expanded by automatically constructing MINets from web-scale data by state-of-the-art information extraction and knowledge-based population techniques. The resultant MINet will contain a wide range of linkages, including logical, statistical, and semantic relations among informative concept nodes, which connects proliferative ontology as well as cross-media web-scale resources together. The raw data collected in the construction phase often contain much noisy, incomplete, or even conflicting information which could be detrimental to information extraction and utilization. Then, the redundant link structure can be utilized to distill MINets and improve quality of information (QoI). Moreover, advanced inference theory and system can be built upon the linked MINets, and then high-level ontological knowledge can be inferred and integrated in a logically harmonious network structure in MINets which is consistent with human cognition. Even more, as information channels, the ontology and cross-media links in MINets connect informative knowledge resources together, which makes it possible to increase the portability of information between different resources to increase information utilization levels.
AB - The abundance of multimedia data on the Web presents both challenges (how to annotate, search, and mine) and opportunities (crawling the Web to create large structured multimedia databases which can be used to do inference effectively). Because of the huge data volume, considering all semantic concepts as on the same (flat) level is not viable. In this paper, we introduce a unified STRUCTURED representation called multimedia information networks (MINets), which incorporates ontology and cross-media links, covering both content and context knowledge. Ontology and cross-media structures are constructed and expanded by automatically constructing MINets from web-scale data by state-of-the-art information extraction and knowledge-based population techniques. The resultant MINet will contain a wide range of linkages, including logical, statistical, and semantic relations among informative concept nodes, which connects proliferative ontology as well as cross-media web-scale resources together. The raw data collected in the construction phase often contain much noisy, incomplete, or even conflicting information which could be detrimental to information extraction and utilization. Then, the redundant link structure can be utilized to distill MINets and improve quality of information (QoI). Moreover, advanced inference theory and system can be built upon the linked MINets, and then high-level ontological knowledge can be inferred and integrated in a logically harmonious network structure in MINets which is consistent with human cognition. Even more, as information channels, the ontology and cross-media links in MINets connect informative knowledge resources together, which makes it possible to increase the portability of information between different resources to increase information utilization levels.
KW - Multimedia information networks
KW - web-scale multimedia content
UR - http://www.scopus.com/inward/record.url?scp=84865424520&partnerID=8YFLogxK
U2 - 10.1109/JPROC.2012.2201909
DO - 10.1109/JPROC.2012.2201909
M3 - Article
AN - SCOPUS:84865424520
SN - 0018-9219
VL - 100
SP - 2688
EP - 2704
JO - Proceedings of the IEEE
JF - Proceedings of the IEEE
IS - 9
M1 - 6248667
ER -