Large-scale data management and analysis for astronomical research

Cheng Hsien Tang, Min Feng Wang, Wei Jen Wang, Meng Feng Tsai, Yuji Urata, Chow Choong Ngeow, Induk Lee, Kuiyun Huang

Research output: Contribution to journal, Conference article, peer-reviewed


Advances in information technology enable precise scientific observation, which in turn demands larger storage and faster data processing than ever before. In astronomical research, one of the most important challenges is to extract useful information efficiently from a huge collection of observed data. Although existing distributed computing techniques, such as grid computing and cloud computing, have given scientists better access to powerful computing resources, the development of big-data management and analysis software still lags far behind. This gap prevents the connected computing resources from being used efficiently. It is therefore beneficial to provide an integrated, efficient information management and analysis system for astronomical research. This research, conducted by the Pan-STARRS research team in Taiwan, focuses on integrating commercial data warehouse and large-scale grid computing techniques, and develops a system for efficient data management and fast analysis in astronomy-related fields. Our system can be viewed as a data grid system that supports analysis of large data collections. It consists of two analytical sub-systems and one data presentation and management sub-system. The first analytical sub-system is the PARallel Hierarchical Agglomerative Clustering System (PARHACS), which uses a distributed message-passing algorithm to efficiently compute a hierarchical clustering of a given set of astronomical data. The second is the SIMilarity Classification System (SIMCS), which uses a decentralized Multiple Classifier System (MCS) framework to support a complex classification procedure using multiple classifiers.
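As a rough illustration of the hierarchical agglomerative clustering that PARHACS is described as parallelizing, the sketch below shows a naive, sequential single-linkage version in Python. It is not the authors' distributed message-passing algorithm; the function name `single_linkage`, the choice of single linkage, and the sample points are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def single_linkage(points):
    """Naive sequential agglomerative clustering (single linkage).

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a sorted tuple of point indices.
    """
    # Start with every point in its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        best = None
        for a, b in combinations(clusters, 2):
            d = min(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Merge the winning pair and record the step.
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical 2-D positions; real catalog data would carry more attributes.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = single_linkage(points)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Each pass rescans all cluster pairs, so the cost grows quickly with the number of points, which is the kind of bottleneck that motivates a parallel formulation.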
The last sub-system is the ASTROnomical Information Management System (ASTROIMS), which uses a multidimensional data-warehouse design to provide a more concise, integrated, and scalable platform for fast data retrieval and management. It can perform data maintenance procedures automatically and thereby reduce maintenance and operation costs. In addition, the sub-system provides a user-friendly interface that facilitates a variety of online data analysis tasks.
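To make the idea of a multidimensional data-warehouse design more concrete, here is a minimal sketch of a star schema using Python's built-in `sqlite3` module. It assumes a star-schema layout, which is one common multidimensional design; the abstract does not say which layout or product ASTROIMS uses. The table names (`dim_object`, `dim_filter`, `fact_detection`), columns, and sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for a real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_object (object_id INTEGER PRIMARY KEY, object_class TEXT);
CREATE TABLE dim_filter (filter_id INTEGER PRIMARY KEY, band TEXT);
CREATE TABLE fact_detection (
    object_id INTEGER REFERENCES dim_object(object_id),
    filter_id INTEGER REFERENCES dim_filter(filter_id),
    magnitude REAL
);
""")

# Hypothetical dimension and fact rows.
conn.executemany("INSERT INTO dim_object VALUES (?, ?)",
                 [(1, "star"), (2, "galaxy")])
conn.executemany("INSERT INTO dim_filter VALUES (?, ?)",
                 [(1, "g"), (2, "r")])
conn.executemany("INSERT INTO fact_detection VALUES (?, ?, ?)",
                 [(1, 1, 18.0), (1, 2, 17.0), (2, 1, 21.0),
                  (2, 2, 20.0), (1, 1, 19.0)])

# Aggregate the fact table along two dimensions (object class and band).
rows = conn.execute("""
SELECT o.object_class, f.band, COUNT(*), AVG(d.magnitude)
FROM fact_detection d
JOIN dim_object o ON o.object_id = d.object_id
JOIN dim_filter f ON f.filter_id = d.filter_id
GROUP BY o.object_class, f.band
ORDER BY o.object_class, f.band
""").fetchall()

for row in rows:
    print(row)
```

Keeping measurements in one fact table and descriptive attributes in small dimension tables lets a retrieval query slice the data by any combination of dimensions with simple joins.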

Journal: Proceedings of Science
Publication status: Published - 2011
Event: 1st International Symposium on Grids and Clouds, ISGC 2011, Held in Conjunction with the 31st Open Grid Forum, OGF 2011 - Taipei, Taiwan
Duration: 19 Mar 2011 to 25 Mar 2011

