MingOfficial: A Ming Official Career Dataset and a Historical Context-Aware Representation Learning Framework

You Jun Chen, Hsin Yi Hsieh, Yu Tung Lin, Yingtao Tian, Bert Wang Chak Chan, Yu Sin Liu, Yi Hsuan Lin, Richard Tzong Han Tsai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-reviewed

Abstract

In Chinese studies, understanding the nuanced traits of historical figures, which are often not explicitly evident in biographical data, has been a key interest. However, identifying these traits can be challenging because it requires domain expertise, specialist knowledge, and context-specific insights, making the process time-consuming and difficult to scale. Our focus on studying officials from China's Ming Dynasty is no exception. To tackle this challenge, we propose MingOfficial, a large-scale multi-modal dataset consisting of both structured data (career records, annotated personnel types) and text data (historical texts) for 13,031 officials. We further couple the dataset with a graph neural network (GNN) that combines both modalities, allowing investigation of social structures and providing features to boost downstream tasks. Experiments show that MingOfficial enables exploratory analysis of official identities and significantly boosts performance in tasks such as identifying nuanced identities (e.g., civil officials holding military power), from 24.6% to 98.2% F1 score on a holdout test set. By making MingOfficial publicly available at https://data.depositar.io/en/dataset/ming_official as both a dataset and an interactive tool, we aim to stimulate further research into the role of social context and representation learning in identifying individual characteristics, and hope to provide inspiration for computational approaches in fields beyond Chinese studies.

Original language: English
Title of host publication: EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics (ACL)
Pages: 4380-4401
Number of pages: 22
ISBN (Electronic): 9798891760608
Publication status: Published - 2023
Event: 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 - Hybrid, Singapore, Singapore
Duration: 6 Dec 2023 → 10 Dec 2023

Publication series

Name: EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Country/Territory: Singapore
City: Hybrid, Singapore
Period: 6/12/23 → 10/12/23
