A formal concept analysis-based domain-specific thesaurus and its application in document representation

Jihn Chang Jehng, Shihchieh Chou, Chin Yi Cheng

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

1 Citation (Scopus)

Abstract

Many document retrieval and clustering techniques built on the vector space model represent documents as term vectors. These representations treat a document as a bag of terms and ignore conceptual relationships between terms, such as synonymy, hypernymy and hyponymy. Previous studies have shown that exploiting these relationships improves document clustering results. Those studies relied on general thesauri such as WordNet for information about the relationships between terms. However, domain-specific terms such as "query expansion" and "document clustering" cannot be found in these thesauri, even though such terms are important features of domain-specific documents. In this paper, we propose an approach based on Formal Concept Analysis (FCA) that automatically builds a domain-specific thesaurus, addressing this limitation of general thesauri. We also apply the domain-specific thesaurus as background knowledge to represent documents by concept dimension vectors. In the evaluation, our method yields improved results compared with traditional approaches.
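As a rough illustration of the FCA machinery the abstract refers to, the sketch below enumerates the formal concepts of a tiny document-term context and then describes a document by one dimension per concept. The toy context, the function names, and the binary weighting are all assumptions made for illustration; the abstract does not specify how the thesaurus is built or how concept dimensions are weighted, so this should not be read as the paper's algorithm.

```python
from itertools import combinations

# Hypothetical toy context: each document (object) maps to the terms (attributes) it contains.
CONTEXT = {
    "d1": {"query expansion", "retrieval"},
    "d2": {"query expansion", "retrieval", "thesaurus"},
    "d3": {"document clustering", "thesaurus"},
}

def common_terms(docs):
    """Derivation operator: terms shared by every document in `docs`."""
    result = set().union(*CONTEXT.values())  # start from all terms
    for d in docs:
        result &= CONTEXT[d]
    return frozenset(result)

def docs_with(terms):
    """Derivation operator: documents that contain every term in `terms`."""
    return frozenset(d for d, ts in CONTEXT.items() if terms <= ts)

def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of documents.
    Exponential in the number of documents, so suitable only for toy contexts."""
    concepts = set()
    docs = sorted(CONTEXT)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            intent = common_terms(subset)
            extent = docs_with(intent)
            concepts.add((extent, intent))
    return concepts

def concept_vector(terms, concepts):
    """One dimension per concept (assumed binary weighting):
    1 if the term set covers the concept's whole non-empty intent, else 0."""
    ordered = sorted(concepts, key=lambda c: (len(c[1]), sorted(c[1])))
    return [1 if intent and intent <= terms else 0 for _, intent in ordered]

if __name__ == "__main__":
    concepts = formal_concepts()
    for name, terms in CONTEXT.items():
        print(name, concept_vector(terms, concepts))
```

In this toy setting, documents d1 and d2 share the concept whose intent is {"query expansion", "retrieval"}, so they overlap on a concept dimension rather than only on isolated terms.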

Original language: English
Title of host publication: Computational Science and Its Applications - ICCSA 2010 - International Conference, Proceedings
Publisher: Springer Verlag
Pages: 431-442
Number of pages: 12
Edition: PART 3
ISBN (Print): 3642121780, 9783642121784
Publication status: Published - 2010
Event: 2010 International Conference on Computational Science and Its Applications, ICCSA 2010 - Fukuoka, Japan
Duration: 23 Mar 2010 to 26 Mar 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 3
Volume: 6018 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2010 International Conference on Computational Science and Its Applications, ICCSA 2010
Country/Territory: Japan
City: Fukuoka
Period: 23/03/10 to 26/03/10
