A similarity-based method for retrieving documents from the SCI/SSCI database

Yen Liang Chen, Jhong Jhih Wei, Shin Yi Wu, Ya Han Hu

Research output: Contribution to journal › Journal article › Peer-reviewed

8 citations (Scopus)

Abstract

As more and more documents become electronically available, finding documents in large databases that fit users' needs is becoming increasingly important. In the past, the document search problem was addressed with either a database query approach or a text-based search approach. In this paper, we investigate this problem, focusing on the SCI/SSCI databases from ISI. Specifically, we design our search methodology around the four fields commonly seen in a scientific research document: abstract, title, keywords, and reference list. Of these four, only the abstract field can be viewed as ordinary text, while the other three have characteristics of their own that differentiate them from text. Therefore, we first develop a method to compute the similarity value for each field. Our next problem is combining the four similarity values into a final value. One approach is to assign a weight to each value and compute the weighted sum. We have not adopted this simple weighting method, however, because it is difficult to determine appropriate weights. Instead, we use a back-propagation neural network to combine them. Finally, extensive experiments have been carried out using real documents drawn from the TKDE journal, and the results indicate that in all situations our method achieves much higher accuracy than the traditional text-based search approach.
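The pipeline described in the abstract (one similarity value per field, then a back-propagation network that merges the four values) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not state which per-field measures the paper uses, so cosine similarity over term counts stands in for the abstract field and Jaccard overlap for title, keywords, and reference list. The dictionary keys, the `TinyMLP` class, its layer size, and its learning rate are likewise hypothetical.

```python
import math
import random
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity of bag-of-words term counts (assumed measure)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def jaccard(a, b) -> float:
    """Jaccard overlap of two item collections (assumed measure)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def field_similarities(d1: dict, d2: dict) -> list:
    """Four similarity values, one per field; the keys are hypothetical."""
    return [
        cosine(d1["abstract"], d2["abstract"]),
        jaccard(d1["title"].lower().split(), d2["title"].lower().split()),
        jaccard(d1["keywords"], d2["keywords"]),
        jaccard(d1["references"], d2["references"]),
    ]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One-hidden-layer network trained by back-propagation (illustrative)."""

    def __init__(self, n_in: int = 4, n_hidden: int = 3, seed: int = 0):
        rng = random.Random(seed)
        # The last weight in each row is the bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x: list):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def predict(self, x: list) -> float:
        return self.forward(x)[1]

    def train_step(self, x: list, target: float, lr: float = 0.5) -> None:
        """One gradient step on the squared error for a single example."""
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)
        # Hidden deltas use the output weights before they are updated.
        delta_h = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                   for j in range(len(h))]
        for j, v in enumerate(h + [1.0]):
            self.w2[j] -= lr * delta_out * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * delta_h[j] * v
```

In use, each training example would be the four-value vector from `field_similarities` for a document pair, with a relevance label of 1.0 or 0.0 as the target. After repeated calls to `train_step`, `predict` returns the combined similarity. Learning the combination this way avoids hand-picking the four weights, which is the difficulty the abstract gives for rejecting the plain weighted sum.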

Original language: English
Pages (from-to): 449-464
Number of pages: 16
Journal: Journal of Information Science
Volume: 32
Issue number: 5
Publication status: Published - Oct 2006
