Effective web crawling for Chinese addresses and associated information

Hsiu Min Chuang, Chia Hui Chang, Ting Yao Kao

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 citations (Scopus)

Abstract

With the advance of wireless networks, location-based services have become very important, as people often need to look up addresses of unfamiliar locations on the Web and then locate them on a map. Existing geographic information systems based on crowdsourcing are insufficient and slow to update. However, they can be complemented by automatically extracting the addresses of location entities and their associated information from general web pages. Thus, effectively crawling web pages that contain addresses is a practical challenge for enriching a location entity database. This research is devoted to the automatic extraction of addresses and associated information to support information retrieval on maps, i.e., integrating the process of querying location entities on the Web with positioning them on maps. We build a geographic information system of location entities by crawling the Web for Chinese addresses using three strategies. In total, 1.27 million distinct Chinese addresses are crawled using 1.08 million HTTP requests, leading to a return on investment of 1.169.

Original language: English
Title of host publication: E-Commerce and Web Technologies - 15th International Conference, EC-Web 2014, Proceedings
Editors: Martin Hepp, Yigal Hoffner
Publisher: Springer Verlag
Pages: 13-25
Number of pages: 13
ISBN (electronic): 9783319104904
Publication status: Published - 2014
Event: 15th International Conference on E-Commerce and Web Technologies, EC-Web 2014 - Munich, Germany
Duration: 1 Sep 2014 to 4 Sep 2014

Publication series

Name: Lecture Notes in Business Information Processing
Volume: 188
ISSN (print): 1865-1348
ISSN (electronic): 1865-1356

Conference

Conference: 15th International Conference on E-Commerce and Web Technologies, EC-Web 2014
Country/Territory: Germany
City: Munich
Period: 1/09/14 to 4/09/14
