TY - JOUR
T1 - Enhancing POI search on maps via online address extraction and associated information segmentation
AU - Chang, Chia Hui
AU - Chuang, Hsiu Min
AU - Huang, Chia Yi
AU - Su, Yueng Sheng
AU - Li, Shu Ying
N1 - Publisher Copyright: © 2015, Springer Science+Business Media New York.
PY - 2016/4/1
Y1 - 2016/4/1
N2 - With the popularity of wireless networks and mobile devices, we have seen rapid growth in mobile applications and services, especially location-based services. However, most existing location-based services like Google Maps and Wikimapia rely on crowd-sourcing or business-data providers to maintain their points-of-interest (POI) databases, which are slow and insufficient. Because most updated information can be found on the Web, the insufficiency of current POI databases can be complemented by automatically extracting POIs and their descriptions from general webpages. In this study, we enhance location-based search on maps via online address extraction and associated information segmentation. Given a POI query that cannot be found on a map, we propose a method for extracting the address from search snippets of the query to exploit information from the Web. We demonstrate the application of sequence labeling to Chinese postal-address extraction and compare the performance with and without Chinese word segmentation. Meanwhile, we also present a novel algorithm for associated information segmentation by making use of a document-object model (DOM) tree structure based on the farthest distinguishable ancestor (FDA) of each address. The FDA algorithm is able to locate associated information for each Chinese address resulting in an improvement from an F-measure of 0.811 to 0.964.
KW - Associated information segmentation
KW - Chinese postal address extraction
KW - Conditional random field
KW - Location-based service
KW - Record boundary detection
UR - http://www.scopus.com/inward/record.url?scp=84960393570&partnerID=8YFLogxK
U2 - 10.1007/s10489-015-0707-5
DO - 10.1007/s10489-015-0707-5
M3 - Journal article
AN - SCOPUS:84960393570
SN - 0924-669X
VL - 44
SP - 539
EP - 556
JO - Applied Intelligence
JF - Applied Intelligence
IS - 3
ER -