Experience: Analyzing Missing Web Page Visits and Unintentional Web Page Visits from the Client-side Web Logs

Che Yun Hsu, Ting Rui Chen, Hung Hsuan Chen

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


Web logs have been widely used to represent the web page visits of online users. However, we found that web logs in Chrome's browsing history only record 57% of users' visited websites, i.e., nearly half of a user's website visits are not recorded. Additionally, 5.1% of the visits recorded in the web log occur because of unconscious user actions, i.e., these page visits are not initiated by users. We created a Google Chrome plugin and recruited users to install the plugin to collect and analyze the conscious URL visits, unconscious URL visits, and "missing" URL visits (i.e., the visits unrecorded in the traditional web log). We reported the statistics of these behaviors. We showed that sorting popular website categories based on traditional web logs differs from the rankings obtained when including missing visits or excluding unintentional visits. We predicted users' future behaviors based on three types of training data: all the visits in modern web logs, the intentional visits in web logs, and the intentional visits plus missing visits in web logs. The experimental results indicate that missing visits in web logs may contain additional information, and unintentional visits in web logs may contain more noise than information for user modeling. Consequently, we need to be careful with the observations and conclusions derived from web log analyses because the web log data could be an incomplete and noisy dataset of a user's visited web pages.

Journal: Journal of Data and Information Quality
Publication status: Published - Jun 2022

