## 摘要

Egocentric network (ego network) data are very important for evaluating algorithms and machine learning approaches in Online Social Networks (OSNs). Nevertheless, obtaining the ego network data from OSNs is not a trivial task. Conventional manual approaches are time-consuming, and sometimes the ego network data are quite incomplete because only a small number of users would agree to provide their data. This is because there are two important factors that should be considered simultaneously for this data acquisition task: i) users' willingness to provide their data, and ii) the structure of the ego network. However, addressing the above two factors to obtain the more complete ego network data has not received much research attention. Therefore, in this paper, we address this issue by proposing a family of new research problems. The first proposed problem, named <italic>Willingness Maximization for Ego Network Extraction in Online Social Networks (WMEgo)</italic>, identifies a set of ego networks from a single OSN, such that the willingness of the users to provide their data is maximized. We prove that WMEgo is NP-hard and propose a <inline-formula><tex-math notation="LaTeX">$\frac{1}{2}(1-\frac{1}{e})$</tex-math></inline-formula>-approximation algorithm, named <italic>Ego Network Identification with Maximum Willingness (EIMW)</italic>. Furthermore, we extend the idea of WMEgo to multiple social networks and formulate a new research problem, named <italic>Willingness Maximization on Multiple Social Networks for Ego Network Extraction (WM <inline-formula><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula> Ego)</italic>, which is able to effectively obtain ego network data from multiple social networks <italic>simultaneously</italic>. We propose a <inline-formula><tex-math notation="LaTeX">$\frac{1}{2}$</tex-math></inline-formula>-approximation algorithm, named <italic><underline>M</underline>aximum Expansion for <underline>UN</underline>ified <underline>EX</underline>penses (MUNEX)</italic> for a special case of WM <inline-formula><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula> Ego and then design a constant-ratio approximation algorithm to the general WM <inline-formula><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula> Ego problem, named <italic>Maximum Expansion with Expense Examination (M3E)</italic>. We conduct two evaluation studies with 672 and 1052 volunteers to validate the proposed WMEgo and WM <inline-formula><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula> Ego problems, respectively, and show that they are able to obtain much more complete ego network data compared to other baselines. We also perform extensive experiments on multiple real datasets to demonstrate that the proposed approaches significantly outperform the other baselines.

原文 | ???core.languages.en_GB??? |
---|---|

頁（從 - 到） | 1-14 |

頁數 | 14 |

期刊 | IEEE Transactions on Knowledge and Data Engineering |

DOIs | |

出版狀態 | 已被接受 - 2022 |