10.6084/m9.figshare.4585270.v1
Mahdi Jalili, Yasin Orouskhani, Milad Asgari, Nazanin Alipourfard, Matjaž Perc
The ESM zip contains three files. The file fedges.txt contains the edges that define the network, the file tedges.txt contains the edges between the different layers of the network, and the file twitter_foursquare_mapper.dat provides the basic information for each node of the network, as stated in the first row. From "Link prediction in multiplex online social networks".
The Royal Society
2017
social networks, complex networks, multiplex networks, signed networks, link prediction, SVM, naive Bayes, K-nearest neighbour, machine learning
2017-01-25 14:56:51
Dataset
https://rs.figshare.com/articles/dataset/The_ESM_zip_contains_three_files_The_file_fedges_txt_are_the_edges_that_define_the_network_the_file_tedges_txt_are_the_edges_between_the_different_layers_of_the_network_while_data_in_the_file_twitter_foursquare_mapper_dat_provides_the_basic_info_of_each_n/4585270
Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications, such as recommending new items to users, suggesting friendships and discovering spurious connections. In many real social networks, connections evolve across multiple layers (e.g. multiple social networking platforms). In this manuscript, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider the social networks of the same users on these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in the Foursquare network.
Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the two layers are not highly correlated, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier achieves the best performance, with an average accuracy of 89%.
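
The meta-path idea mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact meta-path definitions, so the code below assumes the simplest variant: for a candidate Foursquare link (u, v), it counts two-step paths u - w - v whose hops lie in either layer, and adds a flag for a direct Twitter link. The function names, the undirected treatment of edges, the shared node IDs across layers and the toy edge lists are all assumptions made for illustration; none of them come from the dataset files.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def count_two_step_paths(first, second, u, v):
    """Count paths u - w - v with the first hop in `first` and the second in `second`."""
    return sum(
        1
        for w in first.get(u, ())
        if w != u and w != v and v in second.get(w, ())
    )


def meta_path_features(foursquare, twitter, u, v):
    """Hypothetical feature vector for the candidate Foursquare link (u, v)."""
    layers = {"F": foursquare, "T": twitter}
    features = {}
    for a in "FT":
        for b in "FT":
            # e.g. "TF": first hop in Twitter, second hop in Foursquare
            features[a + b] = count_two_step_paths(layers[a], layers[b], u, v)
    # Cross-layer signal: are u and v already linked in Twitter?
    features["T"] = int(v in twitter.get(u, ()))
    return features


# Toy layers over the same four users (made-up data, not from the dataset).
foursquare = build_adjacency([(1, 2), (2, 3), (3, 4)])
twitter = build_adjacency([(1, 3), (1, 4), (2, 4)])

print(meta_path_features(foursquare, twitter, 1, 4))
# {'FF': 0, 'FT': 1, 'TF': 1, 'TT': 0, 'T': 1}
```

In a setup like the one the abstract describes, feature vectors of this kind would be computed for linked and non-linked node pairs and passed to a classifier such as an SVM. Dropping the keys that involve the Twitter layer gives a single-layer baseline against which the value of the cross-layer information can be compared.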