Empirical comparison of local structural similarity indices for collaborative-filtering-based recommender systems Collaborative filtering is one of the most successful recommendation techniques; it predicts users' likely future preferences from their past ones. The key problem of this method is how to define the similarity between users. A standard approach uses the correlation between the ratings that two users give to a set of objects, as in the Cosine index and the Pearson correlation coefficient. However, computing such indices is relatively costly, which makes them impractical for very large systems. To solve this problem, we introduce six local-structure-based similarity indices and compare their performance with the two benchmark indices above. Experimental results on two data sets demonstrate that the structure-based similarity indices overall outperform the Pearson correlation coefficient. When the data is dense, the structure-based indices perform competitively with the Cosine index at lower computational complexity. Furthermore, when the data is sparse, the structure-based indices give even better results than the Cosine index.
Q.-M. Zhang, M.-S. Shang, W. Zeng, Y. Chen, L. Lü* -
Physics Procedia 3, 1887 (2010) -
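The contrast drawn in the abstract above, between rating-correlation indices and local structural indices, can be illustrated with a minimal Python sketch. The abstract does not name the six structural indices, so common neighbours and Jaccard are used here purely as assumed examples of the local-structure family; the function names and the `{object: rating}` representation are also illustrative assumptions, not the paper's implementation.

```python
import math

def cosine(r_u, r_v):
    """Cosine index between two users' ratings ({object: rating} dicts)."""
    common = r_u.keys() & r_v.keys()
    dot = sum(r_u[o] * r_v[o] for o in common)
    norm_u = math.sqrt(sum(x * x for x in r_u.values()))
    norm_v = math.sqrt(sum(x * x for x in r_v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pearson(r_u, r_v):
    """Pearson correlation computed over the objects both users rated."""
    common = r_u.keys() & r_v.keys()
    if len(common) < 2:
        return 0.0
    mean_u = sum(r_u[o] for o in common) / len(common)
    mean_v = sum(r_v[o] for o in common) / len(common)
    num = sum((r_u[o] - mean_u) * (r_v[o] - mean_v) for o in common)
    den_u = math.sqrt(sum((r_u[o] - mean_u) ** 2 for o in common))
    den_v = math.sqrt(sum((r_v[o] - mean_v) ** 2 for o in common))
    return num / (den_u * den_v) if den_u and den_v else 0.0

def common_neighbours(r_u, r_v):
    """Structural index (assumed example): number of co-selected objects."""
    return len(r_u.keys() & r_v.keys())

def jaccard(r_u, r_v):
    """Structural index (assumed example): shared objects over all objects."""
    union = r_u.keys() | r_v.keys()
    return len(r_u.keys() & r_v.keys()) / len(union) if union else 0.0
```

The structural indices ignore the rating values and look only at which objects each user selected, which is why they need only set operations.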
Can Dissimilar Users Contribute to Accuracy and Diversity of Personalized Recommendation? Recommender systems are becoming a popular and important set of personalization
techniques that assist individual users with navigating through the rapidly growing
amount of information. A good recommender system should be able to not only find
out the objects preferred by users, but also help users in discovering their personalized
tastes. The former corresponds to high accuracy of the recommendation, while the latter
to high diversity. A big challenge is to design an algorithm that provides both highly
accurate and diverse recommendation. Traditional recommendation algorithms only take
into account the contributions of similar users, thus, they tend to recommend popular
items for users ignoring the diversity of recommendations. In this paper, we propose
a recommendation algorithm by considering both the effects of similar and dissimilar
users under the framework of collaborative filtering. Extensive analyses on three datasets,
namely MovieLens, Netflix and Amazon, show that our method performs much better
than the standard collaborative filtering algorithm for both accuracy and diversity.
W. Zeng, M.-S. Shang, Q.-M. Zhang, L. Lü*, T. Zhou -
Int. J. Mod. Phys. C 21, 1217 (2010) -
Similarity-Based Classification in Partially Labeled Networks We propose a similarity-based method, using the similarity between nodes, to address the problem of classification in partially labeled networks. The basic assumption is that two nodes are more likely to be categorized into the same class if they are more similar. In this paper, we introduce ten similarity indices, including five local ones and five global ones. Empirical results on the co-purchase network of political books show that the similarity-based method can give highly accurate classification even when the labeled nodes are sparse, which is one of the difficulties in classification. Furthermore, we find that when the target network has many labeled nodes, the local indices can perform as well as the global indices, while when the data is sparse the global indices perform better. In addition, the similarity-based method can to some extent overcome the inconsistency problem, which is another difficulty in classification.
Q.-M. Zhang, M.-S. Shang, L. Lü* -
Int. J. Mod. Phys. C 21, 813 (2010) -
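The basic assumption stated in the abstract above (more similar nodes are more likely to share a class) can be sketched as follows. This is only an illustrative reading: the scoring rule (summing similarities to labeled nodes per class), the use of common neighbours as the similarity index, and the names `build_adjacency` and `classify` are assumptions, not the paper's exact method.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = defaultdict(set)
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    return adj

def classify(adj, labels, node):
    """Assign `node` the class whose labeled members are most similar to it.

    Similarity is the common-neighbour count (an assumed local index).
    Returns None when there are no labeled nodes to compare against.
    """
    scores = defaultdict(float)
    for other, label in labels.items():
        if other != node:
            scores[label] += len(adj[node] & adj[other])
    return max(scores, key=scores.get) if scores else None
```

When every class scores zero, this sketch falls back to whichever class was encountered first; a real implementation would need an explicit tie-breaking rule.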
Empirical analysis of web-based user-object bipartite networks Understanding the structure and evolution of web-based user-object networks is a
significant task since they play a crucial role in e-commerce nowadays. This Letter reports an empirical analysis of two large-scale web sites, audioscrobbler.com and del.icio.us, where users are connected with music groups and bookmarks, respectively. The degree distributions and degree-degree correlations for both users and objects are reported. We propose a new index, named collaborative clustering coefficient, to quantify the clustering behavior based on the collaborative selection. Accordingly, the clustering properties and clustering-degree correlations are investigated. We report some novel phenomena that characterize the selection mechanism of web users and outline the relevance of these phenomena to the information recommendation problem.
M.-S. Shang, L. Lü, Y.-C. Zhang, T. Zhou -
EPL 90, 48006 (2010) -
Link prediction in weighted networks: the role of weak ties Many link prediction algorithms have been proposed and applied to various real networks, but they rarely take the weights of links into account. In this letter, we use local similarity indices to estimate the likelihood of the existence of links in
weighted networks, including Common Neighbor, Adamic-Adar Index, Resource Allocation Index, and their weighted versions. We have tested the prediction accuracy on real social, technological and biological networks. Overall, the resource allocation index performs best. To our surprise, the weighted indices sometimes perform even worse than the unweighted ones, which reminds us of the well-known Weak-Ties Theory. Further experimental study shows that weak ties play a significant role in link prediction, and that emphasizing their contributions can remarkably enhance the prediction accuracy for some networks. We give a semi-quantitative
explanation based on motif analysis. This letter provides a starting point for a possible weak-ties theory in information retrieval.
L. Lü, T. Zhou -
EPL 89, 18001 (2010) -
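The three unweighted local indices named in the abstract above have standard definitions, sketched here in Python for a pair of nodes x and y with common neighbours z of degree k_z: Common Neighbor counts the z, Adamic-Adar sums 1/log(k_z), and Resource Allocation sums 1/k_z. The weighted versions studied in the letter are not reproduced, since the abstract does not give their form; the helper name `build_adjacency` and the edge-list input are illustrative assumptions.

```python
import math

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    return adj

def common_neighbor(adj, x, y):
    """Number of nodes adjacent to both x and y."""
    return len(adj[x] & adj[y])

def adamic_adar(adj, x, y):
    """Sum of 1/log(degree) over common neighbours.

    A common neighbour of two distinct nodes has degree >= 2,
    so the logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])

def resource_allocation(adj, x, y):
    """Sum of 1/degree over common neighbours."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])
```

Both Adamic-Adar and Resource Allocation down-weight high-degree common neighbours; Resource Allocation penalizes them more strongly because 1/k falls faster than 1/log(k).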