Machine Learning for Efficient Neighbor Selection in Unstructured P2P Networks

Robert Beverly and Mike Afergan
Proceedings of USENIX Tackling Computer Systems Problems with Machine Learning Techniques (SysML 2007),
Cambridge, MA, April 2007

Self-reorganization offers the promise of improved performance, scalability and resilience in Peer-to-Peer (P2P) overlays. In practice however, the benefit of reorganization is often lower than the cost incurred in probing and adapting the overlay. We employ machine learning feature selection in a novel manner: to reduce \emph{communication cost} thereby providing the basis of an efficient neighbor selection scheme for P2P overlays. In addition, our method enables nodes within the overlay to locate and attach to peers that are likely to answer \emph{future queries} with no a priori knowledge of the queries.

We evaluate our neighbor classifier against live data from the Gnutella unstructured P2P network. We find Support Vector Machines with forward fitting predict suitable neighbors for future queries with over 90\% accuracy while requiring minimal ($<$2\% of the features) knowledge of the peer's files or type. By providing a means to effectively and efficiently select neighbors in a self-reorganizing overlay, this work serves as a step forward in bringing such architectures to real-world fruition.

[Postscript(344KB)] [PDF(135KB)] [BibTeX]
[Presentation Slides(1280KB)]