A random walk based approach for improving protein-protein interaction network and protein complex prediction

Date

2011-12

Authors

Lei, Chengwei
Ruan, Jianhua

Publisher

UTSA Department of Computer Science

Abstract

Motivation: Recent advances in high-throughput technology have dramatically increased the availability of protein-protein interaction (PPI) data and stimulated the development of many methods for predicting protein complexes, which are important for understanding the functional organization of PPI networks. However, automated protein complex prediction from PPI data alone is significantly hindered by the high level of noise, the sparseness, and the highly skewed degree distribution of PPI networks. Here we present a novel network topology-based algorithm that removes spurious interactions, recovers missing ones through computational prediction, and increases the accuracy of protein complex prediction by reducing the impact of hub nodes. The key idea of our algorithm is that two proteins sharing high-order topological similarities, as measured by a novel random walk-based procedure, are likely to interact with each other and may belong to the same protein complex.
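The key idea above can be illustrated with a minimal sketch. It is not the procedure from the report: it assumes a standard random walk with restart as the walk model and cosine similarity between visit-probability profiles as the topological similarity, and the toy graph, the function names, and the `restart` and `steps` parameters are all hypothetical. Two unlinked proteins in the same dense neighborhood (A and D) should end up with more similar profiles than two proteins in different neighborhoods (A and F).

```python
import math


def build_adjacency(edges):
    """Turn an undirected edge list into a dict mapping node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_walk_profile(adj, start, restart=0.5, steps=50):
    """Approximate visit probabilities of a random walk with restart from `start`.

    Assumes every node has at least one neighbor, so probability mass is conserved.
    """
    prob = {n: 0.0 for n in adj}
    prob[start] = 1.0
    for _ in range(steps):
        nxt = {n: 0.0 for n in adj}
        for n, mass in prob.items():
            if mass == 0.0:
                continue
            # Spread the non-restarting mass evenly over the neighbors.
            share = (1.0 - restart) * mass / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        # The restarting mass jumps back to the start node.
        nxt[start] += restart
        prob = nxt
    return prob


def cosine_similarity(p, q):
    """Cosine similarity between two profiles defined over the same node set."""
    dot = sum(p[n] * q[n] for n in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Toy network (hypothetical): A and D share neighbors B and C but are not
# linked; E, F, G form a second group attached through the D-E edge.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
adj = build_adjacency(edges)

profile_a = random_walk_profile(adj, "A")
profile_d = random_walk_profile(adj, "D")
profile_f = random_walk_profile(adj, "F")

sim_ad = cosine_similarity(profile_a, profile_d)  # same neighborhood, no edge
sim_af = cosine_similarity(profile_a, profile_f)  # different neighborhoods

print(f"sim(A, D) = {sim_ad:.3f}")
print(f"sim(A, F) = {sim_af:.3f}")
```

In a sketch like this, a pair such as A and D with high similarity but no edge would be a candidate missing interaction, and an existing edge between low-similarity proteins would be a candidate spurious one. Any threshold for making those calls is a separate choice that the abstract does not specify.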

Results: Applying our algorithm to a yeast PPI network, we found that the interactions in the reconstructed network have higher biological relevance than those in the original network, as assessed by multiple types of information, including gene ontology, gene expression, essentiality, conservation between species, and known protein complexes. A comparison with existing methods shows that the network reconstructed by our method has the highest quality. Using two independent graph clustering algorithms, we found that the reconstructed network yields significantly improved accuracy in protein complex prediction. Furthermore, our method is applicable to PPI networks obtained with different experimental systems, such as affinity purification, yeast two-hybrid (Y2H), and protein-fragment complementation assay (PCA), and evidence shows that the predicted edges are likely bona fide physical interactions.

Department

Computer Science