Perspective

Protein complexes, big data, machine learning and integrative proteomics: lessons learned over a decade of systematic analysis of protein interaction networks

Pages 845-855 | Received 11 Sep 2016, Accepted 29 Aug 2017, Published online: 18 Sep 2017

