DOI: 10.1093/bioinformatics/btae112 ISSN: 1367-4811

Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes

Florian J Gisdon, Mariella Zunker, Jan Niclas Wolf, Kai Prüfer, Jörg Ackermann, Christoph Welsch, Ina Koch
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability

Abstract

Motivation

The functional complexity of biochemical processes is strongly related to the interplay of proteins and their assembly into protein complexes. In recent years, the discovery and characterization of protein complexes have substantially progressed through advances in cryo-electron microscopy, proteomics, and computational structure prediction. This development results in a strong need for computational approaches to analyse the data of large protein complexes for structural and functional characterization. Here, we aim to provide an approach that can process the growing number of large protein complexes and yield biologically meaningful information on the hierarchical organization of their structures.

Methods

We modelled the quaternary structure of protein complexes as undirected, labelled graphs called complex graphs. In complex graphs, the vertices represent protein chains and the edges spatial chain-chain contacts. We hypothesized that clusters based on the complex graph correspond to functional biological modules. To compute the clusters, we applied the Leiden clustering algorithm.
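As a minimal sketch of the graph model described above, the following Python snippet builds a small complex graph from chain-chain contacts and scores candidate clusterings by weighted modularity, one quality function that Leiden clustering can optimise. The chain identifiers, contact counts, the use of contact counts as edge weights, and the helper names `build_complex_graph` and `modularity` are all assumptions made for illustration. The snippet does not implement the Leiden algorithm itself, and it is not stated here which quality function the authors used.

```python
from collections import defaultdict

# Hypothetical chain-chain contacts: (chain_a, chain_b, number of residue contacts).
# Chain IDs and counts are invented for illustration, not taken from a real structure.
CONTACTS = [
    ("A", "B", 12), ("A", "C", 9), ("B", "C", 15),
    ("D", "E", 11), ("D", "F", 8), ("E", "F", 14),
    ("C", "D", 2),  # weak contact bridging the two dense groups
]


def build_complex_graph(contacts):
    """Return an undirected weighted graph as {chain: {neighbour: weight}}."""
    graph = defaultdict(dict)
    for chain_a, chain_b, weight in contacts:
        graph[chain_a][chain_b] = weight
        graph[chain_b][chain_a] = weight
    return dict(graph)


def modularity(graph, membership):
    """Weighted Newman modularity of a partition given as {chain: cluster_id}."""
    strength = {v: sum(nbrs.values()) for v, nbrs in graph.items()}
    two_m = sum(strength.values())  # every edge weight is counted twice
    internal = defaultdict(float)   # twice the internal edge weight per cluster
    total = defaultdict(float)      # summed vertex strength per cluster
    for v, nbrs in graph.items():
        cluster = membership[v]
        total[cluster] += strength[v]
        for u, weight in nbrs.items():
            if membership[u] == cluster:
                internal[cluster] += weight
    return sum(internal[c] / two_m - (total[c] / two_m) ** 2 for c in total)


graph = build_complex_graph(CONTACTS)

# Two candidate clusterings: two modules versus one module holding all chains.
two_modules = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_module = {chain: 0 for chain in graph}

print("two modules:", round(modularity(graph, two_modules), 4))
print("one module: ", round(modularity(graph, one_module), 4))
```

In this toy graph the two-module partition scores higher than the single-module one, which matches the intuition that densely contacting chains form a module. In practice, the search over partitions would be left to an existing Leiden implementation, for example the `leidenalg` package used with `python-igraph`, rather than enumerated by hand.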

Results

To evaluate our approach, we chose the human respiratory complex I, which has been extensively investigated and has a known, experimentally validated biological module structure. Additionally, we characterised the eukaryotic group II chaperonin TRiC/CCT and the head of the bacteriophage Φ29. The results of the analysis were consistent with experimental findings and indicated known functional biological modules.

Conclusion

Our approach enables not only the prediction of functional biological modules in large protein complexes but also the investigation of the flexibility of specific regions. The predicted modules can aid in the planning and analysis of experiments.

Availability

Jupyter notebooks to reproduce the examples are available in our public GitHub repository: https://github.com/MolBIFFM/PTGLtools/tree/main/PTGLmodulePrediction.

Supplementary Information

Supplementary data are available at Bioinformatics online.
