Current Protein and Peptide Science - Volume 11, Issue 7, 2010
-
-
Editorial [Hot topic: Protein Folding, Stability and Interactions (Guest Editor: M. Michael Gromiha)]
Proteins and their interactions play vital roles in living organisms. Elucidating the mechanism of protein folding as well as the recognition of protein complexes are intriguing and challenging problems in protein science. Recent years have seen tremendous advances in our knowledge of protein folding, stability and interactions. These problems have been viewed from several perspectives using experimental and computational approaches. The special issue on “Protein folding, stability and interactions” aims to provide a recent update on the folding and stability of proteins and their interactions with other molecules (proteins, nucleic acids, carbohydrates and ligands). It also addresses the significance of the analysis of protein structures, secondary and tertiary structure predictions and folding rates of proteins. The special issue is broadly classified into two parts: the first part, with seven articles, focuses on aspects of protein folding and stability, and the second part, with five papers, is devoted to protein interactions.
The opening article by Shenoy and Jayaram [1] provides the state of the art in protein three-dimensional structure prediction along with disordered proteins and protein-protein interactions. Gaspari et al. [2] analyzed the dynamically restrained conformational ensemble of proteins generated from residual dipolar coupling data in terms of protruding and buried atoms as well as inter-atomic distances, with ubiquitin as an example. Galzitskaya [3] proposed Monte Carlo and capillarity models for understanding and predicting protein folding rates using protein three-dimensional structures. Dhir et al. [4] give an overview of the SBASE domain library approach, which combines a curated collection of domain sequences with standard similarity search algorithms and is based on simple statistics of the domain similarity network. They showed that this method is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. Tusnady and Simon [5] discussed the current status of topology prediction in transmembrane helical proteins and the annotation of such proteins in genomic sequences. Verma et al. [6] employed a minimal Distance Constraint Model to predict the stability of a series of lysozyme mutants and proposed probable mutants for characterization. Horst et al. [7] studied the effects of missense mutations on functional stability with protein sequence analysis algorithms that discern spatial distributions of sequence, evolutionary, and physicochemical conservation.
Rashid et al. [8] developed a support vector machine (SVM) based approach using position specific scoring matrices (PSSM) for identifying the interacting proteins in genomic sequences. Bartoli et al. [9] used a combination of hidden Markov models and support vector machines to predict the interaction sites in genome-wide interaction networks. Zhang et al. [10] analyzed a set of RNA binding proteins and developed a method for identifying the RNA binding residues in proteins using evolutionary conservation and predicted secondary structure and solvent accessibility. Gromiha et al. [11] studied the recognition mechanism of protein-RNA complexes with a novel energy-based approach. Carugo and Carugo [12] reviewed the structures of human filamin with emphasis on the relationship between structure, function and interaction.
In essence, this special issue covers the exciting developments in the area of protein folding, stability and interactions, and it will be a valuable resource for computational biologists, biochemists, biophysicists, bioinformaticians and researchers working in the field of proteins.
I would like to thank all the authors for their outstanding contributions and their cooperation in completing the task. The guest editor also thanks the Editor-in-Chief, Professor Ben M. Dunn, for his invitation, encouragement, and support for the successful completion of the special issue.
-
-
-
Proteins: Sequence to Structure and Function - Current Status
Authors: Sandhya R. Shenoy and B. Jayaram
In an era that has been dominated by Structural Biology for the last 30-40 years, a dramatic change of focus towards sequence analysis has been spurred by the advent of the genome projects and the resultant diverging sequence/structure deficit. The central challenge of Computational Structural Biology is therefore to rationalize the mass of sequence information into biochemical and biophysical knowledge and to decipher the structural, functional and evolutionary clues encoded in the language of biological sequences. In investigating the meaning of sequences, two distinct analytical themes have emerged: in the first approach, pattern recognition techniques are used to detect similarity between sequences and hence to infer related structures and functions; in the second, ab initio prediction methods are used to deduce 3D structure, and ultimately to infer function, directly from the linear sequence. In this article, we attempt to provide a critical assessment of what one may and may not expect from the biological sequences and to identify major issues yet to be resolved. The presentation is organized under several subheadings: protein sequences, pattern recognition techniques, protein tertiary structure prediction, membrane protein bioinformatics, human proteome, protein-protein interactions, metabolic networks, potential drug targets based on simple sequence properties, disordered proteins, the sequence-structure relationship and chemical logic of protein sequences.
-
-
-
Probing Dynamic Protein Ensembles with Atomic Proximity Measures
The emerging role of internal dynamics in protein fold and function requires new avenues of structure analysis. We analyzed the dynamically restrained conformational ensemble of ubiquitin generated from residual dipolar coupling data, in terms of protruding and buried atoms as well as interatomic distances, using four proximity-based algorithms: CX, DPX, PRIDE and PRIDE-NMR (http://hydra.icgeb.trieste.it/protein/). We found that ubiquitin, a relatively rigid molecule, has a highly diverse dynamic ensemble. The environment of protruding atoms is highly variable across conformers; in contrast, only some of the buried atoms tend to fluctuate. The variability of the ensemble cautions against the use of single conformers when explaining functional phenomena. We also give a detailed evaluation of PRIDE-NMR on a wide dataset and discuss its usage in the light of the features of available NMR distance restraint sets in public databases.
-
-
-
Estimation of Protein Folding Rate from Monte Carlo Simulations and Entropy Capacity
The problem of protein self-organization is one of the most important problems of molecular biology today. Despite recent success in understanding the general principles of protein folding, details of this process are yet to be elucidated. Moreover, the prediction of protein folding rates has its own practical value because aggregation directly depends on the rate of protein folding. The time of folding has been calculated for 67 proteins with known experimental data at the point of thermodynamic equilibrium between unfolded and native states using a Monte Carlo model where each residue is considered to be either folded as in the native state or completely disordered. The times of folding for the 67 proteins which reach the native state within the limit of 10^8 Monte Carlo steps are in good correlation with the experimentally measured folding rate at the mid-transition point (the correlation coefficient is -0.82). Theoretical consideration of a capillarity model for the process of protein folding demonstrates that the difference in the folding rate for proteins sharing more spherical and less spherical folds is the result of differences in the conformational entropy due to a larger surface of the boundary between folded and unfolded phases in the transition state for proteins with a more spherical fold. The capillarity model allows us to predict the folding rate at the same level of correlation as by Monte Carlo simulations. The calculated model entropy capacity (conformational entropy per residue divided by the average contact energy per residue) for 67 proteins correlates by about 78% with the experimentally measured folding rate at the mid-transition point.
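The two quantities named in this abstract can be illustrated with a minimal sketch. The entropy capacity follows the definition quoted in the abstract; the function names and all numbers below are hypothetical toy values chosen for illustration, not data or code from the paper.

```python
import math


def entropy_capacity(entropy_per_residue, contact_energy_per_residue):
    """Conformational entropy per residue divided by the average
    contact energy per residue (definition quoted in the abstract)."""
    if contact_energy_per_residue == 0:
        raise ValueError("contact energy per residue must be non-zero")
    return entropy_per_residue / contact_energy_per_residue


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (entropy, contact energy) pairs, one per protein.
toy_proteins = [(2.0, 1.0), (2.4, 1.0), (3.0, 1.2), (3.3, 1.1)]
capacities = [entropy_capacity(s, e) for s, e in toy_proteins]

# Hypothetical log folding rates for the same four proteins.
toy_log_rates = [4.0, 3.1, 2.5, 1.2]

r = pearson(capacities, toy_log_rates)
print(f"toy correlation: {r:.2f}")
```

With real data, the list of capacities would be replaced by values computed from structures and the rates by the experimentally measured folding rates at the mid-transition point.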
-
-
-
Detecting Atypical Examples of Known Domain Types by Sequence Similarity Searching: The SBASE Domain Library Approach
SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992, Pongor et al., Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing which is based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.
-
-
-
Topology Prediction of Helical Transmembrane Proteins: How Far Have We Reached?
Authors: Gabor E. Tusnady and Istvan Simon
Transmembrane protein topology prediction methods play important roles in structural biology, because the structure determination of these types of proteins is extremely difficult by the common biophysical, biochemical and molecular biological methods. The need for accurate prediction methods is high, as the number of known membrane protein structures falls far behind the estimated number of these proteins in various genomes. The accuracy of these prediction methods appears to be higher than that of most prediction methods applied to globular proteins; however, it decreases slightly with the increasing number of structures. Unfortunately, most prediction algorithms use common machine learning techniques, and they do not reveal why topologies are predicted with such a high success rate or which biophysical or biochemical properties are important to achieve this level of accuracy. Incorporating the topology data determined so far into the prediction methods as constraints helps to reach even higher prediction accuracy; therefore, the collection of such topology data is also an important issue.
-
-
-
Predicting the Melting Point of Human C-Type Lysozyme Mutants
Authors: Deeptak Verma, Donald J. Jacobs and Dennis R. Livesay
A complete understanding of the relationships between protein structure and stability remains an open problem. Much of our insight comes from laborious experimental analyses that perturb structure via directed mutation. The glycolytic enzyme lysozyme is among the best characterized proteins under this paradigm, due to its abundance and ease of manipulation. To speed up such analyses, efficient computational models that can accurately predict mutation effects are needed. We employ a minimal Distance Constraint Model (mDCM) to predict the stability of a series of lysozyme mutants (specifically, human wild-type C-type lysozyme and 14 point mutations). The mDCM has three phenomenological parameters that characterize microscopic interactions; they are determined by minimizing the least-squares error between predicted and experimental heat capacity curves. The mutants are chemically and structurally diverse, but have been experimentally characterized under nearly identical thermodynamic conditions (pH, ionic strength, etc.). The parameters found from best fits to heat capacity curves for one or more lysozyme structures are subsequently used to predict the heat capacity of the remaining structures. We simulate a typical experimental situation, where prediction of relative stabilities in an untested mutated structure is based on known results as they accumulate. From the statistical significance of these simulations, we establish that the mDCM is a viable predictor for relative stability of protein mutants. Remarkably, using parameters from any single fitting yields an average percent error of 4.3%. Across the dataset, the mDCM reproduces experimental trends sufficiently well (R = 0.64) to be of practical value to experimentalists when deciding which mutations merit the time and funds needed for characterization.
-
-
-
Disease Risk of Missense Mutations Using Structural Inference from Predicted Function
Authors: Jeremy A. Horst, Kai Wang, Orapin V. Horst, Michael L. Cunningham and Ram Samudrala
Advancements in sequencing techniques place personalized genomic medicine on the horizon, bringing along the responsibility of clinicians to understand the likelihood for a mutation to cause disease, and of scientists to separate etiology from nonpathologic variability. Pathogenicity is discernible from patterns of interactions between a missense mutation, the surrounding protein structure, and intermolecular interactions. Physicochemical stability calculations are not accessible without structures, as is the case for the vast majority of human proteins, so diagnostic accuracy remains in its infancy. To model the effects of missense mutations on functional stability without structure, we combine novel protein sequence analysis algorithms to discern spatial distributions of sequence, evolutionary, and physicochemical conservation, through a new approach to optimize component selection. Novel components include a combinatory substitution matrix and two heuristic algorithms that detect positions which confer structural support to interaction interfaces. The method reaches 0.91 AUC in ten-fold cross-validation to predict alteration of function for 6,392 in vitro mutations. For clinical utility we trained the method on 7,022 disease-associated missense mutations within Online Mendelian Inheritance in Man (OMIM) amongst a larger randomized set. In a blinded prospective test to delineate mutations unique to 186 patients with craniosynostosis from those in the 95 highly variant Coriell controls and 2000 control chromosomes, we achieved roughly 1/3 sensitivity and perfect specificity. The component algorithms retained during machine learning constitute novel protein sequence analysis techniques to describe environments supporting neutrality or pathology of mutations.
This approach to pathogenetics enables new insight into the mechanistic relationship of missense mutations to disease phenotypes in our patients.
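For readers less familiar with the two metrics quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts are made-up toy values, chosen only so that the result resembles "roughly 1/3 sensitivity and perfect specificity"; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that are predicted positive."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive examples")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of actual negatives that are predicted negative."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative examples")
    return true_neg / total


# Hypothetical counts: 10 of 30 pathogenic mutations are detected,
# and none of 50 neutral mutations is flagged.
toy_sensitivity = sensitivity(true_pos=10, false_neg=20)
toy_specificity = specificity(true_neg=50, false_pos=0)
print(toy_sensitivity, toy_specificity)
```

A predictor with this profile misses most pathogenic mutations but raises no false alarms among the controls.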
-
-
-
A Simple Approach for Predicting Protein-Protein Interactions
Authors: Mamoon Rashid, Sumathy Ramasamy and Gajendra P.S. Raghava
The availability of an increased number of fully sequenced genomes demands functional interpretation of the genomic information. Despite high-throughput experimental techniques and in silico methods of predicting protein-protein interaction (PPI), the interactome of most organisms is far from complete. Thus, predicting the interactome of an organism is one of the major challenges in the post-genomic era. This manuscript describes Support Vector Machine (SVM) based models that have been developed for discriminating interacting and non-interacting pairs of proteins from their amino acid sequence. We have developed SVM models using various types of sequence composition, e.g. amino acid, dipeptide, biochemical property, split amino acid and pseudo amino acid composition. We also developed SVM models using evolutionary information in the form of Position Specific Scoring Matrix (PSSM) composition. We achieved maximum Matthews correlation coefficient (MCC) of 1.00, 0.52 and 0.74 for Escherichia coli, Saccharomyces cerevisiae, and Helicobacter pylori, using a dipeptide-based SVM model at the default threshold. It was observed that the performance of a prediction model depends on the dataset used for training and testing. In the case of E. coli, MCC decreased from 1.0 to 0.67 when evaluated on a new dataset. In order to understand PPI in different cellular environments, we developed species-specific and general models. It was observed that species-specific models are more accurate than general models. We conclude that the primary amino acid sequence based descriptors could be used to differentiate interacting from non-interacting protein pairs. Some amino acids tend to be favored more in interacting pairs than in non-interacting ones. Finally, a web server has been developed for predicting protein-protein interactions.
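The dipeptide composition and the Matthews correlation coefficient mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say how the features of the two proteins in a pair are combined, so the concatenation in `pair_features` is a hypothetical choice, and all function names and sequences are invented for the example.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element vector of dipeptide frequencies.

    Overlapping pairs containing non-standard letters are skipped.
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def pair_features(seq_a, seq_b):
    """Assumed encoding of a protein pair: concatenate both vectors."""
    return dipeptide_composition(seq_a) + dipeptide_composition(seq_b)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Toy sequences, not real proteins.
features = pair_features("ACDA", "MKVLA")
print(len(features))
```

A vector of this kind could then be passed to any SVM implementation; the choice of kernel and threshold is outside the scope of this sketch.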
-
-
-
The Prediction of Protein-Protein Interacting Sites in Genome-Wide Protein Interaction Networks: The Test Case of the Human Cell Cycle
Authors: L. Bartoli, P. L. Martelli, I. Rossi, P. Fariselli and R. Casadio
In this paper we investigate possible correlations between the number of putative interaction patches of a given protein, as inferred by an algorithm that we have developed, and its degree (number of edges of the protein node in a protein interaction network). We focus on the human cell cycle that, as compared with other biological processes, comprises the largest number of proteins whose structure is known at atomic resolution both as monomers and as interacting complexes. For predicting interaction patches we specifically develop a HM-SVM based method reaching 71% overall accuracy with a correlation coefficient value equal to 0.43 on a non-redundant set of protein complexes. To test the biological meaning of our predictions, we also explore whether interacting patches contain energetically important residues and/or disease-related mutations and find that predicted patches are endowed with both features. Based on this, we propose that mapping the protein with all the predicted interaction patches bridges the molecule to the interactome at the cell level. To test our hypothesis we downloaded interaction data from interaction databases and found that the number of predicted interaction patches significantly correlates (Pearson correlation value >0.3) with the number of the known interactions (edges) per protein in the human interactome, as contained in MINT and IntAct. We also show that the correlation increases (Pearson correlation value >0.5) when the subcellular co-localization and the co-expression levels of the interacting partners are taken into account.
-
-
-
Analysis and Prediction of RNA-Binding Residues Using Sequence, Evolutionary Conservation, and Predicted Secondary Structure and Solvent Accessibility
Authors: Tuo Zhang, Hua Zhang, Ke Chen, Jishou Ruan, Shiyi Shen and Lukasz Kurgan
Identification and prediction of RNA-binding residues (RBRs) provides valuable insights into the mechanisms of protein-RNA interactions. We analyzed the contributions of a wide range of factors including amino acid sequence, evolutionary conservation, secondary structure and solvent accessibility, to the prediction/characterization of RBRs. Five feature sets were designed and feature selection was performed to find and investigate relevant features. We demonstrate that (1) interactions with positively charged amino acids Arg and Lys are preferred by the negatively charged nucleotides; (2) Gly provides flexibility for the RNA binding sites; (3) Glu with negatively charged side chain and several hydrophobic residues such as Leu, Val, Ala and Phe are disfavored in the RNA-binding sites; (4) coil residues, especially in long segments, are more flexible (than other secondary structures) and more likely to interact with RNA; (5) helical residues are more rigid and consequently they are less likely to bind RNA; and (6) residues partially exposed to the solvent are more likely to form RNA-binding sites. We introduce a novel sequence-based predictor of RBRs, RBRpred, which utilizes the selected features. RBRpred is comprehensively tested on three datasets with varied atom distance cutoffs by performing both five-fold cross validation and jackknife tests and achieves Matthews correlation coefficient (MCC) of 0.51, 0.48 and 0.42, respectively. The quality is comparable to or better than that for state-of-the-art predictors that apply the distance-based cutoff definition. We show that the most important factor for RBRs prediction is evolutionary conservation, followed by the amino acid sequence, predicted secondary structure and predicted solvent accessibility. We also investigate the impact of using native vs. predicted secondary structure and solvent accessibility.
The predicted values are sufficient for RBR prediction, and knowledge of the actual solvent accessibility helps in predictions for lower distance cutoffs.
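The distance-based cutoff definition of an RNA-binding residue referred to in this abstract can be sketched as below: a residue is labeled as binding when any of its atoms lies within a chosen distance of any RNA atom. The data layout, the function name and the cutoff values are assumptions made for illustration; the abstract does not state the cutoffs used for the three datasets.

```python
import math


def label_binding_residues(protein_atoms, rna_atoms, cutoff):
    """Label each residue True if any of its atoms is within `cutoff`
    (same units as the coordinates) of any RNA atom.

    protein_atoms: dict mapping a residue id to a list of (x, y, z) tuples.
    rna_atoms: list of (x, y, z) tuples.
    """
    labels = {}
    for residue_id, atoms in protein_atoms.items():
        labels[residue_id] = any(
            math.dist(atom, rna_atom) <= cutoff
            for atom in atoms
            for rna_atom in rna_atoms
        )
    return labels


# Toy coordinates, not taken from any real structure.
toy_protein = {
    "ARG10": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "LEU25": [(20.0, 0.0, 0.0)],
}
toy_rna = [(3.0, 0.0, 0.0)]

# 3.5 is an arbitrary example cutoff, not a value from the paper.
labels = label_binding_residues(toy_protein, toy_rna, cutoff=3.5)
print(labels)
```

Lowering the cutoff shrinks the set of residues labeled as binding, which is why a predictor evaluated on datasets built with different cutoffs reports a different score for each.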
-
-
-
Understanding the Recognition Mechanism of Protein-RNA Complexes Using Energy Based Approach
Authors: M. Michael Gromiha, Kiyonobu Yokota and Kazuhiko Fukui
Protein-RNA interactions perform diverse functions within the cell. Understanding the recognition mechanism of protein-RNA complexes is a challenging task in molecular and computational biology. In this work, we have developed an energy-based approach for identifying the binding sites and important residues for binding in protein-RNA complexes. The new approach considers the repulsive interactions as well as the effect of distance between the atoms in protein and RNA in terms of interaction energy, which are not considered in traditional distance-based methods to identify the binding sites. We found that the positively charged, polar and aromatic residues are important for binding. These residues tend to form electrostatic, hydrogen bonding and stacking interactions. Our observations were compared with the experimental binding specificity of protein-RNA complexes and showed good agreement. Further, the propensities of residues/nucleotides in the binding sites of proteins/RNA and their atomic contributions have been derived. Based on these results we have proposed a novel mechanism for the recognition of protein-RNA complexes: the charged and polar residues in proteins initiate recognition with RNA by making electrostatic and hydrogen bonding interactions between them; the aromatic side chains tend to form aromatic-aromatic interactions; and the hydrophobic residues help to stabilize the complex.
-
-
-
Structural Portrait of Filamin Interaction Mechanisms
Authors: Kristina Djinovic-Carugo and Oliviero Carugo
We review the most recent findings on human filamin structure, with particular emphasis on the relationships between structure, function, and interaction. Filamin is a cytoskeletal actin-binding protein and it is therefore crucial in providing cells with the necessary mechanical and dynamical properties. Filamentous actin cross-linking by filamin is regulated by a number of other proteins and the molecular mechanisms of this complex interaction network can be understood by highlighting the structural features of isolated filamin moieties and of their complexes with several partners. Here we describe first the structure-function relationships of the isolated filamin, its flexibility, and its dimerization mechanism. Secondly, we illustrate the structural mechanism with which filamin can recognize its partners, both the actin filaments and the regulatory proteins.