Current Proteomics - Volume 6, Issue 4, 2009
Editorial
Nowadays there is an explosion in the use of Topological Indices (TIs) and Connectivity Indices (CIs), derived from graph theory, to study Complex Networks across a broad spectrum of topics related to Bioinformatics and Proteomics. These topics span many biomedical fields, from Virology, Parasitology, and Microbiology in general to Toxicology and Cancer research, to cite only some of the most investigated. The main reason for the success of TIs/CIs is the high flexibility of this theory, which solves many apparently unrelated problems in all these disciplines in a fast but rigorous way. This has driven the recent development of several interesting software packages and theoretical methods to handle structure-function information and data mining in this field. In a recent preliminary review of the field (Gonzalez-Diaz H et al. Proteomics (2008) 8, 750-778), we noted that these software packages and methods may work at different structural levels, including: the structure of protein ligand drugs; protein structure; protein-protein, protein-DNA, and other types of interactions involving proteins; protein-mediated cell-to-cell or organism-organism interactions; the numerical description of 2D electrophoretic proteomics maps; the prediction of protein fragmentation connected to mass spectra; and the numerical description of whole blood proteome mass spectra, among other topics. In any case, it is very difficult to condense all this information into a single manuscript. A topical issue is therefore necessary, because many users of these programs confine themselves to a narrow field of application and are unaware of the several applications at other proteomics levels. On the other hand, many researchers who work at the frontiers of these fields lack a journal issue reviewing the current applications and future perspectives of these software packages and methods, and the possible data-flow relationships between them in a common theoretical framework.
Such a collection of papers could be of major interest to many specialists in proteomics and may increase the exchange between specialists in different but related fields with a common root: proteomics and graph theory. In addition, it could be the seed for further improvement of software performance and compatibility. Taking all these aspects into consideration, Current Proteomics presents this special issue, composed of a collection of papers devoted to reviewing the common theoretical basis, applications, and interconnections between the inputs and outputs of some of the most used Cheminformatics-Bioinformatics and Data Mining software or methods (about one paper per method) that enable the calculation of TIs and their applications to Proteomics. We hope that the present issue may serve as a bridge between theoretical scientists in graph theory and experimentalists in proteomics, suggesting new areas of mutual exchange and collaboration.
Topological Charge-Transfer Indices: From Small Molecules to Proteins
Authors: Francisco Torrens and Gloria Castellano
Valence-topological charge-transfer indices are applied to the calculation of the dipole moment and of the pH at the isoelectric point. Dipole moments calculated by algebraic and vector semisums of charge-transfer indices are defined. The ability of the indices to describe the molecular charge distribution is established by comparing them with the dipole moments of the valence-isoelectronic series cyclopentadiene-benzene-styrene. Two charge-transfer indices are proposed: the vector semisums μ_vec and μ_vec^V. The μ_vec^V value is intermediate between μ_vec and μ_experiment. The steric effect is almost constant along the series, and the dominating effect is electronic. The indices are applied to the calculation of the dipole moments of a homologous series of percutaneous enhancers and of the isoelectric points of 21 amino acids. In most fits, no superimposition of the corresponding G_k-J_k / G_k^V-J_k^V pairs is observed, which diminishes the risk of collinearity. The inclusion of heteroatoms in the π-electron systems is beneficial for the description of the isoelectric point, because of either the additional p-orbitals provided by the heteroatom or the role of steric factors in the π-electron conjugation. The use of (valence) charge-transfer indices alone gives limited results for amino-acid isoelectric points. Including the number of acidic/basic groups improves the models, especially for amino acids with more than two functional groups. The fitting line for the 21 amino acids is used to estimate the lysozyme isoelectric point by replacing (1+Δn/nT) with (M+Δn)/nT. The lysozyme fragment results estimate the isoelectric point of the whole protein within 1-13% error.
QSAR Models for Proteins of Parasitic Organisms, Plants and Human Guests: Theory, Applications, Legal Protection, Taxes, and Regulatory Issues
Quantitative Structure-Property Relationship (QSPR) models based on Graph or Network theory are important for representing and predicting properties of interest of low-molecular-weight compounds. The graph parameters called Topological Indices (TIs) are useful for linking molecular structure with physicochemical and biological properties. Recently, there have also been efforts to extend these methods to the study of proteins and whole proteomes. In this case we speak of Quantitative Protein/Proteome-Property Relationship (QPPR) models, by analogy with QSPR. In the present work we review, discuss, and outline some perspectives on the use of these QPPR techniques applied to single proteins of Parasitic Organisms, Plants and Human Guests. We place emphasis on the different types of graph and network representations of proteins, the structural information codified by different protein TIs, the statistical or machine learning techniques used, and the biological properties predicted. This article also provides a reference to the various legal avenues available for the protection of software used in protein QSAR, as well as to the acceptance and legal treatment of scientific results and techniques derived from such software. We also refer to the recent implementation by Munteanu and Gonzalez-Diaz of the internet portal called BioAims, freely available for use by the international research community. This portal includes the web-server package TargetPred with two new Protein-QSAR servers: ATCUNPred (http://miaja.tic.udc.es/Bio-AIMS/ATCUNPred.php) for the prediction of ATCUN-mediated DNA-cleavage anticancer proteins and EnzClassPred (http://miaja.tic.udc.es/Bio-AIMS/EnzClassPred.php) for the prediction of enzyme classes. Finally, we include an overview of relevant topics related to the legal protection, regulation, and international tax issues involved in the practical use of this type of model and software in proteomics.
Computational Analysis of Amino Acid Mutation: A Proteome Wide Perspective
Authors: Jiajia Chen and Bairong Shen
Amino acid mutations may have diverse effects on protein structure and function. Reliable information about protein sequence variations is therefore essential to gain insight into disease genotype-phenotype correlations. With the recent availability of the complete genome sequence and the accumulation of variation data, determining the effects of amino acid substitutions will be the next challenge in mutation research. The molecular consequences of amino acid mutations can readily be predicted by numerous bioinformatic methods, which analyze mutation effects from different points of view. In this review, these approaches are categorized according to their analysis principles. The applicability of these tools to the inference of mutation-structure-function relationships is also summarized. Although human diseases are likely to involve defects in multiple genes, most current mutation analysis focuses on single point mutations and lacks an expansive proteome-wide perspective. In this review we propose applying the existing computational tools to the analysis of correlated mutations at a system level. Directions for future developments and their implications are discussed, which will help in understanding the networks underlying human disease.
Proteins as Networks: A Mesoscopic Approach Using Haemoglobin Molecule as Case Study
Authors: Alessandro Giuliani, Luisa Di Paola and Roberto Setola
Protein structures allow for a straightforward representation in terms of graph theory, in which the nodes are the amino acid residues and the edges record a spatial contact between a pair of nodes. Such a representation allows the vast repertoire of graph invariants developed in the analysis of complex networks to be used directly in protein science. In this work we give a general overview of the proteins-as-networks paradigm, with special emphasis on haemoglobin, where the most important features of protein systems, such as allostery, protein-protein contacts, and the differential effect of mutations, were demonstrated to be amenable to a graph-theory-oriented translation.
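As an illustration of the representation described in this abstract, the sketch below builds a residue contact graph and reads off one simple graph invariant, the node degree. It is a minimal sketch and not the authors' procedure: the function names, the 8 Å distance cutoff, and the toy coordinates are all assumptions made for the example.

```python
import math
from itertools import combinations


def contact_graph(coords, cutoff=8.0):
    """Link residues i and j when their representative atoms lie within
    `cutoff` angstroms of each other (the 8.0 value is an assumed threshold)."""
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def degrees(adjacency):
    """Node degree: the number of spatial contacts of each residue."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Toy coordinates, invented for the example (not a real haemoglobin structure).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_graph = contact_graph(toy_coords)
toy_degrees = degrees(toy_graph)
```

In practice the coordinates would come from a structure file (for example one representative atom per residue), and the degree would be only one of many invariants computed on the same adjacency structure.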
Study of Parasitic Infections, Cancer, and other Diseases with Mass-Spectrometry and Quantitative Proteome-Disease Relationships
Mass-Spectra Quantitative Proteome-Disease Relationships (MS-QPDRs) can be understood as models useful for detecting Disease Biomarkers or predicting Drug Toxicity effects from the Mass-Spectra outcomes of samples of human body tissues, parasites, or other organisms. The development and practical use of MS-QPDRs is an emerging area combining Proteomics and Bioinformatics, which involves computational, molecular, and legal sciences. We identify at least two tendencies in QPDR development. The first (type 1) uses Statistical, Artificial Intelligence, Machine Learning, and/or Non-Linear Signal processing methods to fish for single MS biomarker signals directly within MS data. A recent alternative (type 2) uses Graph Theory to construct Complex Network representations of MS data. Graph parameters called Mass-Spectra Topological Indices (MS-TIs), which describe the graph, are then calculated. The last step is similar to the first tendency, but uses MS-TIs as inputs (instead of MS signals) to seek the MS-QPDR model. There are many examples of QPDR models based on scheme 1, but there has been little effort to seek QPDR models with scheme 2. MS-QPDR models can be obtained from different body fluids; the case of the Human Blood Proteome (BP) is one of the most interesting. The outcomes obtained by Mass Spectrometry (MS) analysis of the Serum Protein Profile (SPP) of the BP are very useful for the early detection of diseases and drug-induced toxicities. In the present work we review, discuss, and outline some perspectives on the use of QPDR models based on the two types of schemes. We also refer to the recent implementation of the internet portal called BioAims for QPDR analysis (http://miaja.tic.udc.es/Bio-AIMS/), free for use by the research community.
Pseudo Amino Acid Composition and its Applications in Bioinformatics, Proteomics and System Biology
With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop automated methods for efficiently identifying various attributes of uncharacterized proteins. This is one of the most important tasks facing us today in bioinformatics, and the information thus obtained will have important impacts on the development of proteomics and system biology. To achieve this, one of the keys is to find an effective model to represent a protein sample. The most straightforward model in this regard is the entire amino acid sequence; however, the entire-sequence model fails when the query protein has no significant homology to proteins of known characteristics. Thus, various non-sequential or discrete models were proposed. The simplest discrete model is the amino acid (AA) composition. When it is used to represent a protein, however, all the sequence-order information is completely lost. To cope with this dilemma, the concept of pseudo amino acid (PseAA) composition was introduced. Its essence is to keep using a discrete model to represent a protein without completely losing its sequence-order information. Therefore, in a broad sense, the PseAA composition of a protein is a set of discrete numbers that is derived from its amino acid sequence, that differs from the classical AA composition, and that is able to harbour some sort of sequence-order or pattern information. Ever since the first PseAA composition was formulated to predict protein subcellular localization and membrane protein types, it has stimulated many different modes of PseAA composition for studying various kinds of problems in proteins and protein-related systems. In this review, we give a brief and systematic introduction to various modes of PseAA composition and their applications. The challenges of finding the optimal PseAA composition are also briefly discussed.
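To make the idea of a discrete model that retains some sequence-order information concrete, the sketch below computes a vector of 20 composition terms followed by a few sequence-order correlation terms. It is a minimal sketch in the spirit of PseAA composition, not a reproduction of any mode reviewed in the paper: the function names, the use of a single property scale (Kyte-Doolittle hydropathy), and the values of `lam` and `weight` are assumptions chosen for the example.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as the only property scale
# (an assumption of this sketch; other scales could be substituted).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardise(scale):
    """Rescale a property table to zero mean and unit standard deviation."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (value - mu) / sd for aa, value in scale.items()}


def pseudo_composition(sequence, lam=3, weight=0.05, scale=HYDROPATHY):
    """Return 20 composition terms followed by `lam` sequence-order terms."""
    if len(sequence) <= lam:
        raise ValueError("sequence must be longer than lam")
    h = standardise(scale)
    freqs = [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]
    # theta_k: mean squared property difference between residues k apart.
    thetas = [
        mean((h[a] - h[b]) ** 2 for a, b in zip(sequence, sequence[k:]))
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


example = pseudo_composition("MKTAYIAKQR", lam=2)
```

The first 20 components reduce to the classical AA composition when `weight` is zero, while the trailing components change when the residue order is shuffled, which is the property the abstract describes.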
Star Graphs of Protein Sequences and Proteome Mass Spectra in Cancer Prediction
The impact of cancer on society has created the need for new and faster theoretical models that may allow earlier cancer detection. The present review covers the prediction of cancer using star graphs of protein sequences and proteome mass spectra to build Quantitative Protein-Disease Relationships (QPDRs), similar to Quantitative Structure-Activity Relationship (QSAR) models. The nodes of these star graphs represent the amino acids of each protein or the amplitudes of the mass spectra signals, and the edges are the geometric and/or functional relationships between the nodes. The star graphs can be described numerically by invariant values called topological indices (TIs). The transformation of the star graphs (graphical representations) of proteins into TIs (numbers) facilitates the manipulation of protein information and the search for structure-function relationships in Proteomics. The advantages of this method include simplicity, fast calculations, and free resources such as the S2SNet and MARCH-INSIDE tools. Thus, this theoretical scheme can easily be extended to other types of diseases or even to other fields, such as Genomics or Systems Biology.
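The step from a star graph to a number can be sketched as follows. This is an illustrative sketch only and does not reproduce the S2SNet or MARCH-INSIDE implementations: the construction (a dummy centre node with one branch per amino acid type, residues appended in sequence order), the function names, and the choice of the Wiener index as the example TI are assumptions.

```python
from collections import deque


def star_graph(sequence):
    """Adjacency dict of a simple star graph: node 0 is a dummy centre and
    each amino acid type forms one branch whose nodes follow sequence order."""
    adjacency = {0: set()}
    branch_tip = {}  # last node placed on each amino acid branch
    for position, residue in enumerate(sequence, start=1):
        parent = branch_tip.get(residue, 0)
        adjacency[position] = {parent}
        adjacency[parent].add(position)
        branch_tip[residue] = position
    return adjacency


def wiener_index(adjacency):
    """Sum of shortest-path lengths over all unordered node pairs."""
    total = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both ends


toy_index = wiener_index(star_graph("ABA"))
```

For the toy sequence "ABA" the graph is a four-node path (B, centre, first A, second A), so the index is the Wiener index of that path. A QPDR model would use several such indices as input variables for a classifier.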
Machine Learning Quantitative Structure-Activity Relationships (QSAR) for Peptides Binding to the Human Amphiphysin-1 SH3 Domain
Developing machine learning methods to predict peptide-protein binding affinity has become an important approach in proteomics. A variety of linear and nonlinear machine learning algorithms are applied in quantitative structure-activity relationships (QSAR) to generate predictive models for ligand binding to a biological receptor. QSAR models are regression models that define quantitative correlations between the chemical structure of molecules and their physical, chemical, or biological properties. A QSAR equation predicts a molecular property from a set of molecular descriptors, which form the input data to a machine learning algorithm such as linear regression, partial least squares, artificial neural networks, or support vector machines. Here we present a comparative QSAR study for peptides binding to the human amphiphysin-1 SH3 domain, based on five machine learning methods, namely partial least squares, radial basis function artificial neural networks, support vector machines, Gaussian processes, k-nearest neighbors, and the decision trees REPTree and M5P, as implemented in the machine learning software Weka. The peptide structure was encoded with five amino acid scales, namely the Miyazawa-Jernigan (MJ) substitution matrix, G. Schneider's principal component (GSPC) scale, Lv's DPPS scale, Clementi's GRID scale, and Wold's z scale. The machine learning models were trained with a dataset of 200 peptides, and the QSAR models were tested on a prediction dataset of 684 peptides. The best predictions were obtained with the decision tree M5P for all five amino acid scales, namely z scale q² = 0.543, MJ scale q² = 0.553, GSPC scale q² = 0.557, GRID scale q² = 0.558, and DPPS scale q² = 0.599. These results show that M5P decision trees give predictive QSAR models for peptide-protein binding affinity and should be considered valuable candidates for other peptide QSAR studies. In addition, the new DPPS scale has clear advantages over the previous amino acid descriptors. The study supports QSAR approaches based on a large-scale evaluation of machine learning algorithms and diverse classes of structural descriptors.
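For readers unfamiliar with the statistic quoted above, the sketch below shows one common way to compute a predictive squared correlation coefficient. The abstract does not state the exact formula used in the study, so this is an assumed definition (one minus the ratio of the squared prediction errors to the total sum of squares about the observed mean); the function name is hypothetical, and some authors use the training-set mean in the denominator instead.

```python
def q_squared(observed, predicted):
    """Predictive squared correlation coefficient: 1 - PRESS / SS_total."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_observed = sum(observed) / len(observed)
    # PRESS: sum of squared differences between observed and predicted values.
    press = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total sum of squares about the observed mean.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values are constant; q2 is undefined")
    return 1.0 - press / ss_total


perfect = q_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = q_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this definition a model that predicts every value exactly scores 1, and a model that always predicts the observed mean scores 0, so values near 0.55 to 0.60 indicate a model that explains a little more than half of the variance in the prediction set.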