Current Bioinformatics - Volume 10, Issue 5, 2015
-
-
Regulative Role of Atomic Auto Correlated Electronegativities and Polarizabilities in β2 Potency of Ultralong Acting Agonists Identified in QSAR Studies
QSAR models based on multiple linear regression (MLR) and Gaussian-kernel support vector machines (SVMs) were developed to predict β2 potency for Sibenadet (Viozan™) and its derivatives, along with established LABAs (Formoterol, Salmeterol) and the ultra-LABA Indacaterol. The MLR-based linear QSAR models identified four molecular descriptors (MATS6e, GATS5e, Mor17p, R7m+) related to β2 potency, while the non-linear SVM models relied on descriptors such as R5p+, Lop, Belp4, and RDF075m. Although the Gaussian-kernel SVM models showed good statistical fit in potency prediction, the MLR models proved more consistent in their predictions. The MLR and SVM models were further validated statistically by internal validation measures such as R2CV, RSS, and MSS. A mechanistic study of the linear QSAR models revealed a regulative role of atomic autocorrelated electronegativities and polarizabilities in influencing β2 potency.
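The cross-validated R2 (R2CV) mentioned in this abstract is commonly computed by leave-one-out cross-validation as 1 - PRESS/TSS. The following is a minimal sketch for a one-descriptor linear model, not the authors' procedure; the function names and the toy data are hypothetical, and taking TSS around the mean of all observed values is one of several conventions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated R2: 1 - PRESS / TSS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / tss


# Hypothetical descriptor values and potencies (illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
potency = [2.1, 3.9, 6.2, 7.8, 10.1]
q2 = loo_q2(descriptor, potency)
```

A model with several descriptors would replace `fit_line` with a multiple-regression solver; the leave-one-out loop itself stays the same.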
-
-
-
Virtual Screening of Alkaloids from Apocynaceae with Potential Antitrypanosomal Activity
Chagas' disease, which occurs particularly in South America, is a human tropical parasitic disease caused by Trypanosoma cruzi. A virtual screening of 469 Apocynaceae indole alkaloids from an in-house databank (SISTEMATX) was performed using models built from fragment descriptors with Support Vector Machines (SVM) and Decision Trees (DT). A dataset of 545 agrochemicals selected from the ChEMBL database was used to generate both models, and the prediction performance was tested using a small set of 44 alkaloids with antitrypanosomal activity. From the 469 Apocynaceae alkaloids in the SISTEMATX database, the SVM model selected as actives 5 similar alkaloids from 2 species of the Aspidosperma genus (excelsum, marcgravianum), and the DT model selected 3 alkaloids from 3 species (gilbertii, nigracans, and subincanum) of the same genus. The values of the Moriguchi octanol-water partition coefficient for these structures are between 2.3 and 5.3, and 5 alkaloids passed the Lipinski alert index filter and the Drug Like Score consensus (> 0.7), which indicates that these compounds are good drug candidates. These structures might be an interesting starting point for antitrypanosomal studies. The methodology, which applies fragment descriptors and machine learning, was rapid and can be applied to virtual screening of larger databases.
-
-
-
Multi-Criteria Decision Making: the Best Choice for the Modeling of Chemicals against Hyper-Pigmentation?
Classifier ensembles have emerged as a powerful alternative for handling difficult problems. The field is growing rapidly and attracting considerable attention from the pattern recognition and machine learning communities. In the present report, the potential of multi-criteria decision making via multiclassifier approaches is assessed by applying them to the modeling of chemicals against hyper-pigmentation. TOMOCOMD-CARDD atom-based quadratic indices are used as descriptors to parameterize the molecular structures. Support vector machine, artificial neural network, Bayesian network, binary logistic regression, instance-based learning, and tree classification, applied to two collected datasets, are explored as standalone classifiers. Prediction sets (PSs) are used to assess the performance of multiclassifier systems (MCSs). A strategy combining principal component analysis with pairwise diversity measures is designed to select the most diverse base classifiers to combine. Various trainable and nontrainable systems are developed that aggregate, at the abstract and continuous levels, the outputs of the base classifiers. The results are encouraging, since the MCSs generally enhance the performance of the base classifiers; e.g., the best MCS obtains global accuracies of 95.51% and 88.89% in the PS for datasets I and II, compared with 94.12% and 85.93% for the best individual classifier, respectively. Our results suggest that MCSs could currently be the best choice for obtaining suitable QSAR models for the prediction of depigmenting agents. Finally, we consider that these approaches will help improve virtual screening procedures and increase the practicality of data mining of chemical datasets for the discovery of novel lead compounds.
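Two ingredients named in this abstract, combining base classifiers at the abstract (label) level and measuring pairwise diversity, can be sketched as follows. This is an illustrative sketch only: plain majority voting and the disagreement measure are assumed here as common choices, the abstract does not say which combiners or diversity measures were used, and the classifier names and predictions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(labels):
    """Return the most frequent label among the base-classifier outputs."""
    return Counter(labels).most_common(1)[0][0]


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers assign different labels."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Hypothetical label outputs of three base classifiers on five compounds.
base_preds = {
    "svm": [1, 0, 1, 1, 0],
    "ann": [1, 1, 1, 0, 0],
    "tree": [0, 0, 1, 1, 0],
}

# Abstract-level fusion: one vote per classifier for each compound.
ensemble = [majority_vote(votes) for votes in zip(*base_preds.values())]

# Pairwise diversity: higher values mean the two classifiers err differently.
diversity = {
    (a, b): disagreement(base_preds[a], base_preds[b])
    for a, b in combinations(base_preds, 2)
}
```

With an odd number of binary classifiers no ties occur; other settings need an explicit tie-breaking rule.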
-
-
-
Optimum Search Strategies or Novel 3D Molecular Descriptors: is there a Stalemate?
The present manuscript describes a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS Duplex) based on algebraic bilinear, quadratic, and linear forms on the kth two-tuple spatial-(dis)similarity matrix. Generalization schemes for the inter-atomic spatial distance using diverse (dis)similarity measures are discussed. In addition, normalization approaches for the two-tuple spatial-(dis)similarity matrix using simple-stochastic, double-stochastic, and mutual probability schemes are introduced. To take particular inter-atomic interactions into account in total or local-fragment indices, path and length cut-off constraints are used. Also, to generalize the use of the linear combination of atom-level indices to yield global (molecular) definitions, a set of aggregation operators (invariants) is applied. A Shannon’s entropy based variability study of the proposed 3D algebraic form-based indices and the DRAGON molecular descriptor families demonstrates superior performance for the former. A principal component analysis reveals that the novel indices codify structural information orthogonal to that captured by the DRAGON indices. Finally, a QSAR study of the binding affinity to the corticosteroid-binding globulin using Cramer’s steroid database is performed. This study reveals that the QuBiLS-MIDAS Duplex approach yields performance statistics similar or superior to those of all the 3D-QSAR methods reported in the literature so far, even with fewer degrees of freedom, using both the 31 steroids as the training set and the popular division of Cramer’s database into training [1-21] and test [22-31] sets. This methodology is thus expected to provide useful tools for the diversity analysis of compound datasets and high-throughput screening structure–activity data.
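A Shannon's entropy based variability analysis of a descriptor is usually carried out by discretizing its values over a dataset and computing the entropy of the resulting distribution. The sketch below assumes equal-width binning and base-2 logarithms; the function name, the bin count, and the sample values are hypothetical and are not taken from the paper.

```python
import math


def shannon_entropy(values, n_bins):
    """Entropy (in bits) of a descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


# Hypothetical values of one descriptor over four molecules.
entropy_bits = shannon_entropy([0.0, 1.0, 2.0, 3.0], n_bins=4)
```

The maximum attainable value is log2(n_bins), reached when the molecules spread evenly over the bins; descriptors with higher entropy discriminate better among the structures in the dataset.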
-
-
-
Review of Structures Containing Fullerene-C60 for Delivery of Antibacterial Agents. Multitasking model for Computational Assessment of Safety Profiles
Authors: Valeria V. Kleandrova, Feng Luan, Alejandro Speck-Planche and M.N.D.S. Cordeiro
Fullerenes are carbon allotropes that have attracted the attention of scientists over the last 15 years. In nanotechnology, fullerenes have had several promising applications in medicinal chemistry, pharmaceutical sciences, biomedicine, and related disciplines. In particular, the design and biological evaluation of fullerene-C60 derivatives as antimicrobial agents constitute essential components of several active areas of research that continue to grow. There is serious concern about the emergence of pathogen resistance to current antibiotics, and consequently the task of finding new and more efficient antimicrobial therapies is increasingly challenging. This review discusses the most recent advances in the discovery of structures containing fullerene-C60 as models of nanoentity-based antibacterial agents. In addition, considering the role of the toxicity associated with nanoparticles, we introduce a general multitasking model for quantitative structure-biological effect relationships (mtk-QSBER). This model was created from a heterogeneous dataset containing more than 47200 statistical cases, and it focused on performing simultaneous predictions of multiple ADMET (absorption, distribution, metabolism, elimination, toxicity) properties. The mtk-QSBER model correctly classified more than 90% of the cases in the whole database and was employed for virtual screening of diverse ADMET profiles of different molecular architectures containing fullerene-C60. The theoretical results agreed with the experimental evidence, confirming that increasing the number of polar regions associated with fullerene-C60 can improve the safety profiles. At the same time, this demonstrated that the present mtk-QSBER model can be used as an efficient tool for in silico assessment of different safety profiles of large libraries of compounds under dissimilar experimental conditions.
-
-
-
Machine Learning for Prediction of HIV Drug Resistance: A Review
By Isis Bonet
Several antiretroviral drugs have been approved for use in HIV-infected people. Despite the efforts of the scientific community, an effective drug that kills the virus has not yet been developed. Many computational algorithms have been used to find mutations associated with drug resistance as well as to predict HIV resistance. This article provides an overview of machine learning techniques used to predict HIV drug resistance. The different types of studies are reviewed according to the following characteristics: representations of the problem, ARVs, methods to reduce dimensionality, and machine learning algorithms used.
-
-
-
Identification of Uptake Mechanism of Cell-Penetrating Peptides by their Polar Profile
In recent years, so-called cell-penetrating peptides (CPPs) have been studied intensively owing to their ability to penetrate cell membranes. CPPs are characterized by a length of less than 60 amino acids, a highly cationic nature, and a positive net charge at neutral pH. For CPPs, either an endocytic or a non-endocytic uptake mechanism has been identified. This work presents the computational polarity index method, which is able to predict the uptake mechanism of CPPs with an accuracy of 72% in a double-blind test. This was achieved by reading the peptide sequence and measuring polarity as a single physico-chemical property. The method was verified by extracting all peptides from the CPPsite database (April 21, 2014), and its efficiency was tested with seven specialized databases of peptides and proteins.
-
-
-
Identification of Azo Dye Degrading Sphingomonas Strain EMBS022 and EMBS023 Using 16S rRNA Gene Sequencing
Azo dyes are substantial industrial pollutants owing to their poor biodegradability. The present study identifies azo dye detoxifying strains of bacteria from waste water near textile industries. To identify the azo dye degrading strains, the 16S rRNA gene was sequenced from pure bacterial cultures obtained from samples collected in the textile industry area at Erode in Tamil Nadu state, India. Two novel azo dye degrading bacteria were found among the 60 samples investigated and were named Sphingomonas sp. strain EMBS022 and Sphingomonas sp. strain EMBS023, respectively. Isolates of Sphingomonas sp. strains EMBS022 and EMBS023 were deposited in GenBank with accession numbers ‘KF951596’ and ‘KF951597’, respectively. The UNAFold and RNAfold web servers were employed to predict the secondary structure of the 16S rRNA of these strains. The free energy estimates for the secondary structures of the 16S rRNA of strains EMBS022 and EMBS023 were ΔG = -159.00 kcal/mol and ΔG = -159.90 kcal/mol, respectively, indicating that the structures are considerably stable.
-
-
-
Graph-Based Processing of Macromolecular Information
The complex information encoded in the element connectivity of a system makes it possible to process divisible systems graphically using graph theory. One such application is the quantitative characterization of the molecular topologies of drugs, proteins, and nucleic acids in order to build mathematical models, such as Quantitative Structure-Activity Relationships, between the molecules and a specific biological activity. These types of models can predict new drugs, molecular targets, and molecular properties of new molecular structures, with an important impact on drug discovery, medicinal chemistry, molecular diagnosis, and treatment. The current review focuses on the mathematical methods used to encode connectivity information in three types of graphs (star graphs, spiral graphs, and contact networks) and on three in-house scientific applications dedicated to the calculation of molecular graph topological indices: S2SNet, CULSPIN, and MInD-Prot. In addition, some examples are presented, such as results of this methodology on drugs, proteins, and nucleic acids, including the Web implementation of the best graph-based molecular prediction models.
-
-
-
Bioinformatics Tool to Identify Peptides Associated to Cancer Cells
Authors: Carlos Polanco, Thomas Buhse and Jose Lino Samaniego
We present a computational-mathematical algorithm that can identify peptides that are experimentally associated with action against cancer cells and classified in the APD2 database. The algorithm, named the polarity index method, showed an accuracy of 95% in a double-blind test applied to peptides from eight different databases. The method requires only the primary peptide structure, i.e. the amino acid sequence, to determine the polarity profile. Previously, we used this method to identify selective antibacterial peptides with high efficiency. Our present study suggests that this computational method can also be used as a first filter in the analysis and identification of peptides and proteins that are related to cancer cells.
-
-
-
Multiscale Mapping of AIDS in U.S. Countries vs Anti-HIV Drugs Activity with Complex Networks and Information Indices
In this work, we reviewed different aspects of the epidemiology, drugs, targets, chem-bioinformatics, and systems biology methods related to AIDS/HIV. Next, we developed a new model to predict complex networks of the prevalence of AIDS in U.S. counties, taking into consideration the values of Gini coefficients of social income inequality. We also used activity/structure data of anti-HIV drugs in preclinical assays. First, we trained different Artificial Neural Networks (ANNs) using as input Markov and Symmetry information indices of social networks and of molecular graphs. We obtained the data on AIDS prevalence and Gini coefficients from the AIDSVu database of Emory University and the data on anti-HIV drugs from the ChEMBL database. We used Box-Jenkins operators to measure the shift with respect to the average behavior of counties relative to their states and of drugs relative to reference compounds assayed in a given protocol, target, or organism. To train and validate the model and predict the complex network, we needed to analyze 43,249 data points, including values of AIDS prevalence in 2310 U.S. counties vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures. The best model found was a Linear Neural Network (LNN) with Accuracy, Specificity, Sensitivity, and AUROC above 0.72-0.73 in training and external validation series. The new linear equation was shown to be useful for generating complex network maps of drug activity vs AIDS/HIV epidemiology in the U.S. at county level.
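The idea of measuring the shift of a case with respect to the average behavior of its group (a county relative to its state, or a drug relative to the compounds assayed under the same conditions) can be illustrated with a simple deviation-from-group-mean calculation. This is a hedged sketch of that general idea, not the operators defined in the paper; the function name and the sample records are hypothetical.

```python
from collections import defaultdict


def deviations_from_group_mean(records):
    """For (group, value) records, return value minus the mean of its group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in records:
        totals[group] += value
        counts[group] += 1
    means = {group: totals[group] / counts[group] for group in totals}
    return [value - means[group] for group, value in records]


# Hypothetical prevalence values for counties in two states.
county_records = [("state_A", 1.0), ("state_A", 3.0), ("state_B", 5.0)]
shifts = deviations_from_group_mean(county_records)
```

A positive shift marks a case above its group average and a negative shift one below it; such shifted values can then serve as model inputs in place of the raw ones.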
-
-
-
MIANN Models of Networks of Biochemical Reactions, Ecosystems, and U.S. Supreme Court with Balaban-Markov Indices
Authors: Aliuska Duardo-Sanchez, Humberto Gonzalez-Diaz and Alejandro Pazos
We can use Artificial Neural Networks (ANNs) and graph Topological Indices (TIs) to seek structure-property relationships. Balaban’s J index is one of the classic TIs for chemo-informatics studies. Here we used Markov chains to generalize the J index and apply it to bioinformatics, systems biology, and the social sciences. We seek new ANN models to show the discrimination power of the new indices at node level in three proof-of-concept experiments. First, we calculated more than 1,000,000 values of the new Balaban-Markov centralities Jk(i) and other indices for all nodes in >100 complex networks. In the three experiments, we found new MIANN models with >80% Specificity (Sp) and Sensitivity (Sn) in training and validation series for Metabolic Reaction Networks (MRNs) of 42 organisms (bacteria, yeast, nematode, and plants), 73 Biological Interaction Webs or Networks (BINs), and 43 sub-networks of U.S. Supreme Court citations in different decades from 1791 to 2005. This work may open a new route for the application of TIs to unravel hidden structure-property relationships in complex bio-molecular, ecological, and social networks.
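For orientation, the classic Balaban J index of a connected graph with n nodes and m edges is J = m / (μ + 1) · Σ over edges (i, j) of (D_i · D_j)^(-1/2), where D_i is the sum of topological distances from node i to all other nodes and μ = m - n + 1 is the cyclomatic number. The sketch below computes this classic index only; the Markov-chain generalization Jk(i) described in the abstract is not reproduced, and the function name and example graphs are illustrative.

```python
import math
from collections import deque


def balaban_j(n_nodes, edges):
    """Classic Balaban J index of a connected, unweighted graph.

    Nodes are numbered 0..n_nodes-1 and edges are (u, v) pairs.
    """
    adjacency = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    def distance_sum(source):
        # Breadth-first search gives shortest-path distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return sum(dist.values())

    d = [distance_sum(i) for i in range(n_nodes)]
    m = len(edges)
    mu = m - n_nodes + 1  # cyclomatic number (number of independent rings)
    return m / (mu + 1) * sum(1.0 / math.sqrt(d[u] * d[v]) for u, v in edges)


# Hydrogen-suppressed graphs of propane (a path) and cyclohexane (a 6-ring).
j_propane = balaban_j(3, [(0, 1), (1, 2)])
j_cyclohexane = balaban_j(6, [(i, (i + 1) % 6) for i in range(6)])
```

For the propane skeleton the distance sums are 3, 2, 3, giving J ≈ 1.633; for the cyclohexane ring every node has distance sum 9, giving J = 2.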
-
-
-
A Hybrid Evolutionary System for Automated Artificial Neural Networks Generation and Simplification in Biomedical Applications
Data mining and data classification over biomedical data are two of the most important research fields in computer science. Among the great diversity of techniques that computer science can use for this purpose, Artificial Neural Networks (ANNs) are among the most suitable. One of the main problems in developing ANNs is the slowness of the full process. Traditionally, human experts are needed to experiment with different architectures until they find one that gives correct results for a specific problem. Recently, however, many studies have described more or less automated ANN development techniques, each with its own pros and cons. In this paper, the authors focus on developing a new technique to perform this process over biomedical data. The new technique combines two Evolutionary Computation (EC) techniques, Genetic Algorithms (GAs) and Genetic Programming (GP), in order to develop ANNs automatically. The work goes further: the system described here makes it possible to obtain simplified networks with a low number of neurons that still solve the problems adequately. Existing systems that use EC for ANN development are compared with the system proposed here. For this purpose, some of the most frequently used biomedical databases have been employed to measure the behaviour of the system and to compare its results with those of other ANN generation and training methods based on EC tools. The authors have also used other databases that are frequently used to compare this kind of method in order to obtain a more general view of the new system’s performance. The conclusions reached from these comparisons indicate that the new system produces very good results, which in the worst case are at least comparable to those of existing techniques and in many cases are substantially better. Furthermore, the system has other features, such as variable selection, which can uncover new knowledge about the problems being solved.
-
-
-
MI-NODES Multiscale Models of Metabolic Reactions, Brain Connectome, Ecological, Epidemic, World Trade, and Legal-Social Networks
Authors: Aliuska Duardo-Sanchez, Humberto Gonzalez-Diaz and Alejandro Pazos
Complex systems and networks appear in almost all areas of reality. We find them everywhere from protein residue networks to Protein Interaction Networks (PINs). Chemical reactions form Metabolic Reaction Networks (MRNs) in living beings or atmospheric reaction networks on planets and moons. Networks of neurons appear in the worm C. elegans, in the human brain connectome, and in Artificial Neural Networks (ANNs). Infection spreading networks exist for contagious outbreaks in humans and, in malware epidemiology, for infection with viral software on the internet or wireless networks. Social-legal networks with different rules have evolved from swarm intelligence to hunter-gatherer societies to the citation networks of the U.S. Supreme Court. In all these cases, the same question arises: can we predict the links based on structural information? We propose to solve the problem using Quantitative Structure-Property Relationship (QSPR) techniques commonly used in chemo-informatics. To do so, we need software able to transform all types of networks/graphs, such as drug structure, drug-target interactions, protein structure, protein interactions, metabolic reactions, brain connectome, or social networks, into numerical parameters. Consequently, we need to process multitarget, multiscale, and multiplexing information in alignment-free mode. We then have to seek the QSPR model with Machine Learning techniques. MI-NODES is this type of software. Here we review the evolution of the software from chemoinformatics to bioinformatics and systems biology. This is an effort to develop a universal tool to study structure-property relationships in complex systems.
-
