Current Bioinformatics - Volume 10, Issue 3, 2015
-
-
A New Procedure to Analyze RNA Non-Branching Structures
Authors: Giulia Fiscon, Paola Paci, Teresa Colombo and Giulio Iannello
RNA structure prediction and structural motif analysis are challenging tasks in the investigation of RNA function. We propose a novel procedure to detect structural motifs shared between two RNAs (a reference and a target). In particular, we developed two core modules: (i) nbRSSP_extractor, to assign a unique structure to the reference RNA encoded by a set of non-branching structures; (ii) SSD_finder, to detect structural motifs that the target RNA shares with the reference, by means of a new score function that rewards the relative distance of the target non-branching structures compared to the reference ones. We integrated these algorithms with existing software to build a coherent pipeline that performs two main tasks: prediction of RNA structures (integration of RNALfold and nbRSSP_extractor) and search for chains of matches (integration of Structator and SSD_finder).
-
-
-
On Evolutionary Algorithms for Biclustering of Gene Expression Data
Authors: A. Carballido Jessica, A. Gallo Cristian, S. Dussaut Julieta and Ponzoni Ignacio
Recent decades have seen the rapid development of microarray technologies, which have made large amounts of gene expression data available. Hence, it has become increasingly important to have reliable methods to interpret this information in order to discover new biological knowledge. In this review paper we describe the main existing evolutionary methods that analyze microarray gene expression data by means of biclustering techniques. Strategies are classified according to the evaluation metric used to quantify the quality of the biclusters. In this context, the main evaluation measures, namely mean squared residue, virtual error and transposed virtual error, are presented first. Then, the main evolutionary algorithms that find biclusters in gene expression data matrices using those metrics are described and compared.
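As an illustration of the first evaluation measure named in this abstract, the following minimal sketch computes the mean squared residue of a bicluster using the commonly cited definition (the average of the squared terms a_ij - a_iJ - a_Ij + a_IJ over the selected rows and columns). The function name, the toy expression matrix and the index choices are assumptions made for this example and are not taken from the review.

```python
from statistics import mean

def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster selected by row and column indices."""
    # Extract the submatrix defined by the chosen rows (genes) and columns (conditions).
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [mean(r) for r in sub]
    col_means = [mean(c) for c in zip(*sub)]
    overall = mean(row_means)  # equals the submatrix mean, since all rows have equal length
    residues = [
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(len(rows))
        for j in range(len(cols))
    ]
    return mean(residues)

# Hypothetical toy data: rows 0 and 1 differ only by a constant shift.
expression = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 1.0, 0.0],
]
perfect = mean_squared_residue(expression, rows=[0, 1], cols=[0, 1, 2])
noisy = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2])
```

Under this definition a bicluster whose rows differ only by an additive shift has a residue of zero, so `perfect` is zero here while `noisy`, which includes the third row, is positive. Lower values indicate a more coherent bicluster.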
-
-
-
Flux Balance Analysis and Thermodynamics: Trends and Strategies in Disease Biology
Authors: Tarika Vijayaraghavan and Somnath Tagore
Computational systems biology emphasizes modeling biological systems mathematically and computationally to analyze their role in complex processes. One application of systems biology is understanding disease networks, for which various strategies are used, such as decision-making stochastic networks, graph theory based methods, robust control theory, elementary flux modes, extreme pathways, convex analysis, flux variability analysis and phenotype phase planes. For judging the stability of a metabolic network, a sound knowledge of bioenergetics and of free energy concepts regarding metabolic systems is essential. In this manuscript, we explore the possible application of flux balance analysis to disease networks, in conjunction with the group contribution method from thermodynamics, for studying the stability of biological systems.
-
-
-
A Review and Comparative Assessment of Machine Learning Approaches for Interaction Site Prediction in Membrane Proteins
Authors: Ebrahim Barzegari Asadabadi and Parviz Abdolmaleki
Protein-protein interactions at membranes play an essential role in the function of the proteins located there. However, studying these interactions is particularly difficult for membrane proteins because of their hydrophobic environment. This is why bioinformatics methods, especially machine learning approaches, are essential for understanding membrane protein function. However, machine learning methods for predicting the interaction sites of membrane proteins face numerous challenges. This paper describes the current state of machine learning applications in inferring membrane protein interaction sites and assesses the methods proposed to date. We introduce the membrane protein interaction site prediction methods, present the degree to which each parameter contributes at membrane protein interaction interfaces, and compare the performance of the methods through several case predictions.
-
-
-
MASS KIR Analyzer: An In Silico Approach for Analyzing the Killer Immunoglobulin Receptor Gene Content and its Diversity
Authors: Manmohan Pandey, Aditya Narayan Sarangi, Swayam Prakash and Suraksha Agrawal
Introduction: This is the first study to present user-friendly software (MASS KIR Analyzer) for analyzing Killer Immunoglobulin-like Receptor (KIR) gene content diversity. The main objective of this software is to generate error-free KIR genotyping data. Materials and Methods: The MASS KIR Analyzer accepts KIR genotyping data in binary format. It can determine KIR genotype, haplotype and linkage group frequencies and assign a genotypic ID to each KIR profile as suggested by the Allele Frequency Net database. The software includes linkage disequilibrium analysis based on the Lewontin, Mattiuz and Cramér's V statistics. The genotype, haplotype and linkage group modules of the software were validated on KIR genotyping data (n=12,481) available at the Allele Frequency Net database. The genetic distance and heatmap modules were validated on KIR data (n=976) from the Human Genome Diversity Project-Centre d'Etude du Polymorphisme Humain (HGDP-CEPH) database. Results: The MASS KIR Analyzer software detected 225 genotypic errors in the KIR dataset reported to the Allele Frequency Net database. The linkage disequilibrium analysis revealed negative linkage disequilibrium between the KIR3DS1-3DL1 (p < 0.001) and KIR2DL2-2DL1 (p < 0.001) genes and positive linkage disequilibrium between the KIR2DS2-2DL2 (p < 0.001) genes. Genetic distances calculated for the HGDP-CEPH KIR populations gave results in line with published data for the clustered heatmap. This validated the proper functioning of the software for KIR data analysis. Conclusion: The MASS KIR Analyzer is a promising tool for KIR biologists and may significantly improve the accuracy of KIR genotyping.
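To make the linkage disequilibrium statistics mentioned in this abstract concrete, here is a minimal sketch that computes Lewontin's normalized D' and Cramér's V for two genes scored as present (1) or absent (0). It is not the MASS KIR Analyzer implementation: the function name, the gene labels `geneA` and `geneB`, the toy profiles and the use of carrier frequencies in place of haplotype frequencies are all assumptions made for this example.

```python
from math import sqrt

def ld_stats(profiles, a, b):
    """Return (D, D_prime, cramers_v) for presence/absence of genes a and b.

    profiles: list of dicts mapping a gene name to 0 (absent) or 1 (present).
    """
    n = len(profiles)
    p_a = sum(p[a] for p in profiles) / n            # frequency of carriers of gene a
    p_b = sum(p[b] for p in profiles) / n            # frequency of carriers of gene b
    p_ab = sum(p[a] and p[b] for p in profiles) / n  # frequency of carrying both
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient
    # Lewontin's normalization: divide D by its maximum attainable magnitude.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # For a 2x2 table, Cramér's V equals the absolute phi coefficient.
    denom = sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    cramers_v = abs(d) / denom if denom > 0 else 0.0
    return d, d_prime, cramers_v

# Hypothetical profiles in which the two genes never occur together.
toy = [
    {"geneA": 1, "geneB": 0},
    {"geneA": 1, "geneB": 0},
    {"geneA": 0, "geneB": 1},
    {"geneA": 0, "geneB": 1},
]
d, d_prime, v = ld_stats(toy, "geneA", "geneB")
```

In this toy case the genes are mutually exclusive, so D is negative and D' reaches its lower bound of -1, which corresponds to the "negative linkage disequilibrium" wording used in the abstract. A significance test (the reported p-values) is not part of this sketch.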
-
-
-
GAOPP: Operon Prediction in Prokaryotes Using Genetic Algorithm
Authors: Kanhu C. Moharana, Manas R. Dikhit, Bikash R. Sahoo, Ganesh C. Sahoo and Pradeep Das
Operons, consisting of functionally related or correlated genes, are commonly found in prokaryotes and are of great importance in understanding microbial genomics. To identify such genes, we developed a stand-alone tool (GAOPP) for predicting operons using a genetic algorithm approach. We also aimed to use a minimal, easily available data set as input to obtain modest accuracy, and to provide a graphical user interface for easy accessibility. Prediction relies entirely on a single input (.ptt) file, but an optional pathway file from KEGG can increase the accuracy. The tool was successfully tested on the set of experimentally defined operons in E. coli using three different fitness functions, namely Fuzzy Guided Scoring System, Rules Guided Scoring Scheme and Bayesian Scoring Scheme based Particle Swarm Optimization, with accuracies of 89.1%, 86.5% and 93.4%, respectively. The availability of different fitness functions also adds to the utility of GAOPP. This tool will be helpful in predicting operons in newly sequenced prokaryotes. The tool is freely available at http://biomedinformri.com/gaopp/
-
-
-
Quantifying Gene Co-Expression Heterogeneity in Cancer Towards Efficient Network Biomarker Design
Authors: Shang Gao, Abdullah Sarhan, Reda Alhajj, Jon Rokne, Doug Demetrick and Jia Zeng
It is well known that cancer is a highly heterogeneous disease, and the predictive capability of the targeted gene signature approach suffers from inter-tumor heterogeneity. Here we propose a framework to quantify the molecular heterogeneity of tumors from a gene-gene relational perspective using co-expression networks and interactome data. We believe that to understand individualized gene behavior across patients, the relational status of genes needs to be considered, because a complex disease phenotype is often caused by failures of genetic interactions in cancer cells. We quantified gene-gene relational heterogeneity in a benchmark data set using co-expression networks inferred from microarray data, and showed that genes related to breast cancer metastasis can be stratified into different classes based on their relational status obtained from pair-wise comparisons of co-expression networks. Further, we used the relational heterogeneity information to predict patient survival and found that the relationally heterogeneous gene set is less predictive than relatively conserved cancer genes. We explored heterogeneity gene sets using interactome data and identified densely connected components that are causal to inter-tumor heterogeneity. We independently validated our approach with two patient cohorts. Our results demonstrate the efficiency of using heterogeneity information to design network markers.
-
-
-
A Hybrid Approach Based on Pattern Recognition and BioNLP for Investigating Drug-Drug Interaction
Authors: Rabia Javed, Saima Farhan and Salman Humdullah
In the field of drug research and development, the investigation of drug-drug interactions (DDIs) is a vital research area. Many clinical tools used in the industry provide broad lists of DDIs, but these tools can return unpredictable results that are limited to specific types of interaction. These tools also lack a synchronized database of drug-drug interactions. In our research work, we propose a novel pattern recognition based technique that investigates natural language processing patterns for the extraction of drug-drug interactions. The proposed technique is novel in the sense that it uses a smaller feature set and is computationally less expensive, yet yields better results than previous techniques. The proposed technique is based on bioinformatics and NLP (Natural Language Processing). For this research paper, we collected biomedical data such as drug names, drug identification numbers (IDs) and different types of drug-drug interaction sentences from DrugBank (a free online resource). By parsing the sentences, we investigated the patterns for drug-drug interaction extraction. To evaluate the performance of our proposed work, we applied three different classification models, i.e. Naïve Bayes, J48 (decision tree) and Random Forest. The results achieved by our technique are: F-score 82.4%, Precision 81.6% and Recall 83.2%. The best accuracy, 99.0%, was achieved with Random Forest. Comparison with previous research shows that our proposed technique provides better results.
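The three evaluation figures quoted in this abstract are related by the standard F-score formula, the weighted harmonic mean of precision and recall. The short sketch below shows that relationship; the function name and the `beta` parameter are assumptions of this example, and it is not the authors' evaluation code.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as reported in the abstract, in percent.
f1 = f_score(81.6, 83.2)
print(round(f1, 1))  # prints 82.4
```

With the reported precision of 81.6% and recall of 83.2%, the formula gives about 82.39%, which rounds to the reported F-score of 82.4%.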
-
-
-
Dualpred: A Webserver for Predicting Plant Proteins Dual-Targeted to Chloroplast and Mitochondria Using Split Protein-Relatedness-Measure Feature
Authors: Vijayakumar Saravanan and Palanisamy Thanga Velan Lakshmi
Plant cells contain two major organelles, chloroplasts and mitochondria, which play key roles in energy metabolism as well as in regulating a number of prominent processes. Proteins that are located in both chloroplast and mitochondria presumably function distinctly in each location, and therefore knowledge about protein localization is vital. Hence, a webserver (DualPred) was designed to predict plant dual-targeted proteins (chloroplast and mitochondria) using a novel split protein-relatedness-measure feature and AdaBoost-J48 as a classifier. DualPred adopts two-layer prediction for distinguishing plant proteins dual-targeted to chloroplast and mitochondria from other localized proteins. DualPred was rigorously trained and tested with different benchmark datasets and a newly developed independent dataset. Statistical techniques including K-fold cross-validation, detailed ROC analysis, Matthews correlation, and area under ROC curves were used to assess the performance of DualPred. DualPred achieved an overall accuracy of 85.0% and 91.9% in a 10-fold cross validation on the new DT167 dataset and benchmark dataset, respectively, in predicting dual-targeted proteins. Validation with the independent dataset (DT167 as model and benchmark dataset as model) achieved overall accuracy of 89.2% and 86.8%, respectively. Also, the Matthews correlation and area under ROC curves for the classifiers on different datasets were found to be significant. Hence, based on the results of various validation tests, it is evident that the novel feature representation was effective in distinguishing the plant proteins dual-targeted to chloroplast and mitochondria from other localized proteins. DualPred, the web server implementation of the algorithm written in PERL, can be accessed freely through http://pcmpred.bicpu.edu.in/predict.php.
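For readers unfamiliar with the Matthews correlation used alongside accuracy in this abstract, the following minimal sketch computes both from a binary confusion matrix using the standard definitions. The function names and the counts in the example are hypothetical and do not come from the DualPred evaluation.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when a marginal is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts: a classifier that labels every protein as "not dual-targeted"
# on an imbalanced set of 10 dual-targeted and 90 other proteins.
acc = accuracy(tp=0, tn=90, fp=0, fn=10)
mcc = matthews_corrcoef(tp=0, tn=90, fp=0, fn=10)
```

Here `acc` is 0.9 although the classifier never finds a dual-targeted protein, while `mcc` is 0. This is one reason the coefficient is often reported together with accuracy on imbalanced datasets.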
-
-
-
An Improved Mathematical Object for Graphical Representation of DNA Sequences
An improved mathematical object is presented for analyzing the similarity and dissimilarity of DNA sequences. Its core idea is to use, separately, the changes in the x-coordinates and in the y-coordinates of the points that correspond to the bases of a DNA sequence in the graphical representation. Compared with traditional methods based on the distances between points, the new method measures the relationship between points in more detail, so more of the information in the DNA sequence is exploited. It can distinguish any form of symmetrical curves corresponding to DNA sequences, an advantage that traditional methods can hardly achieve. Furthermore, comparative experiments show that this method increases the precision by about 50%.
-
-
-
Neighboring-Site Effects of Amino Acid Substitutions in the Mouse Genome
Authors: Mingchuan Fu, Hongxia Pang, Jian Cheng and Shiheng Tao
Nucleotide evolution models have benefited greatly from reports of neighbor-dependent nucleotide mutations. Investigating the neighboring-site effects of amino acid substitutions may likewise promote the development of protein evolution models. Here, the neighboring-site effects of amino acid substitutions in the mouse genome are evaluated by grouping the 20 amino acids into four categories: nonpolar neutral (NON), polar neutral (NEU), positive (POS) and negative (NEG) amino acids. Our data indicate that amino acid substitutions are clearly neighboring-site dependent, and the most prominent bias is the NEG→NEG substitution occurring in the NEG_NEG context, the frequency of which is 2-fold higher than expected. The neighboring-site effects are also correlated with some types of protein secondary structure. Through this study, we conclude that, like neighbor-dependent nucleotide mutations, amino acid substitutions are also neighboring-site dependent. The mutation bias of the nucleotide sequence and natural or functional selection on protein structure might be two underlying reasons for the neighboring-site effects of amino acid substitutions in the mouse genome.