Current Bioinformatics - Volume 10, Issue 3, 2015

- Meet Our Editorial Board Member:
  
  By Stefano Toppo
  
  https://doi.org/10.2174/157489361003150723130409

- A New Procedure to Analyze RNA Non-Branching Structures
  
  Authors: Giulia Fiscon, Paola Paci, Teresa Colombo and Giulio Iannello
  
  https://doi.org/10.2174/1574893609666140820224651
  
  RNA structure prediction and structural motif analysis are challenging tasks in the investigation of RNA function. We propose a novel procedure to detect structural motifs shared between two RNAs (a reference and a target). In particular, we developed two core modules: (i) nbRSSP_extractor, which assigns a unique structure to the reference RNA encoded by a set of non-branching structures; and (ii) SSD_finder, which detects structural motifs that the target RNA shares with the reference by means of a new score function that rewards the relative distance of the target non-branching structures compared to the reference ones. We integrated these algorithms with existing software into a coherent pipeline that performs two main tasks: prediction of RNA structures (integration of RNALfold and nbRSSP_extractor) and search for chains of matches (integration of Structator and SSD_finder).
  

- On Evolutionary Algorithms for Biclustering of Gene Expression Data
  
  Authors: Jessica A. Carballido, Cristian A. Gallo, Julieta S. Dussaut and Ignacio Ponzoni
  
  https://doi.org/10.2174/1574893609666140829204739
  
  Recent decades have seen the rapid development of microarray technologies, which have made large amounts of gene expression data available. It has therefore become increasingly important to have reliable methods for interpreting this information in order to discover new biological knowledge. In this review paper we describe the main existing evolutionary methods that analyze microarray gene expression data by means of biclustering techniques. Strategies are classified according to the evaluation metric used to quantify the quality of the biclusters. In this context, the main evaluation measures, namely mean squared residue, virtual error and transposed virtual error, are presented first. Then, the main evolutionary algorithms that find biclusters in gene expression data matrices using those metrics are described and compared.
  

- Flux Balance Analysis and Thermodynamics: Trends and Strategies in Disease Biology
  
  Authors: Tarika Vijayaraghavan and Somnath Tagore
  
  https://doi.org/10.2174/157489361003150723131916
  
  Computational systems biology emphasizes modeling biological systems mathematically and computationally in order to analyze their role in solving complex processes. One application of systems biology is understanding disease networks, for which various strategies are used, such as decision-making stochastic networks, graph theory based methods, robust control theory, elementary flux modes, extreme pathways, convex analysis, flux variability analysis and phenotype phase planes. To judge the stability of a metabolic network, a sound knowledge of bioenergetics and of free energy concepts regarding metabolic systems is essential. In this manuscript, we explore the possible application of flux balance analysis to disease networks, in conjunction with the group contribution method from the field of thermodynamics, for studying the stability of biological systems.
  

- A Review and Comparative Assessment of Machine Learning Approaches for Interaction Site Prediction in Membrane Proteins
  
  Authors: Ebrahim Barzegari Asadabadi and Parviz Abdolmaleki
  
  https://doi.org/10.2174/157489361003150723132234
  
  Protein-protein interactions at membranes play an essential role in the function of the proteins located there. However, studying these interactions is particularly difficult for membrane proteins because of their hydrophobic environment. This is why bioinformatics methods, especially machine learning approaches, are essential for understanding the function of membrane proteins. Nevertheless, machine learning methods for predicting the interaction sites of membrane proteins face numerous challenges. This paper describes the current state of machine learning applications in inferring membrane protein interaction sites and assesses the methods proposed so far. We introduce the membrane protein interaction site prediction methods, present the contribution degrees of parameters at membrane protein interaction interfaces, and compare the performance of the methods through several case predictions.
  

- MASS KIR Analyzer: An In Silico Approach for Analyzing the Killer Immunoglobulin Receptor Gene Content and its Diversity
  
  Authors: Manmohan Pandey, Aditya Narayan Sarangi, Swayam Prakash and Suraksha Agrawal
  
  https://doi.org/10.2174/1574893609999140918200855
  
  Introduction: This is the first study to present user-friendly software (MASS KIR Analyzer) for analyzing the gene content diversity of Killer Immunoglobulin-like Receptors (KIR). The main objective of this software is to generate error-free KIR genotyping data. Materials and Methods: The MASS KIR Analyzer accepts KIR genotyping data in binary format. It can determine KIR genotype, haplotype and linkage group frequencies and assign a genotypic ID to each KIR profile, as suggested by the Allele Frequency Net database. The software includes linkage disequilibrium analysis based on the Lewontin, Mattiuz and Cramer's V statistics. The genotype, haplotype and linkage group modules of the software were validated on KIR genotyping data (n=12,481) available at the Allele Frequency Net database. The genetic distance and heatmap modules of the software were validated on KIR data (n=976) from the Human Genome Diversity Project-Centre d'Etude du Polymorphisme Humain (HGDP-CEPH) database. Results: The MASS KIR Analyzer software detected 225 genotypic errors in the KIR dataset reported to the Allele Frequency Net database. The linkage disequilibrium analysis revealed negative linkage disequilibrium between the KIR3DS1-3DL1 (p<0.001) and KIR2DL2-2DL1 (p<0.001) genes and positive linkage disequilibrium between the KIR2DS2-2DL2 (p<0.001) genes. Genetic distances calculated for the HGDP-CEPH KIR populations gave results similar to published data for the clustered heatmap. This validated the proper functioning of the software for KIR data analysis. Conclusion: The MASS KIR Analyzer is promising software for KIR biologists and may significantly improve error-free KIR genotyping.
  

- GAOPP: Operon Prediction in Prokaryotes Using Genetic Algorithm
  
  Authors: Kanhu C. Moharana, Manas R. Dikhit, Bikash R. Sahoo, Ganesh C. Sahoo and Pradeep Das
  
  https://doi.org/10.2174/157489361003150723134646
  
  Operons, consisting of functionally related or correlated genes, are commonly found in prokaryotes and are of enormous importance in understanding microbial genomics. To identify such genes, we developed a stand-alone tool (GAOPP) that predicts operons using a genetic algorithm approach. We also aimed to use a minimal, easily available data set as input while still obtaining modest accuracy, and to provide a graphical user interface for easy accessibility. Prediction relies entirely on a single input (.ptt) file, but an optional pathway file from KEGG can increase the accuracy. The tool was successfully tested on the set of experimentally defined operons in E. coli using three different fitness functions, namely Fuzzy Guided Scoring System, Rules Guided Scoring Scheme and Bayesian Scoring Scheme based Particle Swarm Optimization, with accuracies of 89.1, 86.5 and 93.4%, respectively. The availability of different fitness functions also enhances the importance of GAOPP. This tool will be helpful in predicting operons in newly sequenced prokaryotes. The tool is freely available at http://biomedinformri.com/gaopp/
  

- Quantifying Gene Co-Expression Heterogeneity in Cancer Towards Efficient Network Biomarker Design
  
  Authors: Shang Gao, Abdullah Sarhan, Reda Alhajj, Jon Rokne, Doug Demetrick and Jia Zeng
  
  https://doi.org/10.2174/157489361003150723134952
  
  It is well known that cancer is a highly heterogeneous disease, and the predictive capability of the targeted gene signature approach suffers from inter-tumor heterogeneity. Here we propose a framework to quantify the molecular heterogeneity of tumors from a gene-gene relational perspective using co-expression networks and interactome data. We believe that, to understand individualized gene behavior across patients, the relational status of genes needs to be considered, because a complex disease phenotype is often caused by failures of genetic interactions in cancer cells. We quantified gene-gene relational heterogeneity in a benchmark data set using co-expression networks inferred from microarray data, and showed that genes related to breast cancer metastasis can be stratified into different classes based on their relational status obtained from pair-wise comparisons of co-expression networks. Further, we used the relational heterogeneity information to predict patient survival and found that a relationally heterogeneous gene set is less predictive than relatively conserved cancer genes. We explored heterogeneity gene sets using interactome data and identified densely connected components that are causal to inter-tumor heterogeneity. We independently validated our approach with two patient cohorts. Our results demonstrate the efficiency of using heterogeneity information to design network markers.
  

- A Hybrid Approach Based on Pattern Recognition and BioNLP for Investigating Drug-Drug Interaction
  
  Authors: Rabia Javed, Saima Farhan and Salman Humdullah
  
  https://doi.org/10.2174/157489361003150723135136
  
  In the field of drug research and development, the investigation of drug-drug interactions (DDIs) is a vital research area. Many clinical tools used in the industry provide broad lists of DDIs. However, these tools can return unpredictable results and are limited to specific types of interaction. They also lack a synchronized database of drug-drug interactions. In our research work, we propose a novel pattern recognition based technique that investigates natural language processing patterns for the extraction of drug-drug interactions. The proposed technique is novel in the sense that it uses a smaller feature set and is computationally less expensive, yet yields better results than previous techniques. The proposed technique is based on bioinformatics and natural language processing (NLP). For this research paper, we collected biomedical data such as drug names, drug identification numbers (IDs) and different types of drug-drug interaction sentences from DrugBank (a free online resource). By parsing the sentences, we investigated the patterns for drug-drug interaction extraction. To evaluate the performance of our proposed work, we applied three different classification models, i.e. Naïve Bayes, J48 (decision tree) and Random Forest. The results achieved by our technique are: F-score 82.4%, Precision 81.6% and Recall 83.2%. The best accuracy, 99.0%, is achieved with Random Forest. Comparison with previous research shows that our proposed technique provides better results.
  

- DualPred: A Webserver for Predicting Plant Proteins Dual-Targeted to Chloroplast and Mitochondria Using Split Protein-Relatedness-Measure Feature
  
  Authors: Vijayakumar Saravanan and Palanisamy Thanga Velan Lakshmi
  
  https://doi.org/10.2174/1574893609666140226000041
  
  Plant cells contain two major organelles, chloroplasts and mitochondria, which play key roles in energy metabolism as well as in regulating a number of prominent processes. Proteins that are located in both chloroplasts and mitochondria presumably function distinctly in each location, so knowledge of protein localization is vital. Hence, a webserver (DualPred) was designed to predict plant dual-targeted proteins (chloroplast and mitochondria) using a novel split protein-relatedness-measure feature and AdaBoost-J48 as a classifier. DualPred adopts two-layer prediction to distinguish plant proteins dual-targeted to chloroplast and mitochondria from other localized proteins. DualPred was rigorously trained and tested with different benchmark datasets and a newly developed independent dataset. Statistical techniques including K-fold cross-validation, detailed ROC analysis, Matthews correlation, and area under ROC curves were used to assess the performance of DualPred. In predicting dual-targeted proteins, DualPred achieved an overall accuracy of 85.0% and 91.9% in a 10-fold cross validation on the new DT167 dataset and the benchmark dataset, respectively. Validation with the independent dataset (DT167 as model and benchmark dataset as model) achieved overall accuracy of 89.2% and 86.8%, respectively. The Matthews correlation and area under ROC curves for the classifiers on different datasets were also found to be significant. Hence, based on the results of the various validation tests, it is evident that the novel feature representation was effective in distinguishing plant proteins dual-targeted to chloroplast and mitochondria from other localized proteins. DualPred, the web server implementation of the algorithm, is written in Perl and can be accessed freely through http://pcmpred.bicpu.edu.in/predict.php.
  

- An Improved Mathematical Object for Graphical Representation of DNA Sequences
  
  Authors: Yan Peng and Yuewu Liu
  
  https://doi.org/10.2174/157489361003150723135559
  
  An improved mathematical object is presented for analyzing the similarity and dissimilarity of DNA sequences. Its core idea is to use, separately, the changes in the x-coordinates and in the y-coordinates of the points that correspond to the bases of a DNA sequence in the graphical representation. Compared with traditional methods based on the distances between points, the new method considers the measurement between points in more detail, so more of the information in the DNA sequence is exploited. It can distinguish any form of symmetrical curves corresponding to DNA sequences, an advantage that traditional methods can hardly achieve. Furthermore, comparative experiments show that this method increases the precision by about 50%.
  

- Neighboring-Site Effects of Amino Acid Substitutions in the Mouse Genome
  
  Authors: Mingchuan Fu, Hongxia Pang, Jian Cheng and Shiheng Tao
  
  https://doi.org/10.2174/1574893609666140716171245
  
  Nucleotide evolution models benefit greatly from the reported neighbor-dependent nucleotide mutations. Investigating the neighboring-site effects of amino acid substitutions may likewise promote the development of protein evolution models. Here, the neighboring-site effects of amino acid substitutions in the mouse genome are evaluated by grouping the 20 amino acids into four categories: nonpolar neutral (NON), polar neutral (NEU), positive (POS) and negative (NEG) amino acids. Our data indicate that amino acid substitutions are evidently neighboring-site dependent, and the most prominent bias is the NEG→NEG substitution occurring in the NEG_NEG context, the frequency of which is 2-fold higher than expected. The neighboring-site effects are also correlated with some types of protein secondary structure. From this study, we conclude that, like neighbor-dependent nucleotide mutations, amino acid substitutions are also neighboring-site dependent. The mutation bias of the nucleotide sequence and natural or functional selection on protein structure might be two underlying reasons for the neighboring-site effects of amino acid substitutions in the mouse genome.
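
The neighboring-site analysis described in the last abstract above can be illustrated with a minimal sketch. It groups residues into the four categories named in the abstract, counts substitutions by the classes of their two flanking residues, and compares the observed share of each substitution type within a context to its share across all contexts. Everything beyond the four category names is an assumption made for illustration: the assignment of individual amino acids to categories, the requirement that both neighbors be conserved, the definition of "expected", the function name `substitution_enrichment`, and the made-up toy sequences. It is not the authors' method, and any ratio it prints reflects the toy data only.

```python
from collections import Counter

# Assumed grouping of the 20 amino acids into the four categories named in
# the abstract; the paper's exact assignment is not given there.
CLASSES = {
    "NON": "GAVLIMFWP",  # nonpolar neutral
    "NEU": "STCYNQ",     # polar neutral
    "POS": "KRH",        # positive
    "NEG": "DE",         # negative
}
AA_TO_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def substitution_enrichment(pairs):
    """Return {(context, change): observed / expected} for aligned sequence pairs.

    context  = (class of left neighbor, class of right neighbor)
    change   = (class before substitution, class after substitution)
    observed = share of `change` among substitutions seen in `context`
    expected = share of `change` among all substitutions (an assumed null model)
    """
    by_context_change = Counter()
    by_change = Counter()
    by_context = Counter()
    for old, new in pairs:
        if len(old) != len(new):
            raise ValueError("sequences must be aligned and of equal length")
        for i in range(1, len(old) - 1):
            if old[i] == new[i]:
                continue  # no substitution at this site
            if old[i - 1] != new[i - 1] or old[i + 1] != new[i + 1]:
                continue  # assumption: only count sites with conserved neighbors
            try:
                context = (AA_TO_CLASS[old[i - 1]], AA_TO_CLASS[old[i + 1]])
                change = (AA_TO_CLASS[old[i]], AA_TO_CLASS[new[i]])
            except KeyError:
                continue  # skip gaps and ambiguous residues
            by_context_change[(context, change)] += 1
            by_change[change] += 1
            by_context[context] += 1

    total = sum(by_change.values())
    ratios = {}
    for (context, change), count in by_context_change.items():
        observed = count / by_context[context]
        expected = by_change[change] / total
        ratios[(context, change)] = observed / expected
    return ratios


# Made-up (ancestral, derived) toy pairs, purely to exercise the function.
TOY_PAIRS = [
    ("EDE", "EEE"),
    ("DED", "DDD"),
    ("ADA", "AEA"),
    ("AKA", "ASA"),
    ("GRG", "GTG"),
    ("LDL", "LKL"),
]

if __name__ == "__main__":
    for key, ratio in sorted(substitution_enrichment(TOY_PAIRS).items()):
        print(key, round(ratio, 3))
```

A ratio above 1 means the substitution type is over-represented in that context relative to the assumed null model. A real analysis would need a carefully justified expectation and a significance test, neither of which is specified in the abstract.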
  

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan