Volume 11, Issue 5

Current Bioinformatics - Volume 11, Issue 5, 2016

- Meet Our Regional Editor:
  
  By Ivan Y. Iourov
  
  https://doi.org/10.2174/157489361105161101205455

- EDITORIAL (Thematic Issue: Selected Papers from the Annual PACBB Conference 2014: Special Issue)
  
  Authors: Miguel Rocha and Florentino Fdez-Riverola
  
  https://doi.org/10.2174/157489361105161101210018

- An HMM-Based Text Classifier Less Sensitive to Document Management Problems
  
  Authors: Adrián S. Vieira, Eva L. Iglesias and Lourdes B. Diz
  
  https://doi.org/10.2174/1574893611666160617094720
  
  Background: The performance of text classification techniques is commonly affected by the characteristics and representation of the document corpus itself. Of all the problems arising from the corpus, there are three major difficulties that classifiers must deal with: feature selection issues, the class imbalance problem and the size of the training set. Objective: The objective of this paper is to present a novel content-based text classifier called T-LHMM that is less sensitive to the text representation and the size of the corpus, and more efficient in terms of running time than other classification techniques. Method: To demonstrate this, we present a set of experiments performed on well-known biomedical text corpora. We also compare our classifier with k-Nearest Neighbours and Support Vector Machine models. Results and Conclusion: The experimental and statistical results show that the proposed HMM-based text classifier is indeed less sensitive to class imbalance, the size of the corpus and the vocabulary than the other classifiers. In addition, it is more efficient in terms of running time than the k-NN and SVM techniques.
  
- An Agent-Based Model to Associate Genomic and Environmental Data for Phenotypic Prediction in Plants
  
  Authors: Sebastien Alameda, Jean-Pierre Mano, Carole Bernon and Sebastien Mella
  
  https://doi.org/10.2174/1574893611666160617094329
  
  Background: One means of increasing in-field crop yields is the use of software tools that predict future yield values from past in-field trials and plant genetics. The traditional, statistics-based approaches lack environmental data integration and are very sensitive to missing and/or noisy data. Objective: In this paper, we show that a cooperative, adaptive Multi-Agent System can overcome the drawbacks of such algorithms. Method: The system solves the problem iteratively through cooperation among the constraints, which are modelled as agents. Results: The Agent-Based Model gives results comparable to those of other approaches, without having to preprocess or reconcile data. Conclusion: This collective and self-adaptive search for a solution functions like a heuristic that efficiently explores the solution space and is therefore able to consider both genetic and environmental data.
  
- Reconstruction of the Network of Experimentally Validated AMP-Drug Combinations Against Pseudomonas aeruginosa Infections
  
  Authors: Paula Jorge, Martín Pérez-Pérez, Gael Pérez Rodríguez, Florentino Fdez-Riverola, Maria Olivia Pereira and Anália Lourenço
  
  https://doi.org/10.2174/1574893611666160617093955
  
  Background: The combination of antimicrobial products is a promising biomedical strategy against the ever-growing number of resistant strains emerging in healthcare and community settings. Agents with alternative modes of action, such as antimicrobial peptides (AMPs), and the efficacy of combined actions are being evaluated. Despite the availability of various antimicrobial data repositories, a wealth of information remains scattered through the scientific literature. Objective: The aim of this work is to provide a global view of available interaction data and help design new antimicrobial studies. Method: We implemented an automated curation pipeline to produce the first ever network reconstruction of AMP-drug combinations. Results: This network relates to antimicrobial combinations experimentally tested against Pseudomonas aeruginosa infections and includes 239 combinations among AMPs and other antimicrobials. Conclusions: Reconstruction is ongoing, incorporating new experimental results for P. aeruginosa, and will soon be extended to other meaningful microbial pathogens. The network is publicly accessible at http://sing.ei.uvigo.es/antimicrobialCombination/.
  
- Biological Network Derivation by Positive Unlabeled Learning Algorithms
  
  Authors: Doruk Pancaroglu and Mehmet Tan
  
  https://doi.org/10.2174/1574893611666160617093509
  
  Background: In cases where only a single group (or class) of samples is available for a given problem, positive unlabeled learning algorithms can be applied. One such case is the interaction between pairs of biological/chemical entities, where only the set of interacting entities can be collected, not the “noninteracting” ones. Objective: We aim to improve the performance of deriving protein-protein and protein-ligand interactions. We argue that positive-unlabeled learning algorithms can be applied to this problem. Method: In this paper, we propose some modifications to two existing methods for protein-protein and protein-ligand interaction network derivation. First, we extend the algorithms to use Random Forests, and then we devise a voting-based ensemble classifier from the two. Results: We report the evaluation results of the proposed algorithms in comparison with the original methods and well-known biological network derivation algorithms. We achieved significant improvements in terms of different metrics. Conclusion: The results are promising in the sense that the proposed methods perform either competitively with or better than previous methods. This motivates us to apply the proposed methods to other data sets and similar problems.
  
- Reverse Vaccinology: An Epitope Based Approach to Design Vaccines
  
  Authors: Mehak Dangi, Bharat Singh and Anil K. Chhillar
  
  https://doi.org/10.2174/1574893610666150828193015
  
  Background: Conventional approaches to vaccinology were highly successful in bringing down the incidence of life-threatening infections in the 20th century. However, these approaches have limitations: they take decades to unravel the pathogens and antigens related to a disease, and even then only a few of the antigens are identified. A revolution in this field was therefore needed, and it appeared about a decade ago in the form of “Reverse vaccinology”. Objective: The objective of the present review is to present the facts and figures of the highly valuable approach of Reverse vaccinology, emphasizing its advancement over time. Conclusion: The methodology of Reverse vaccinology is elucidated in an easy and straightforward fashion. Various available tools for T-cell and B-cell epitope prediction are also discussed thoroughly for a better understanding of Reverse vaccinology.
  
- Computer Representations of Bioinformatics Models
  
  Authors: Tomasz Prejzendanc, Szymon Wasik and Jacek Blazewicz
  
  https://doi.org/10.2174/1574893610666150928193510
  
  Background: The choice of a bioinformatics tool used to design and analyse bioinformatics models is usually influenced by the choice of format used to store the models on the hard drive, and vice versa. This choice may seem irrelevant to researchers who design and analyse biological models using high-level computer software. However, the method for saving created models, which is usually opaque to users, can have important consequences for their work. Objective: To review various computer representations of bioinformatics models based on the structure of the files used to store them. Results: We define three classes of formats (formats with the internal structure hidden from the user, specialized programming languages, and controlled natural languages), provide examples, and list the advantages and disadvantages of each class. Finally, we present recommendations on the groups of tools researchers should use. Conclusion: The choice of the format used to store bioinformatics models can have important consequences for the entire investigation. A correctly chosen format provides all the functionality required by the modeller and offers high flexibility in analysing the model, making collaboration with other researchers easier and helping to develop the model further in the future. In our opinion, both users and developers tend to slightly underestimate the specialized programming languages and controlled natural languages, which should be studied in more depth.
  
- A Discriminative Feature Extraction Approach for Tumor Classification Using Gene Expression Data
  
  Authors: Qinglin Mei, Huaxiang Zhang and Cheng Liang
  
  https://doi.org/10.2174/1574893611666160728114747
  
  Background: Tumor classification is one of the most important applications of gene expression data. Due to high dimensionality in microarray data, dimensionality reduction plays a crucial role in tumor classification based on gene expression profiles. Objective: The primary objective of this study is to increase the accuracy of tumor classification by reducing the dimensionality of gene expression data with feature extraction methods. Method: In this paper, we propose a novel supervised feature extraction method for tumor classification called discriminant hybrid structure preserving projections. The proposed method utilizes hybrid representation to efficiently characterize the structure of gene expression data, where both neighbor representation and sparse representation are taken into account. Specifically, our algorithm enhances the data separability after dimensionality reduction by simultaneously minimizing the within-class distance and maximizing the between-class distance. Moreover, it employs an imbalanced adjustment factor during the extraction process to overcome the class imbalance problem in tumor datasets. Results: Experiments on five publicly available tumor datasets demonstrate the effectiveness of the proposed method in comparison with a number of state-of-the-art feature extraction and feature selection methods. Conclusion: The proposed algorithm can enhance the separability of data after projections and thus improve the tumor classification accuracy of gene expression data.
  
- An Analytical RNA Secondary Structure Benchmark for the RNA Inverse Folding Problem
  
  Authors: Javad Mohammadzadeh, Mohammad Ganjtabesh and Abbas Nowzari-Dalini
  
  https://doi.org/10.2174/1574893611666160527100850
  
  Background: RNA molecules play several fundamental roles in any living organism. The function of an RNA molecule is closely related to its three-dimensional conformation, which is referred to as the RNA tertiary structure. Since the experimental determination or computational prediction of the RNA tertiary structure is very complicated, tremendous efforts have been focused on the relatively simple RNA secondary structure. Objective: One of the interesting problems in this context is the RNA inverse folding problem. The goal of this problem is to computationally design an RNA sequence that folds into a given secondary structure. Different methods have been proposed to solve this problem, but each has been evaluated on its own dataset for accuracy and reliability, so no standard benchmark is available to compare these methods fairly. Method: In this paper, an analytical RNA secondary structure benchmark is constructed that can be used to compare the existing methods fairly and to measure their abilities. The topological properties of a previously introduced variation network over the RNA shapes are employed for the selection of RNA structures and the construction of the benchmark. Results: All existing methods are evaluated using different measures, including success rate, execution time, Boltzmann probability, energy value, and range of energy. In addition, all the methods are compared and ranked according to these measures. Using these evaluations, one can easily select an appropriate method for a specific usage.
  
- Modeling and Analyzing the Effects of Crosstalk in a Biochemical Pathway: A Study on Human mTOR Signaling Pathway
  
  Authors: Namrata Tomar and Rajat K. De
  
  https://doi.org/10.2174/1574893611666160808115416
  
  Background: Crosstalk is the phenomenon in which two or more biochemical pathways interact with each other. In the presence of many inputs (crosstalk) to a signaling pathway, there is a high chance of excess activation of the pathway. Therefore, to put a ‘brake’ on excessive activation, the pathway has to make extra effort in the form of regulatory loops. Objective: To design a crosstalk modeling study that analyzes the effect of crosstalk on a biochemical pathway, and to compare the mTOR signaling pathway with and without crosstalk. Methodology: We have modeled the crosstalk phenomenon in a signaling pathway, where the interacting pathway is considered as a hypothetical interacting entity, termed a ‘crosstalk node’. We first applied the methodology, namely Flux Balance Analysis (FBA), to a synthetic system with feedback inhibition and crosstalk, and then to the human mTOR signaling pathway to investigate the effect of crosstalk along with feedback inhibition. Apart from analyzing crosstalk, we have also explored, for the first time in the mTOR pathway, the idea of a ‘critical node’ in the form of the TSC1/TSC2 complex. We have modeled the crosstalk among the mTOR, Insulin, Wnt and MAPK pathways, and we represent the latter as ‘crosstalk nodes’. Results: With crosstalk nodes, we obtained higher concentrations for the regulators of the reactions that induce feedback inhibition in the pathway than in the pathway without crosstalk nodes. We have validated the results against existing experimental evidence. Conclusion: This is a novel approach to pathway analysis, in which one can integrate and model two pathway processes simultaneously to capture the impact of one pathway process on the other. The major difference from typical FBA is the simultaneous incorporation of a concentration factor, feedback inhibition and crosstalk into the model, which is the significance of this study.
  
- A Classification Method for Microarrays Based on Diversity
  
  Authors: Xubo Wang, Xiangxiang Zeng, Ying Ju, Yi Jiang, Zhujin Zhang and Wenqiang Chen
  
  https://doi.org/10.2174/1574893609666140820224436
  
  Background: The classification of microarray gene expression data has been an important research topic in bioinformatics. Objective: Given the unsatisfactory performance of basic classification methods, research on ensemble classifiers has shown ensembling to be an efficient way to increase classification accuracy. Method: In this paper, we propose a new diversity-based classification method, which combines a feature selection method based on clustering and an ensemble classifier, D3C, to improve the classification accuracy. D3C is a novel ensemble method that uses ensemble pruning based on k-means clustering, together with dynamic selection and circulating combination, to obtain diversity among classifiers. Results & Conclusion: We apply our proposed method to seven gene data sets. Experimental results reveal that our method outperforms the ensemble classifiers of prior research in accuracy for gene classification.
  
- ACKNOWLEDGEMENTS TO THE REVIEWERS
  
  https://doi.org/10.2174/157489361105161101214417

