Current Bioinformatics - Volume 11, Issue 5, 2016
An HMM-Based Text Classifier Less Sensitive to Document Management Problems
Authors: Adrián S. Vieira, Eva L. Iglesias and Lourdes B. Diz
Background: The performance of text classification techniques is commonly affected by the characteristics and representation of the document corpus itself. Of all the problems arising from the corpus, there are three major difficulties that classifiers must deal with: feature selection issues, the class imbalance problem and the size of the training set. Objective: The objective of this paper is to present a novel content-based text classifier called T-LHMM that is less sensitive to the text representation and the size of the corpus, and more efficient in terms of running time than other classification techniques. Method: To demonstrate this, we present a set of experiments performed on well-known biomedical text corpora. We also compare our classifier with k-Nearest Neighbours and Support Vector Machine models. Results and Conclusion: The experimental and statistical results show that the proposed HMM-based text classifier is indeed less sensitive to class imbalance, the size of the corpus and the vocabulary than the other classifiers. In addition, it is more efficient in terms of running time than the k-NN and SVM techniques.
An Agent-Based Model to Associate Genomic and Environmental Data for Phenotypic Prediction in Plants
Authors: Sebastien Alameda, Jean-Pierre Mano, Carole Bernon and Sebastien Mella
Background: One means of increasing in-field crop yields is the use of software tools that predict future yield values from past in-field trials and plant genetics. The traditional, statistics-based approaches lack environmental data integration and are very sensitive to missing and/or noisy data. Objective: In this paper, we show that a cooperative, adaptive Multi-Agent System can overcome the drawbacks of such algorithms. Method: The system solves the problem iteratively through cooperation between the constraints, which are modelled as agents. Results: The Agent-Based Model gives results comparable to other approaches, without having to preprocess or reconcile data. Conclusion: This collective and self-adaptive search for a solution functions like a heuristic that efficiently explores the solution space and is therefore able to consider both genetic and environmental data.
Reconstruction of the Network of Experimentally Validated AMP-Drug Combinations Against Pseudomonas aeruginosa Infections
Background: The combination of antimicrobial products is a promising biomedical strategy against the ever-growing number of resistant strains emerging in healthcare and community settings. Agents with alternative modes of action, such as antimicrobial peptides (AMPs), and the efficacy of combined actions are being evaluated. Despite the availability of various antimicrobial data repositories, a wealth of information remains scattered through the scientific literature. Objective: The aim of this work is to provide a global view of available interaction data and help design new antimicrobial studies. Method: We implemented an automated curation pipeline to produce the first-ever network reconstruction of AMP-drug combinations. Results: This network covers antimicrobial combinations experimentally tested against Pseudomonas aeruginosa infections and includes 239 combinations among AMPs and other antimicrobials. Conclusions: Reconstruction is ongoing, keeping pace with new experimental results for P. aeruginosa, and will soon be extended to other meaningful microbial pathogens. The network is publicly accessible at http://sing.ei.uvigo.es/antimicrobialCombination/.
Biological Network Derivation by Positive Unlabeled Learning Algorithms
Authors: Doruk Pancaroglu and Mehmet Tan
Background: In cases where only a single group (or class) of samples is available for a given problem, positive unlabeled learning algorithms can be applied. One such case is the interactions between various biological/chemical entity pairs, where only the set of interacting entities can be collected, not the “noninteracting” ones. Objective: We aim to improve the performance of deriving protein-protein and protein-ligand interactions. We argue that positive-unlabeled learning algorithms can be applied to this problem. Method: In this paper, we propose some modifications to two of the existing methods for protein-protein and protein-ligand interaction network derivation. First, we extend the algorithms to use Random Forests, and then we devise an ensemble classifier from these two based on voting. Results: We report the evaluation results of the proposed algorithms in comparison to the original methods and well-known biological network derivation algorithms. We achieved significant improvements in terms of different metrics. Conclusion: The results are promising in the sense that the proposed methods perform competitively with or better than previous methods. This motivates us to apply the proposed methods to other data sets and similar problems.
Reverse Vaccinology: An Epitope Based Approach to Design Vaccines
Authors: Mehak Dangi, Bharat Singh and Anil K. Chhillar
Background: Conventional approaches to vaccinology were highly successful in bringing down the incidence of life-threatening infections in the 20th century. However, these approaches have limitations: they take decades to unravel the pathogens and antigens related to a disease, and even then only a few of the antigens are identified. A revolution in this field was therefore needed; it appeared about a decade ago and is known as “Reverse vaccinology”. Objective: The objective of the present review is to set out the facts and figures of the highly valuable Reverse vaccinology approach, emphasizing its advancement over time. Conclusion: The methodology of Reverse vaccinology is explained in an easy and straightforward fashion. Various available tools for T-cell and B-cell epitope prediction are also discussed thoroughly for a better understanding of Reverse vaccinology.
Computer Representations of Bioinformatics Models
Authors: Tomasz Prejzendanc, Szymon Wasik and Jacek Blazewicz
Background: The choice of a bioinformatics tool used to design and analyse bioinformatics models is usually influenced by the choice of format used to store the models on the hard drive, and vice versa. This choice may seem irrelevant to researchers who design and analyse biological models using high-level computer software. However, the method for saving created models, which is usually opaque to users, can have an important impact on their work and several important consequences. Objective: To review various computer representations of bioinformatics models based on the structure of the files used to store them. Results: We defined three classes of formats (formats with the internal structure hidden from the user, specialized programming languages, and controlled natural languages) and provide examples as well as the advantages and disadvantages of each class. Finally, we present recommendations on the groups of tools researchers should use. Conclusion: The choice of the format used to store bioinformatics models can have important consequences for the entire investigation. A correctly chosen format provides all the functionality required by the modeller and offers high flexibility in analysing the model, making collaboration with other researchers easier and helping to develop the model further in the future. In our opinion, both users and developers tend to slightly underestimate specialized programming languages and controlled natural languages, which should be studied in more depth.
A Discriminative Feature Extraction Approach for Tumor Classification Using Gene Expression Data
Authors: Qinglin Mei, Huaxiang Zhang and Cheng Liang
Background: Tumor classification is one of the most important applications of gene expression data. Due to the high dimensionality of microarray data, dimensionality reduction plays a crucial role in tumor classification based on gene expression profiles. Objective: The primary objective of this study is to increase the accuracy of tumor classification by reducing the dimensionality of gene expression data with feature extraction methods. Method: In this paper, we propose a novel supervised feature extraction method for tumor classification called discriminant hybrid structure preserving projections. The proposed method utilizes a hybrid representation to efficiently characterize the structure of gene expression data, where both neighbor representation and sparse representation are taken into account. Specifically, our algorithm enhances the data separability after dimensionality reduction by simultaneously minimizing the within-class distance and maximizing the between-class distance. Moreover, it employs an imbalance adjustment factor during the extraction process to overcome the class imbalance problem in tumor datasets. Results: Experiments on five publicly available tumor datasets demonstrate the effectiveness of the proposed method in comparison with a number of state-of-the-art feature extraction and feature selection methods. Conclusion: The proposed algorithm can enhance the separability of data after projection and thus improve the tumor classification accuracy of gene expression data.
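The idea of a small within-class distance and a large between-class distance, mentioned in the abstract above, can be illustrated with a generic scatter computation. This is only a minimal sketch of the classical Fisher-style quantities, not the authors' discriminant hybrid structure preserving projections; the function names and the toy data are assumptions made for illustration.

```python
# Generic within-class / between-class scatter (illustrative, not the paper's method).

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def scatter(samples, labels):
    """Return (within, between) scatter for labelled samples.

    within  : sum of squared distances of samples to their class mean
    between : class-size-weighted squared distances of class means
              to the overall mean
    """
    overall = mean(samples)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    within = between = 0.0
    for members in by_class.values():
        centre = mean(members)
        within += sum(sq_dist(x, centre) for x in members)
        between += len(members) * sq_dist(centre, overall)
    return within, between

# Hypothetical toy data: two well-separated classes in two dimensions.
samples = [[0, 0], [0, 2], [4, 0], [4, 2]]
labels = ["a", "a", "b", "b"]
within, between = scatter(samples, labels)
```

A projection that separates classes well would tend to yield a low `within` value relative to `between` when the same computation is applied to the projected data.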
An Analytical RNA Secondary Structure Benchmark for the RNA Inverse Folding Problem
Authors: Javad Mohammadzadeh, Mohammad Ganjtabesh and Abbas Nowzari-Dalini
Background: RNA molecules play several fundamental roles in any living organism. The function of an RNA molecule is highly related to its three-dimensional conformation, which is referred to as the RNA tertiary structure. Since the experimental determination or computational prediction of the RNA tertiary structure is very complicated, tremendous efforts have been focused on the relatively simple RNA secondary structure. Objective: One of the interesting problems in this context is the RNA inverse folding problem. The goal of this problem is to computationally design an RNA sequence that folds into a given secondary structure. Different methods have been proposed to solve this problem, each of which has been evaluated on a specific dataset regarding accuracy and reliability, and therefore no standard benchmark is available to fairly compare these methods. Method: In this paper, an analytical RNA secondary structure benchmark is constructed that can be used to fairly compare the existing methods and to measure their abilities. The topological properties of a previously introduced variation network over the RNA shapes are employed for the selection of RNA structures and the construction of the benchmark. Results: All existing methods are evaluated using different measures, including success rate, execution time, Boltzmann probability, energy value, and range of energy. In addition, all the methods are compared and ranked against the mentioned measures. Using these evaluations, one can easily select an appropriate method for a specific usage.
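The inverse folding goal described above can be made concrete with a small sketch. A necessary (though far from sufficient) condition for a sequence to fold into a target secondary structure is that every paired position holds bases that can actually pair. The code below checks only that condition for a structure written in dot-bracket notation; it does no energy evaluation or folding, and the helper names are illustrative rather than taken from the paper or from any of the benchmarked tools.

```python
# Compatibility check between an RNA sequence and a dot-bracket structure.
# Assumption: only Watson-Crick and G-U wobble pairs are allowed.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}

def parse_pairs(structure):
    """Return the sorted list of (i, j) paired positions in a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return sorted(pairs)

def is_compatible(sequence, structure):
    """True if the lengths match and every paired position can form a base pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in parse_pairs(structure))

# Hypothetical toy hairpin: two stacked pairs closing a two-base loop.
candidate_ok = is_compatible("GCAAGC", "((..))")
candidate_bad = is_compatible("GAAAGC", "((..))")
```

An actual inverse folding method must go further and confirm, with a folding algorithm, that the designed sequence adopts the target structure; a check of this kind is only a cheap filter on candidate sequences.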
Modeling and Analyzing the Effects of Crosstalk in a Biochemical Pathway: A Study on Human mTOR Signaling Pathway
Authors: Namrata Tomar and Rajat K. De
Background: Crosstalk is the phenomenon in which two or more biochemical pathways interact with each other. In the presence of many inputs (crosstalk) to a signaling pathway, there is a high chance of excess activation. Therefore, to put a ‘brake’ on excessive activation, the pathway has to make extra effort in the form of regulatory loops. Objective: To design a crosstalk modeling study that analyzes the effect of crosstalk on a biochemical pathway, and to compare the mTOR signaling pathway with and without crosstalk. Methodology: We have modeled the crosstalk phenomenon in a signaling pathway, where the interacting pathway is treated as a hypothetical interacting entity, termed a ‘crosstalk node’. We first applied the methodology, viz., Flux Balance Analysis (FBA), to a synthetic system with feedback inhibition and crosstalk, and then to the human mTOR signaling pathway to investigate the effect of crosstalk along with feedback inhibition. Apart from analyzing crosstalk, we have also explored, for the first time in the mTOR pathway, the idea of a ‘critical node’ in the form of the TSC1/TSC2 complex. We have modeled the crosstalk among the mTOR, Insulin, Wnt and MAPK pathways, and we represent the latter as ‘crosstalk nodes’. Results: With crosstalk nodes, we obtained higher concentrations for the regulators of the reactions that induce feedback inhibition in the pathway than in the pathway without crosstalk nodes. We have validated the results against existing experimental evidence. Conclusion: This is a novel approach to pathway analysis, in which one can integrate and model two pathway processes simultaneously to capture the impact of one pathway process on the other. The major difference from typical FBA is the simultaneous incorporation of a concentration factor, feedback inhibition and crosstalk into the model, which is the significance of this study.
A Classification Method for Microarrays Based on Diversity
Authors: Xubo Wang, Xiangxiang Zeng, Ying Ju, Yi Jiang, Zhujin Zhang and Wenqiang Chen
Background: The classification of microarray gene expression data has been an important research topic in bioinformatics. Objective: Because basic classification methods perform unsatisfactorily, research on ensemble classifiers has shown ensembling to be an efficient way to increase classification accuracy. Method: In this paper, we propose a new diversity-based classification method, which combines a clustering-based feature selection method and an ensemble classifier, D3C, to improve classification accuracy. D3C is a novel ensemble method that uses ensemble pruning based on k-means clustering, dynamic selection and circulating combination, aiming to obtain diversity among classifiers. Results & Conclusion: We apply our proposed method to seven gene data sets. Experimental results show that, compared with prior research, our method outperforms other ensemble classifiers in accuracy for gene classification.