Current Bioinformatics - Volume 11, Issue 3, 2016
Advances on Computational Methods for Identifying the Targets of microRNAs: A Review
Authors: Pengjuan Zhang and Chenghua Li
MicroRNAs (miRNAs) are small noncoding RNAs that regulate cellular functions by finely controlling the transcription and translation efficiency of their targets. Understanding the functions of miRNAs entails predicting their targets and dissecting the miRNA–messenger RNA regulatory network. As demand for miRNA target identification has grown, an increasing number of computational methods for rapidly screening potential miRNA targets have been developed. Improvements in current bioinformatics methods allow the efficient identification of miRNA targets. This review describes the progress in the prediction and identification of miRNA targets. It also discusses the remaining challenges and provides insights on future directions.
-
-
-
Efficient Gene Selection for Cancer Prognostic Biomarkers Using Swarm Optimization and Survival Analysis
The discovery of molecular prognostic cancer biomarkers is still a major scientific challenge. Several methodologies have been proposed to generate novel biomarker models for clinical outcome using gene expression as predictors, but they have drawbacks. For example, they (i) depend heavily on an initial ranking of univariate associations with survival times, (ii) are unable to generate compact multivariate predictors, (iii) are based on survival models other than Cox, or (iv) use aggregations and transformations of expression values instead of the gene expression directly. These issues complicate the evaluation of biomarkers in clinical trials and their implementation in medical practice, and they obscure the biomarkers' biological association with cancer. We propose a particle swarm optimization search engine coupled to multivariate Cox survival model fitting, which constrains the number of genes while minimizing deviance residuals, to identify prognostic cancer biomarkers. By evaluating the concordance index, log-rank test, correlation, the integrated discrimination improvement per feature, and the number of variables significantly associated with survival times, we show that many compact and highly predictive models can be found for six cancer datasets and a simulated cohort. We also show that our algorithm generates a competitive population of multivariate models with a wide variety of gene combinations, including genes that could not be found by a univariate methodology. In comparisons with other methods such as LASSO, Ridge, and Elastic Net, our algorithm shows similar or better results. We conclude that our algorithm generates highly predictive and compact models for clinical outcomes with a unique gene content, and prediction that is superior or comparable to other current feature selection methods. R and Java code are available in Supplementary Information and http://bioinformatica.mty.itesm.mx/?q=coxswarm.
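As an illustration of one of the evaluation measures named in this abstract, the following minimal Python sketch computes Harrell's concordance index for right-censored survival data. The function name, argument names, and the handling of tied times are assumptions made for this example; it is not taken from the authors' R and Java code.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the observation was censored
    risks  : predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Order the pair so that 'a' has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ranking.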
-
-
-
A Fast Comparison Algorithm to Measure the Accuracy of Ortholog Clusters
Authors: Sunshin Kim and KyuBum Kwack
Ortholog clusters are very important for functional annotation and for studies in comparative and evolutionary genomics. Their accuracy is, therefore, of considerable significance. However, the accuracy of ortholog clusters is very hard to calculate, because comparing every gene between two sets of ortholog clusters takes too much time when the search space spans many clusters. This study presents a fast comparison algorithm designed to measure the accuracy of a set of predicted ortholog clusters (POCs) against a manually curated standard set of reliable ortholog clusters (ROCs). The first step of the method identifies the sets of POCs and ROCs that share overlapping genes, using a procedure that recursively searches for and merges every element with a common ROC identification (ID) or a common POC ID; this reduces the number of comparisons between the two data sets in the following step. The second step quickly calculates the similarity between POCs and ROCs using the least-move algorithm. Our approach is a fully automated method for measuring the accuracy of a set of POCs based on KEGG Orthology (KO). In addition, 12 genomes from different domains were selected and used to compare the similarity measure from our algorithm with a consistency measure, under which a POC is considered consistent if all of its genes belong to a single ROC. This study concludes that the auxiliary process that reduces the large search space makes calculating the similarity between ROCs and POCs very efficient, and that our approach can provide more robust results than the current standard method based on the measurement of consistency.
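The consistency criterion and the grouping step described in this abstract can be sketched in Python as follows. This is only an illustration: clusters are assumed to be dictionaries mapping a cluster ID to a set of gene names, the function names are hypothetical, and the grouping uses an ordinary connected-components search as a stand-in for the authors' recursive merge procedure. The least-move similarity step is not reproduced here.

```python
from collections import defaultdict


def is_consistent(poc_genes, rocs):
    """A POC is consistent if all of its genes belong to a single ROC."""
    return any(poc_genes <= roc_genes for roc_genes in rocs.values())


def consistency_rate(pocs, rocs):
    """Fraction of predicted clusters that are consistent."""
    hits = sum(is_consistent(genes, rocs) for genes in pocs.values())
    return hits / len(pocs)


def overlap_groups(pocs, rocs):
    """Group POC and ROC IDs that are linked, directly or indirectly,
    through shared genes, so that later comparisons stay within a group."""
    gene_to_rocs = defaultdict(set)
    for rid, genes in rocs.items():
        for gene in genes:
            gene_to_rocs[gene].add(rid)

    # Undirected links between POC nodes and ROC nodes that share a gene.
    links = defaultdict(set)
    for pid, genes in pocs.items():
        for gene in genes:
            for rid in gene_to_rocs.get(gene, ()):
                links[("POC", pid)].add(("ROC", rid))
                links[("ROC", rid)].add(("POC", pid))

    # Connected components by iterative depth-first search.
    seen, groups = set(), []
    for start in links:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(links[node] - seen)
        groups.append(component)
    return groups
```

Clusters that share no gene with the other data set do not appear in any group, which is what shrinks the search space before the similarity calculation.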
-
-
-
HSS-Bin: An Unsupervised Metagenomic Binning Method Based on Hybrid Sequence Feature Recognition and Spectral Clustering
Authors: Xiao Ding, Chang-Chang Cao, Xu-Ying Liu, Fu-Dong Cheng, Xing Luo and Xiao Sun
Rapidly developing next-generation sequencing technologies significantly promote metagenomics research, yet also present extreme challenges in the analysis of metagenomic data. Metagenomic samples can contain thousands of microbial species; thus, sequencing datasets can contain fragments from thousands of different genomes. Therefore, clustering the sequencing reads by their genomes of origin, namely binning, is usually done to expedite further studies. Currently, binning methods are divided into two categories: supervised methods (which require reference genomes) and unsupervised methods (which do not). We present an unsupervised binning method that combines a novel sequence feature recognition method with a spectral clustering algorithm. The sequence feature is a hybrid of sequence correlation and sequence composition analyses. Experiments on simulated and actual metagenomic datasets suggest that the combination of sequence composition and an intrinsic correlation of oligonucleotides, both extracted from tetranucleotide analyses, performs better than any single feature. A spectral clustering algorithm, which is a high-performance unsupervised clustering method, is also applied in our binning method. The method is available as an open source package called HSS-bin (Hybrid Sequence feature and Spectral clustering unsupervised metagenomic binning) at http://bioinfo.seu.edu.cn/HSS-bin/. We evaluated HSS-bin’s performance using both simulated and actual metagenomic datasets. Experimental results indicate that HSS-bin can handle metagenomic sequencing data with non-uniform species abundance, short sequences, and complex phylogenetic diversity with high accuracy. Our method performs well on actual metagenomic datasets and on datasets simulated from a complex metagenomic community.
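To make the notion of a tetranucleotide composition feature concrete, the sketch below computes a 256-dimensional vector of relative tetranucleotide frequencies for one sequence. It is a generic illustration with a hypothetical function name; it covers only the composition part, not the correlation feature or the spectral clustering step, and it is not the feature definition used by HSS-bin.

```python
from itertools import product


def tetranucleotide_frequencies(sequence):
    """Relative frequencies of all 256 tetranucleotides, ordered AAAA..TTTT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)  # sequence too short or fully ambiguous
    return [counts[k] / total for k in kmers]
```

Vectors of this kind, one per read or contig, are the sort of input on which an unsupervised clustering algorithm can then operate.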
-
-
-
3D-QSAR and Docking Simulation Studies of Some Benzopyrone Derivatives as Inhibitors for Breast Cancer Stem Cell Growth via P-Glycoprotein Mediated Efflux
Authors: Anushree Tripathi and Krishna Misra
Benzopyrone derivatives (coumarins) are well-known inhibitors of P-glycoprotein (P-gp) mediated efflux. The high expression level of these efflux proteins promotes the growth of breast cancer stem cells (CSCs). The activity of breast CSCs is directly affected by the inhibition of efflux proteins by benzopyrone derivatives. Ligand-based pharmacophore studies and structure-based docking studies were used to assess this inhibitory activity. Based on the QSAR results, a three-point pharmacophore comprising one hydrogen bond acceptor (A) and two condensed aromatic groups (R) has been designed. The atom-based QSAR study was conducted to predict partial least squares (PLS) statistical factors for the test and training data sets. Some specific amino acids have been demonstrated to be actively involved in the ligand-protein interaction. These structural features of the ligands and the active site residues of the target protein provide new pathways to develop therapeutically important drugs for the inhibition of breast cancer stem cells.
-
-
-
In Silico Study of Ethylene Biosynthesis: Seeking New Effectors of ACC Synthase and ACC Oxidase
Authors: Nada Ayadi, Sarra Aloui, Rabeb Shaiek, Oussama Rokbani, Faten Raboud and Sami Fattouch
The phytohormone ethylene plays essential physiological roles throughout the life cycle of plants, mainly by promoting the ripening and senescence of fruits and flowers. The sharp climacteric increase in ethylene production is accompanied by a surge in 1-aminocyclopropane-1-carboxylate (ACC) synthase (ACS) and ACC oxidase (ACO) activities. ACS converts S-adenosyl-L-methionine (SAM) to ACC, whereas ACO converts ACC to ethylene. Pretreating fruits or plants with specific ethylene-biosynthesis inhibitors can slow ripening and senescence for agronomic and commercial purposes. In the present study, in silico molecular docking was used to predict the interaction of the available 3D structures of both ACS (Malus domestica and Solanum lycopersicum) and ACO (Petunia hybrida) with a range of potential inhibitors. The data obtained revealed that (2E,3E)-4-(2-aminoethoxy)-2-[({3-hydroxy-2-methyl-5[(phosphonooxy)methyl]pyridin-4-yl}methyl)imino]but-3-enoic acid (PPG) has the best inhibitory effect on ACS (energy score, ΔG = -202.52 kcal/mol), stronger than that of other widely reported vinylglycine and SAM analogues (ΔG > -118.22 kcal/mol). The present findings showed that 2-[(3-hydroxy-2-methyl-5-phosphonooxymethyl-pyridin-4-ylmethyl)-imino]-5-phosphonopent-3-enoic acid (HEN), a PPG analogue, exhibits a strong binding capacity (ΔG = -217.41 kcal/mol) to ACS, supporting its potential use as a new effector to delay fruit ripening. 2-Amino-7-(4-methylphenyl)-7,8-dihydro-5(6H)-quinazolinone could be the most appropriate uncompetitive inhibitor of ACS (ΔG = -87.65 kcal/mol). 2-(9H-Fluoren-9-ylmethoxycarbonylamino)-2-methylpropanoic acid, a 2-aminoisobutyric acid analogue, was identified as a new preservative for plants (ΔG = -109.71 kcal/mol). The discovery of such chemical compounds will be helpful in ethylene-biosynthesis research and can offer potentially useful agrochemicals for quality improvement in post-harvest agricultural products that will benefit both local and export markets.
-
-
-
Amyloid Motif Prediction Using Ensemble Approach
Authors: Smitha Sunil Kumaran Nair and N. V. Subba Reddy
Misfolding of proteins results in amyloidosis, a condition in which amyloid motifs build up in neuronal tissues, leading to life-threatening organ failure. It is therefore important to understand the underlying cause of incorrect protein folding and then to identify such peptide motifs. This research proposes a distinctive ensemble approach that fuses diverse structural information with sequence-based features to predict amyloid motifs computationally. The structural features are statistics based on root mean square deviation; the sequence-centered features exploit sequence similarity to preserve the sequence-order effect, together with physico-chemical properties selected by a novel hybridization of a machine learning classifier with a swarm intelligence algorithm. The proposed approach achieved considerably better predictive performance, in terms of sensitivity, specificity and balanced accuracy, than available predictors for discriminating amyloid motifs from non-amyloid motifs. Furthermore, the nested ensemble classifier and the bootstrap evaluation protocol were found to play a significant role in improving prediction accuracy.
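The three performance measures named in this abstract have standard definitions, which the following Python sketch spells out for binary labels (1 for an amyloid motif, 0 otherwise). The function name and label encoding are assumptions for this example and are unrelated to the authors' implementation.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, balanced accuracy) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy
```

Balanced accuracy is useful when the two classes are of unequal size, because it weights errors on each class equally.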
-
-
-
Identification of Differentially Expressed Gene Using Robust Singular Value Decomposition
Authors: Nishith Kumar, Md. Tofazzal Hossain, Eric J. Beh, Masahiro Sugimoto and Mohammed Nasser
Identification of differentially expressed genes (DEGs) in transcriptomic analyses is an important step in finding significantly activated or deactivated pathways. Outliers and missing values are commonly observed in microarray data; however, most available statistical methods do not deal with these issues, and their results are therefore frequently skewed and degraded. Here, we developed a novel technique that is robust against outliers and missing values: a dimension reduction procedure based on robust singular value decomposition (RSVD). The RSVD was evaluated in two numerical experiments, one on artificially prepared data and one on non-small cell lung cancer gene expression data. Four conventional techniques, Student’s t-test, SAM, Bayesian Robust Inference for Differential Gene Expression (BRIDGE) and Linear models for microarray data (Limma), were also applied. We evaluated the area under the curve (AUC) from receiver operating characteristic curves of these five methods using the two experiments with 50 different conditions. The AUC values of our method were significantly higher (p<0.05; Mann-Whitney test) than those of the other methods in both experiments. We believe our proposed technique is helpful for identifying biologically meaningful genes that change in noisy microarray data.
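For readers unfamiliar with the AUC used in this comparison, the sketch below computes it directly from its pairwise definition: the probability that a truly differentially expressed gene receives a higher score than a non-differentially expressed one, with ties counted as one half. The function name and inputs are assumptions for illustration; this is not the authors' evaluation code and it does not implement RSVD.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons (illustrative sketch).

    labels : 1 for truly differentially expressed genes, 0 otherwise
    scores : method output, where a higher score means stronger evidence
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Each (positive, negative) pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of genes, which is acceptable for a small illustration but slower than rank-based formulas on large data sets.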
-
-
-
A Comprehensive Analysis of Sequence Alignment Algorithms for Long-Read Sequencing
Authors: Yu Zhang, Jian Tai He, Yangde Zhang and Ke Zuo
As sequencing reads grow longer, more is demanded of long-read aligners. Short-read aligners are widely used because they are fast and accurate, but their approach is ill-suited to finding longer, gapped alignments with long indels. A wide variety of alignment algorithms and aligners have consequently been developed over the past few years. In this article, we survey the theoretical foundations that underlie long-read alignment and highlight the options and practical trade-offs that need to be considered. Through experiments on both simulated and real data, we illustrate the performance of these aligners in terms of accuracy, time and memory cost, and scalability on modern multi-core architectures. We also consider the future development of long-read alignment algorithms.
-
-
-
A Computational Study of Three Frequent Mutations of EGFR and their Effects on Protein Dimer Formation and Non-Small Cell Lung Cancer Drug Resistance
Authors: Zhiyong Shen, Debby D. Wang, Lichun Ma, Hong Yan, Maria P. Wong and Victor H.F. Lee
Drug resistance is a major problem for non-small cell lung cancer (NSCLC) treatment due to mutations in patients’ DNA sequences. High-throughput sequencing technology now makes human genome information easy to obtain, so personalized medicine can become a reality. Based on mutation data from 168 patients with stage IIIB and IV NSCLC, we use computational methods to predict the formation of homo-dimers and hetero-dimers and to compute the binding free energy of drug-protein complexes. For gefitinib and erlotinib, two drugs commonly used in patient therapy, we compute the possible 3D structure of the epidermal growth factor receptor (EGFR) mutant-inhibitor complex. Rosetta and Amber are used for molecular dynamics analysis and simulation. The PRISM protocol is used to predict the binding energy based on similar protein-protein interaction surfaces. Multiple factors, including changes in the surface geometry of the mutant protein, in the number of hydrogen bonds, and in the electronic properties of the surface, are taken into account when evaluating the binding free energy. Our results suggest that the mutation position is very important for dimer formation and affects the drug’s binding strength with EGFR. Mutations such as L858R and T790M, which do not occur on the protein interaction surface, can hardly affect the formation of dimers. Patients with the delE746_A750 mutation can be treated effectively with gefitinib rather than erlotinib. By comparing the binding free energies of homo- and hetero-dimer formation, we find that the L858R mutant tends to form a hetero-dimer rather than a homo-dimer.