Volume 11, Issue 3

Current Bioinformatics - Volume 11, Issue 3, 2016

- Meet Our Editorial Board Member:
  
  By Feng-Biao Guo
  
  https://doi.org/10.2174/157489361103160615150634

- EDITORIAL:
  
  By Yi-Ping Phoebe Chen
  
  https://doi.org/10.2174/157489361103160615150833

- Advances on Computational Methods for Identifying the Targets of microRNAs: A Review
  
  Authors: Pengjuan Zhang and Chenghua Li
  
  https://doi.org/10.2174/1574893611666160114235726
  
  MicroRNAs (miRNAs) are small noncoding RNAs that regulate cellular functions by finely controlling the transcription and translation efficiency of their targets. Understanding the functions of miRNAs entails predicting their targets and dissecting the miRNA–messenger RNA regulatory network. As demand for the identification of miRNA targets has grown, an increasing number of computational methods for rapidly screening potential miRNA targets have been developed. Improvements in current bioinformatics methods allow the efficient identification of miRNA targets. This review describes the progress in the prediction and identification of miRNA targets. It also discusses the remaining challenges and provides insights on future directions.
  

- Efficient Gene Selection for Cancer Prognostic Biomarkers Using Swarm Optimization and Survival Analysis
  
  Authors: Raul Aguirre-Gamboa, Emmanuel Martinez-Ledesma, Hugo Gomez-Rueda, Rebeca Palacios, Isabel Fuentes-Hernandez, Emilio Sánchez-Canales, Rafael Chacolla-Huaringa, Servando Cardona-Huerta, Luis Villela, Sean-Patrick Scott, Jose Tamez-Pena and Victor Trevino
  
  https://doi.org/10.2174/1574893611999160610125628
  
  The discovery of molecular prognostic cancer biomarkers is still a major scientific challenge. Several methodologies have been proposed to generate novel model biomarkers for clinical outcome using gene expression as predictors, but they have drawbacks. For example, they (i) depend heavily on a ranking of the initial univariate relation to survival times, (ii) are unable to generate compact multivariate predictors, (iii) are based on survival models other than Cox, or (iv) use aggregation and transformations of expression values instead of the gene expression directly. These issues complicate the evaluation of biomarkers in clinical trials and their implementation in medical practice, and they obscure the biomarkers' biological association with cancer. We propose a particle swarm optimization search engine coupled to multivariate Cox survival model fitting, constraining the number of genes while minimizing deviance residuals, to identify prognostic cancer biomarkers. By evaluating the concordance index, log-rank, correlation, the integrated discrimination improvement per feature and the number of variables significantly associated with survival times, we show that many compact and highly predictive models can be found for six cancer datasets and a simulated cohort. We also show that our algorithm generates a competitive population of multivariate models with a wide variety of gene combinations, including genes that could not be found by a univariate methodology. In comparisons with other methods such as LASSO, Ridge, and Elastic Net, our algorithm shows similar or better results. We conclude that our algorithm generates highly predictive and compact models for clinical outcomes with a unique gene content, and prediction superior or comparable to that of other current feature selection methods. R and Java code are available in Supplementary Information and at http://bioinformatica.mty.itesm.mx/?q=coxswarm.
  

- A Fast Comparison Algorithm to Measure the Accuracy of Ortholog Clusters
  
  Authors: Sunshin Kim and KyuBum Kwack
  
  https://doi.org/10.2174/1574893611666160322233309
  
  Ortholog clusters are very important for functional annotation and for studies in comparative and evolutionary genomics. Their accuracy is, therefore, of considerable significance. However, the accuracy of ortholog clusters is very hard to calculate, because comparing every gene between two sets of ortholog clusters takes too much time given the huge search space across many clusters. This study presents a fast comparison algorithm designed to measure the accuracy of a set of predicted ortholog clusters (POCs) against a manually curated standard set of reliable ortholog clusters (ROCs). The first step identifies the sets of POCs and ROCs that share overlapping genes, using a procedure that recursively searches for and merges every element with a common ROC identification (ID) or a common POC ID; this reduces the number of comparisons between the two data sets in the following step. The second step then calculates the similarity between POCs and ROCs very quickly using the least-move algorithm. Our approach is a fully automated method for measuring the accuracy of a set of POCs based on KEGG Orthology (KO). In addition, 12 genomes from different domains were selected and used to compare the similarity measure computed by our algorithm with a consistency measure, under which a POC is considered consistent if all of its genes belong to a ROC. This study concludes that the auxiliary process for reducing the search space makes calculating the similarity between ROCs and POCs very efficient, and that our approach can provide more robust results than the current standard method based on the measurement of consistency.
  

- HSS-Bin: An Unsupervised Metagenomic Binning Method Based on Hybrid Sequence Feature Recognition and Spectral Clustering
  
  Authors: Xiao Ding, Chang-Chang Cao, Xu-Ying Liu, Fu-Dong Cheng, Xing Luo and Xiao Sun
  
  https://doi.org/10.2174/1574893611666151203222815
  
  Rapidly developing next-generation sequencing technologies significantly promote metagenomics research, yet also present extreme challenges in the analysis of metagenomic data. Metagenomic samples can contain thousands of microbial species; thus, sequencing datasets can contain fragments from thousands of different genomes. Therefore, clustering the sequencing reads by their original genomes, namely binning, is usually done to expedite further studies. Currently, binning methods are divided into two categories: supervised methods (which require reference genomes) and unsupervised methods (which do not). We present an unsupervised binning method that combines a novel sequence feature recognition method with a spectral clustering algorithm. The sequence feature is a hybrid of sequence correlation and sequence composition analyses. Simulation experiments, based on simulated and actual metagenomic datasets, suggest that the combination of sequence composition and an intrinsic correlation of oligonucleotides, both extracted from tetranucleotide analyses, performs better than any single feature. A spectral clustering algorithm, which is a high-performance unsupervised clustering method, is also applied in our binning method. The method is available as an open source package called HSS-bin (Hybrid Sequence feature and Spectral clustering unsupervised metagenomic binning) at http://bioinfo.seu.edu.cn/HSS-bin/. We evaluated HSS-bin’s performance using both simulated and actual metagenomic datasets. Experimental results indicate that HSS-bin can handle metagenomic sequencing data with non-uniform species abundance, short sequences, and complex phylogenetic diversity with high accuracy. Our method performs well on actual metagenomic datasets and on datasets simulated from a complex metagenomic community.
  

- 3D-QSAR and Docking Simulation Studies of Some Benzopyrone Derivatives as Inhibitors for Breast Cancer Stem Cell Growth via P-Glycoprotein Mediated Efflux
  
  Authors: Anushree Tripathi and Krishna Misra
  
  https://doi.org/10.2174/1574893611999160610125100
  
  Benzopyrone derivatives (coumarins) are well-known inhibitors of P-glycoprotein (P-gp) mediated efflux. The high expression level of these efflux proteins promotes the growth of breast cancer stem cells (CSCs). The activity of breast CSCs is directly affected by the inhibition of efflux proteins by benzopyrone derivatives. A ligand-based pharmacophoric study and structure-based docking studies have been used to assess this inhibitory activity. Based on QSAR results, a three-point pharmacophore comprising one hydrogen bond acceptor (A) and two condensed aromatic groups (R) has been designed. The atom-based QSAR study was conducted to predict partial least squares (PLS) statistical factors for test and training data sets. Some specific amino acids have been demonstrated to be actively involved in the ligand-protein interaction. These structural features of ligands and active site residues of the target protein provide new pathways to develop therapeutically important drugs for the inhibition of breast cancer stem cells.
  

- In Silico Study of Ethylene Biosynthesis: Seeking New Effectors of ACC Synthase and ACC Oxidase
  
  Authors: Nada Ayadi, Sarra Aloui, Rabeb Shaiek, Oussama Rokbani, Faten Raboud and Sami Fattouch
  
  https://doi.org/10.2174/1574893611666151228191020
  
  The phytohormone ethylene plays essential physiological roles throughout the life cycle of plants, mainly by promoting ripening and senescence of fruits and flowers. Accompanying the sharp climacteric increase in ethylene production, there is a surge of 1-aminocyclopropane-1-carboxylate (ACC) synthase (ACS) and ACC oxidase (ACO) activities. ACS converts S-adenosyl-L-methionine (SAM) to ACC, whereas ACO converts ACC to ethylene. Pretreating fruits or plants with specific ethylene-biosynthesis inhibitors can reduce ripening and senescence for agronomic and commercial purposes. In the present study, the in silico molecular docking method was used to predict the interaction of available 3D structures of both ACS (Malus domestica and Solanum lycopersicum) and ACO (Petunia hybrida) with a range of potential inhibitors. The data obtained revealed that (2E,3E)-4-(2-aminoethoxy)-2-[({3-hydroxy-2-methyl-5[(phosphonooxy)methyl]pyridin-4-yl}methyl)imino]but-3-enoic acid (PPG) has the best inhibitory effect on ACS (energy score, ΔG = -202.52 kcal/mol), stronger than that of other widely reported vinylglycine and SAM analogues (ΔG > -118.22 kcal/mol). The present findings showed that 2-[(3-hydroxy-2-methyl-5-phosphonooxymethyl-pyridin-4-ylmethyl)-imino]-5-phosphonopent-3-enoic acid (HEN), a PPG analogue, exhibits a strong binding capacity (ΔG = -217.41 kcal/mol) to ACS, supporting its potential use as a new effector to delay fruit ripening. 2-Amino-7-(4-methylphenyl)-7,8-dihydro-5(6H)-quinazolinone could be the more appropriate uncompetitive inhibitor of ACS (ΔG = -87.65 kcal/mol). 2-(9H-Fluoren-9-ylmethoxycarbonylamino)-2-methylpropanoic acid, a 2-aminoisobutyric acid analogue, was identified as a new preservative for plants (ΔG = -109.71 kcal/mol). The discovery of such chemical compounds will be helpful in ethylene biosynthesis research and can offer potentially useful agrochemicals for quality improvement in post-harvest agricultural products that will benefit both local and export markets.
  

- Amyloid Motif Prediction Using Ensemble Approach
  
  Authors: Smitha Sunil Kumaran Nair and N. V. Subba Reddy
  
  https://doi.org/10.2174/1574893611666151231185707
  
  Misfolding of proteins results in amyloidosis: a condition where amyloid motifs build up in neuronal tissues, leading to life-threatening organ failures. Hence, understanding the underlying cause of incorrect protein folding, followed by the identification of such peptide motifs, is important. This research effort proposes a distinctive ensemble approach that takes advantage of a diverse fusion of structural information and sequence-based features to predict amyloid motifs computationally. The diversity of the structure and sequence feature space derives from structural statistics based on root mean square deviation and from sequence-centered features that exploit sequence similarity to maintain the sequence-order effect, together with physico-chemical properties obtained after optimization via a novel hybridization of a machine learning classifier followed by a swarm intelligence algorithm. The proposed approach resulted in considerably better predictive performance, based on sensitivity, specificity and balanced accuracy, than available predictors for discriminating amyloid motifs from non-amyloid motifs. Furthermore, the nested ensemble classifier and the bootstrap evaluation protocol were found to have a significant role in improving prediction accuracy.
  

- Identification of Differentially Expressed Gene Using Robust Singular Value Decomposition
  
  Authors: Nishith Kumar, Md. Tofazzal Hossain, Eric J. Beh, Masahiro Sugimoto and Mohammed Nasser
  
  https://doi.org/10.2174/1574893611999160610124913
  
  Identification of differentially expressed genes (DEG) in transcriptomic analyses is one of the important tasks in finding significantly activated or deactivated pathways. Outliers and missing values are commonly observed in microarray data; however, most available statistical methods do not deal with these issues and, therefore, their analytical results are frequently skewed and degraded. Here, we developed a novel technique that is robust against outliers and missing values: a dimension reduction procedure based on robust singular value decomposition (RSVD). The RSVD was evaluated in two numerical experiments: one on artificially prepared data and one on non-small cell lung cancer gene expression data. Four conventional techniques, Student’s t-test, SAM, Bayesian Robust Inference for Differential Gene Expression (BRIDGE) and Linear models for microarray data (Limma), were also applied. We evaluated the area under the curve (AUC) from receiver operating characteristic curves of these five methods using the two experiments with 50 different conditions. The AUC values of our method were significantly higher (p<0.05; Mann-Whitney test) than those of the other methods in both experiments. We believe our proposed technique is helpful for the identification of biologically meaningful genes that change in noisy microarray data.
  

- A Comprehensive Analysis of Sequence Alignment Algorithms for Long-Read Sequencing
  
  Authors: Yu Zhang, Jian Tai He, Yangde Zhang and Ke Zuo
  
  https://doi.org/10.2174/1574893611666160115213144
  
  As sequencing read lengths increase, more is demanded of long-read aligners. Short-read aligners are widely used because they make alignment very fast and accurate, but the approach is ill-suited to finding longer, gapped alignments with long indels. A wide variety of alignment algorithms and aligners have consequently been developed over the past few years. In this article, we survey the theoretical foundations that underlie long-read alignment and highlight the options and practical trade-offs that need to be considered. Through extensive experiments on both simulated and real data, we illustrate the performance of these aligners in terms of accuracy, time and memory cost, and scalability on modern multi-core architectures. We also consider the future development of long-read alignment algorithms.
  

- A Computational Study of Three Frequent Mutations of EGFR and their Effects on Protein Dimer Formation and Non-Small Cell Lung Cancer Drug Resistance
  
  Authors: Zhiyong Shen, Debby D. Wang, Lichun Ma, Hong Yan, Maria P. Wong and Victor H.F. Lee
  
  https://doi.org/10.2174/1574893611666160322233746
  
  Drug resistance is a major problem in non-small cell lung cancer (NSCLC) treatment due to mutations in patients’ DNA sequences. High-throughput sequencing technology now makes it easy to obtain human genome information, so personalized medicine can become a reality. Based on mutation data from 168 patients with stage IIIB and IV NSCLC, we use computational methods to predict the formation of homo-dimers and hetero-dimers and to compute the binding free energy of drug-protein complexes. For gefitinib and erlotinib, two common drugs used in patient therapy, we compute the possible 3D structure of the epidermal growth factor receptor (EGFR) mutant-inhibitor complex. Rosetta and Amber are used for molecular dynamics analysis and simulation. The PRISM protocol is used to predict the binding energy based on similar protein-protein interaction surfaces. Multiple factors, including changes in the mutant protein's surface geometry, in the number of hydrogen bonds and in the electronic properties of the surface, are taken into account when evaluating the binding free energy. Our results suggest that the mutation position is very important for dimer formation and that it affects the drug’s binding strength with EGFR. Mutations such as L858R and T790M, which do not occur on the protein interaction surface, hardly affect the formation of dimers. Patients with the delE746_A750 mutation can obtain a good therapeutic response by using gefitinib instead of erlotinib. By comparing the binding free energies of homo-dimer and hetero-dimer formation, we find that the L858R mutant is inclined to form a hetero-dimer rather than a homo-dimer.
  
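As an illustration of the consistency criterion mentioned in the ortholog-cluster abstract above (Kim and Kwack), where a POC counts as consistent if all of its genes belong to a ROC, the following is a minimal Python sketch. It covers only that baseline measure, not the authors' merge step or least-move algorithm. The function name `consistency`, the dictionary layout, the cluster and gene IDs, and the assumption that each gene appears in at most one ROC are all hypothetical and chosen for illustration.

```python
from typing import Dict, Set


def consistency(pocs: Dict[str, Set[str]], rocs: Dict[str, Set[str]]) -> float:
    """Fraction of predicted clusters whose genes all lie in one reference cluster.

    Assumption (not from the source): each gene belongs to at most one ROC.
    """
    # Invert the reference clusters into a gene -> ROC ID lookup table.
    gene_to_roc = {gene: roc_id for roc_id, genes in rocs.items() for gene in genes}

    consistent = 0
    for genes in pocs.values():
        # Collect the ROC IDs hit by this POC; None marks a gene found in no ROC.
        roc_ids = {gene_to_roc.get(gene) for gene in genes}
        # Consistent only if every gene maps to the same, existing ROC.
        if len(roc_ids) == 1 and None not in roc_ids:
            consistent += 1

    return consistent / len(pocs) if pocs else 0.0


# Hypothetical toy data: P1 fits inside K1, P2 spans two ROCs,
# and P3 contains a gene ("x") that is absent from every ROC.
example_rocs = {"K1": {"a", "b", "c"}, "K2": {"d", "e"}}
example_pocs = {"P1": {"a", "b"}, "P2": {"c", "d"}, "P3": {"e", "x"}}
example_score = consistency(example_pocs, example_rocs)
```

On this toy input only P1 is consistent, so the score is one third. The all-or-nothing nature of the criterion is visible here: P2 is rejected even though each of its genes is correctly placed in some ROC, which is the kind of coarseness a graded similarity measure is meant to address.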

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan