Volume 9, Issue 2

Current Bioinformatics - Volume 9, Issue 2, 2014

- Editorial (Thematic Issue: Predict Various Types of Genes in Prokaryotic and Eukaryotic Genomes)
  
  By Feng-Biao Guo
  
  https://doi.org/10.2174/157489360902140313104032

- Computational Methods for the Prediction of Microbial Essential Genes
  
  Authors: Yao Lu, Jingyuan Deng, Matthew B. Carson, Hui Lu and Long J. Lu
  
  https://doi.org/10.2174/1574893608999140109113434
  
  Essential genes often play key roles in biological processes, and mutations in these genes have a great impact on an organism’s survival and reproduction. Studying lethal phenotypes provides important information about the function of the gene product and can direct gene therapy. Traditionally, essential genes have been identified through single-gene knockout experiments, transposon mutagenesis, or antisense RNA inhibition. However, experimental methods are expensive, labor-intensive, and time-consuming. In addition, such experiments are not always possible, as the vast majority of microorganisms are unculturable. Computational methods for genome-scale essential gene prediction, aided by the explosion of genome-scale data provided by high-throughput technologies in recent years, provide an alternative way to study essential genes. Constraint-based modeling and machine learning have been used in this area and have achieved promising results. Information such as protein sequence, network topology, gene expression data and other features has been used to predict essential genes. In this article, we review recent progress in bioinformatics on the prediction of gene essentiality, including databases, computational methods, the most commonly used features, machine learning classifier comparisons, and feature selection. Finally, we discuss the challenges and future directions of the field.
  

- Analyzing Gene Expression and Codon Usage Bias in Diverse Genomes Using a Variety of Models
  
  Authors: Satyabrata Sahoo and Shibsankar Das
  
  https://doi.org/10.2174/1574893608999140109114247
  
  Synonymous codon usage has long been known as a factor that affects the average expression level of proteins in microorganisms. A systematic approach to studying the role of codon usage bias in gene expression is described. The facts and ideas presented in this short review aim to derive biological information from genome sequences by means of various statistical analyses and appropriately designed algorithms. Using codon usage bias as a numerical estimator of gene expression, a comparative analysis of predicted highly expressed (PHE) genes was performed in bacteria, cyanobacteria, archaea, lower eukaryotes and higher eukaryotes. It is suggested that both codon usage and base composition at the three codon sites regulate individual gene expression. Any correlation between gene length and expression level, however, remains unexplained. The relationship between gene expression levels and synonymous codon usage provides an important line of evidence for translational selection and suggests some general mechanism underlying protein evolution.
  

- Phylogenomics of Orthologous Protein Families in Prokaryotes: Comparison of Evolutionary Profiles
  
  Authors: Rampriya Ramarathnam and Shankar Subramaniam
  
  https://doi.org/10.2174/157489360902140313105531
  
  The sequence similarity relationships between the members of a protein family contain information on its evolutionary history, such as the relative time of horizontal transfer events, and the differential acceleration or deceleration of evolution in particular organisms in response to selective pressures. This paper presents a quantitative representation and comparison of evolutionary profiles of proteins, and finds correlations between evolutionary profile similarities and evolutionary or functional links between proteins. Using a dataset of 84 orthologous protein families ubiquitous in prokaryotes, we obtain the evolutionary profile of each family as a vector of inter-sequence distances. We then compare the family-specific evolutionary vectors and quantitate the evolutionary similarity between families. Two primary methods for vector comparison were used, namely the angle between vectors and correlation distance between vectors. Both approaches are powerful enough to recognize known evolutionary similarities, and yield similar inter-family relationships, but they also display important differences. These differences are shown to exist because the two methods recognize different aspects of the evolutionary profile. The inter-vector angle is an effective measure of the difference in the overall form of phylogenetic trees even in cases where the topology of the tree is not well-defined, whereas the correlation distance is especially effective in recognizing similarities in topology. When the protein families are clustered based on either the angle or the correlation distance between them, the cluster dendrogram shows a core cluster consisting of ancient protein families with the standard phylogeny. In addition, evolutionary profile comparison also detects plausible evolutionary similarities between unannotated proteins and proteins of known function. 
  For instance, the bacterial yjeF gene and the ygjD/ydiE gene are both predicted to be involved in cell envelope biogenesis. In summary, we describe quantitative comparisons of protein family specific evolutionary profiles, and illustrate their power in detecting broader evolutionary trends and specific functional relationships between proteins.
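
The two vector comparisons named in this abstract can be illustrated with a minimal sketch. It assumes that the correlation distance is one minus the Pearson correlation coefficient, which the abstract does not state; the function names and the toy distance vectors are hypothetical and are not taken from the paper.

```python
import math


def angle_between(u, v):
    """Angle in radians between two equal-length distance vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def correlation_distance(u, v):
    """Assumed definition: 1 minus the Pearson correlation of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [a - mean_u for a in u]
    cv = [b - mean_v for b in v]
    cov = sum(a * b for a, b in zip(cu, cv))
    spread_u = math.sqrt(sum(a * a for a in cu))
    spread_v = math.sqrt(sum(b * b for b in cv))
    return 1.0 - cov / (spread_u * spread_v)


# Hypothetical inter-sequence distance vectors for three protein families.
family_a = [0.1, 0.4, 0.5, 0.9]
family_b = [2 * d for d in family_a]    # same profile, uniformly scaled
family_c = [d + 1.0 for d in family_a]  # same pattern, uniformly shifted

print(angle_between(family_a, family_b), correlation_distance(family_a, family_b))
print(angle_between(family_a, family_c), correlation_distance(family_a, family_c))
```

In this toy case the scaled vector is close to zero under both measures, while the shifted vector keeps a correlation distance near zero but a clearly non-zero angle. This is one simple way in which the two measures can respond to different aspects of a profile.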
  

- Annotating Viral Genomes - A Cannon is Needed to Kill Mosquitoes
  
  By Shiliang Wang
  
  https://doi.org/10.2174/1574893608999140109115849
  
  The majority of viruses have a small genome. However, these small genomes often have complex gene features with transcriptional and translational exceptions, for instance, gene overlapping, alternative splicing, RNA editing, ribosomal slippage and stop codon read-through. These complex features and exceptions increase gene density and improve the gene coding efficiency of viral genomes. They also pose immense challenges to gene prediction algorithms. Most gene prediction programs for eukaryotic and prokaryotic genomes cannot detect or predict these exceptions correctly. It is critical to predict these complex features and exceptions with high precision and accuracy in order to interpret viral genomic data correctly. This paper describes the most commonly used programs for viral gene predictions, focusing on the ab initio and similarity-based gene prediction programs, including GeneMarkS, ZCURVE_V, FgenesV, Phylo-HMM, MLOGD, GATU, VirGen, FLAN, VIGOR and others. Viral genome complex features and the basic algorithms of the gene prediction programs are introduced briefly, with identification of advantages and disadvantages, followed by a list of application scopes and specific features. Gene prediction programs for bacteriophages and viral meta-genomic sequences are reviewed separately. The last section of this review presents the future directions and challenges for viral gene prediction program development.
  

- Identification of Marker Genes for Cancer Based on Microarrays Using a Computational Biology Approach
  
  By Xiaosheng Wang
  
  https://doi.org/10.2174/1574893608999140109115649
  
  Rapid advances in gene expression microarray technology have enabled the discovery of molecular markers for cancer diagnosis, prognosis, and prediction. One computational challenge in using microarray data to build cancer classifiers is how to deal effectively with data that have a large number of attributes (p) and a small number of instances (n). Gene selection and classifier construction are the two key issues for this topic. In this article, we review the major methods for computational identification of cancer marker genes based on microarray gene expression data. We conclude that simple methods should be preferred to complicated ones for their interpretability and applicability.
  

- A Review of the Computational Methods for Identifying the Over-Annotated Genes and Missing Genes in Microbial Genomes
  
  Authors: Jia-Feng Yu, Zhen-Zhen Guo, Xiao Sun and Ji-Hua Wang
  
  https://doi.org/10.2174/1574893608999140109120612
  
  More and more studies indicate that the problem of finding protein-coding genes in microbial genomes is far from thoroughly solved, and annotation quality has been questioned continuously over the past several years. In this paper, we summarize the computational methods for identifying over-annotated genes and missing genes, and offer a perspective on future gene-finding work.
  

- Prediction of Translation Initiation Site in Bacterial and Archaeal Genomes
  
  Authors: Huaiqiu Zhu and Qi Wang
  
  https://doi.org/10.2174/1574893608999140109120345
  
  Driven by the rapid growth in the number of complete genome sequences, genome annotation now relies mostly on automatic methods. For computational annotation of bacterial and archaeal genomes, accurate prediction of translation initiation sites (TISs) is essential for locating the protein-coding regions of genes. TIS prediction has therefore been a challenge for a number of gene finders and TIS processors, leading to recent studies of TIS prediction or correction in prokaryotic genome annotation as well as of the mechanism of translation initiation. It is time for the research community to review the available mathematical models of prokaryotic TISs and the resulting algorithms used in current TIS processors and in the TIS prediction modules of gene finders. The TIS models have improved along with knowledge of the mechanism of translation initiation, and several studies of that mechanism in prokaryotic genomes are summarized. Using a few published data sets that are widely used to evaluate TIS identification, the performance of the existing methods is assessed and discussed in this article. The relation between the algorithms and the understanding of the prokaryotic translation initiation mechanism is also discussed, which sheds light on state-of-the-art studies of TIS prediction in bacterial and archaeal genomes.
  

- Prediction and Classification of ABC Transporters in Geobacter sulfurreducens PCA Using Computational Approaches
  
  Authors: Ashok Selvaraj, Venil Sumantran, Nupoor Chowdhary and Gopal Ramesh Kumar
  
  https://doi.org/10.2174/1574893608999140109113236
  
  Geobacter sulfurreducens PCA plays an important role in electricity production and bioremediation, as it can reduce uranium to an insoluble form, and uses organic compounds as electron donors. ATP-binding cassette (ABC) transporters are important as they regulate respiration and biofilm formation, which in turn affect the rate of electricity production and bioremediation. Thus we focused on identifying ABC transporters by functional genomic re-annotation, KEGG pathway analysis, and phylogenetic analysis of hypothetical proteins of G. sulfurreducens PCA. Our prediction is based on five 1-dimensional tools including BLAST, family, domain, orthologous groups and signature recognition search. We define a prediction of gene function with high confidence, when 3 or more functional prediction tools indicated the same function for a given hypothetical gene. From our integrated re-annotation approach, we predicted eleven new ABC transporters and phylogenetically sub-classified these genes based on sequence similarity with known sub-classes of ABC transporters. These eleven new ABC transporters were also identified by KEGG pathway analysis. Overall, these 11 newly predicted ABC proteins can be sub-classified as five ABC transporter substrate binding proteins, three ABC-2 type transporters, 2 permeases, and 1 phosphate transporter.
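
The confidence rule stated in this abstract (a function is assigned with high confidence when 3 or more of the five tools agree) can be sketched as a simple vote count. This is only an illustration: the dictionary keys, the function labels and the `consensus_function` helper are hypothetical and do not come from the paper.

```python
from collections import Counter


def consensus_function(predictions, min_votes=3):
    """Return the function named by at least `min_votes` tools, else None.

    `predictions` maps a tool name to its predicted function,
    or to None when the tool gave no prediction.
    """
    votes = Counter(p for p in predictions.values() if p is not None)
    if not votes:
        return None
    function, count = votes.most_common(1)[0]
    return function if count >= min_votes else None


# Hypothetical outputs of five search types for one hypothetical protein.
example = {
    "blast": "ABC transporter",
    "family": "ABC transporter",
    "domain": "ABC transporter",
    "orthologous_groups": None,
    "signature": "permease",
}
print(consensus_function(example))
```

With a threshold of three out of five tools, at most one function can reach the threshold, so ties for the top vote count never produce a high-confidence call.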
  

- A Survey on Computational Approaches to the Discovery of microRNA Genes
  
  By Ki-Bong Kim
  
  https://doi.org/10.2174/1574893608999140109113103
  
  For quite a while, the main focus of molecular biology has been on DNA, being the carrier of the genetic code, with RNA being viewed merely as an intermediary player. However, lately it has become obvious that RNA plays much more important roles in the cellular regulatory mechanisms. The discovery of various new types of RNA has provided a further boost to RNA research. As a result, research into small regulatory RNA molecules and in particular microRNA (miRNA) has experienced an exponential gain in attention. miRNAs are short non-coding RNAs that regulate gene expression at the post-transcriptional level by directly cleaving targeted mRNAs or repressing translation. They are now recognized as one of the key regulators of gene expression, involved in almost every aspect of a cell life from cell differentiation to apoptosis. Since the discovery of the very first miRNAs, lin-4 and let-7, computational methods have been indispensable tools that complement experimental approaches to understand the biology of miRNAs. Computational approaches for miRNA studies can be classified into two main categories - miRNA gene finding and miRNA target prediction. This review focuses on miRNA gene finding, not miRNA target prediction that has been thoroughly reviewed in [1-5]. First, this paper briefly introduces the biological features of miRNA genes and summarizes the basic principles of in silico prediction. Next, concluding with some outlook and remarks, it provides a comprehensive survey of specific methods that have been proposed in the field.
  

- Unveiling Molecular Basis of Fertilisation in Scleractinian Corals Using Extensive Genomic Information
  
  By Akira Iguchi
  
  https://doi.org/10.2174/1574893608999140109113829
  
  Many aspects of the reproductive biology of scleractinian corals remain unknown. External fertilisation during broadcast spawning events generates the potential for interspecific hybridisation, yet whilst efficient cross-fertilisation often occurs in vitro and species boundaries are blurred due to introgression, in nature hybridisation happens rarely even between sympatric and highly cross-fertile species. One potential explanation of the discrepancy between the observed and potential level of hybridisation is temporal partitioning. An essential first step towards resolving this apparent paradox is an understanding of the molecular basis of fertilisation in corals. Here I summarise those aspects of fertilisation mechanisms in some of the best-characterised animal systems (Vertebrata, Mollusca, Echinodermata) under the premise that these may provide insight into the mechanisms dictating fertilisation in spawning corals. This leads me to propose that interactions involving integrins, ADAM family proteins and modifiers/co-receptors may underlie gamete interactions in corals. The identification of fast-evolving genes such as ADAMs also promises to provide candidates for roles in coral fertilisation. Overall, in this review I describe how new genomic information can shed light on the molecular basis of coral fertilisation and how this will help advance our understanding of the reproductive systems in this keystone group of reef-building organisms.
  

- DNA Physical Parameters Modulate Nucleosome Positioning in the Saccharomyces cerevisiae Genome
  
  Authors: Wei Chen, Hao Lin and Pengmian Feng
  
  https://doi.org/10.2174/1574893608999140109113708
  
  Nucleosome positioning plays essential roles in various cellular processes. Although many efforts have been made in this area, the rules defining nucleosome positioning are still elusive. In the present study, DNA physical parameters derived from atomistic molecular dynamics simulations were introduced to analyze nucleosomal and linker DNA sequences. The distinct structural patterns of nucleosomal and linker sequences indicate that DNA physical parameters are suitable for describing nucleosomal DNA sequences and for revealing the physical mechanisms of nucleosome positioning. Further analysis of DNA flexibility around regulatory regions indicates that nucleosome positioning is closely correlated with sequence flexibility. These results demonstrate that DNA physical parameters are useful for in silico prediction of nucleosome positioning.
  

- A Novel Average Measure Approach to the Identification of Native-Like Protein Structures Among Decoy Sets
  
  Authors: Juan Li, Caiyun Fang and Huisheng Fang
  
  https://doi.org/10.2174/1574893608666131203224654
  
  Predicting a protein structure is a great challenge that has fascinated researchers in different disciplines for many years. The prediction process consists mainly of two steps. As the first step, the generation of predicted models, grows rapidly, the second step, quality estimation of the predicted models (that is, identification of native-like structures among them), becomes more and more important. In this study, we developed a simple and effective approach to identify native-like protein structures among a set of decoys. Three different average measures were used: the average RMSD (armsd), the average alignment score (AAS) and MAXSUB. The approach was evaluated on the Park-Levitt decoy set. Comparison of model quality revealed a significant correlation between these parameters. For example, the average measure could be effectively used to identify native-like protein models. The performance of both armsd and AAS was better than that of clustering. Since many other measures could be used to assess the similarity between protein structures, other analogous approaches might also be useful for the identification of native-like proteins. Finally, the data showed that its performance was better than that of other servers in predicting the targets in CASP6, CASP7, CASP9 and CASP10.
  

- Exploring the Regulation Mechanism of miRNAs Transcription Using Genomic and Epigenetic Functional Annotations
  
  Authors: Lihua Xie, Chun Liu, Honghui Yang, Hua Lin and Shenghua Sun
  
  https://doi.org/10.2174/157489361130800009
  
  Uncovering the transcriptional regulation mechanism of mammalian miRNAs is crucial to understand the role of these tiny regulators in cellular processes. Based on 1,030 genomic and epigenetic functional annotations, we compared these features of regulatory regions between miRNAs and protein-coding genes. Support vector machine (SVM) was used to quantify the contribution of each annotation group or group combination in distinguishing miRNA regulatory regions. We observed fewer repetitive DNA elements and SNPs, but more intensive DNA methylation in the miRNA regulatory regions. On the contrary, there are more predicted CpG islands and higher H3K9me1 levels in the regulatory regions of protein-coding genes. This analysis indicated that epigenetic factors such as DNA methylation and specific histone marks are the most informative groups, suggesting epigenetic events may play a more important role in miRNA transcription than previously thought. Furthermore, these results also revealed the interactive effects among various genomic or epigenetic factors involved in miRNA transcription.
  

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan