Current Bioinformatics - Volume 9, Issue 2, 2014
-
-
Computational Methods for the Prediction of Microbial Essential Genes
Authors: Yao Lu, Jingyuan Deng, Matthew B. Carson, Hui Lu and Long J. Lu
Essential genes often play key roles in biological processes, and mutations in these genes will have a great impact on an organism’s survival and reproduction. Studying lethal phenotypes will provide important information about the function of the gene product and direct gene therapy. Traditionally, essential genes have been identified through single-gene knockout experiments, transposon mutagenesis, or antisense RNA inhibition. However, experimental methods are expensive, labor-intensive, and time-consuming. In addition, such experiments are not always possible, as the vast majority of microorganisms are unculturable. Computational methods for genome-scale essential gene prediction, aided by the explosion of genome-scale data provided by high-throughput technologies in recent years, provide an alternative way to study essential genes. Constraint-based modeling and machine learning have been used in this area and have achieved promising results. Information such as protein sequence, network topology, gene expression data and other features has been used to predict essential genes. In this article, we review recent bioinformatics progress in the prediction of gene essentiality, including databases, computational methods, the most commonly used features, machine learning classifier comparisons, and feature selection. Finally, we discuss the challenges and future directions of the field.
-
-
-
Analyzing Gene Expression and Codon Usage Bias in Diverse Genomes Using a Variety of Models
Authors: Satyabrata Sahoo and Shibsankar Das
Synonymous codon usage has long been known as a factor that affects the average expression level of proteins in microorganisms. A systematic approach to studying the role of codon usage bias in gene expression is described. The facts and ideas presented in this short review aim to derive biological information from genome sequences by means of various statistical analyses and appropriately designed algorithms. Using codon usage bias as a numerical estimator of gene expression, a comparative analysis of predicted highly expressed (PHE) genes was performed in bacteria, cyanobacteria, archaea, lower eukaryotes and higher eukaryotes. It is suggested that both codon usage and base composition at the three codon sites regulate individual gene expression. Any correlation between gene length and expression level, however, remains unexplained. The relationship between gene expression levels and synonymous codon usage provides an important line of evidence for translational selection and suggests a general mechanism underlying protein evolution.
-
-
-
Phylogenomics of Orthologous Protein Families in Prokaryotes: Comparison of Evolutionary Profiles
Authors: Rampriya Ramarathnam and Shankar Subramaniam
The sequence similarity relationships between the members of a protein family contain information on its evolutionary history, such as the relative time of horizontal transfer events, and the differential acceleration or deceleration of evolution in particular organisms in response to selective pressures. This paper presents a quantitative representation and comparison of evolutionary profiles of proteins, and finds correlations between evolutionary profile similarities and evolutionary or functional links between proteins. Using a dataset of 84 orthologous protein families ubiquitous in prokaryotes, we obtain the evolutionary profile of each family as a vector of inter-sequence distances. We then compare the family-specific evolutionary vectors and quantify the evolutionary similarity between families. Two primary methods for vector comparison were used, namely the angle between vectors and the correlation distance between vectors. Both approaches are powerful enough to recognize known evolutionary similarities and yield similar inter-family relationships, but they also display important differences. These differences are shown to exist because the two methods recognize different aspects of the evolutionary profile. The inter-vector angle is an effective measure of the difference in the overall form of phylogenetic trees even in cases where the topology of the tree is not well-defined, whereas the correlation distance is especially effective in recognizing similarities in topology. When the protein families are clustered based on either the angle or the correlation distance between them, the cluster dendrogram shows a core cluster consisting of ancient protein families with the standard phylogeny. In addition, evolutionary profile comparison also detects plausible evolutionary similarities between unannotated proteins and proteins of known function. For instance, the bacterial yjeF gene and the ygjD/ydiE gene are both predicted to be involved in cell envelope biogenesis. In summary, we describe quantitative comparisons of protein family specific evolutionary profiles, and illustrate their power in detecting broader evolutionary trends and specific functional relationships between proteins.
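The two vector comparisons named in this abstract can be illustrated with a minimal Python sketch. It assumes that "correlation distance" means one minus the Pearson correlation, which is the common definition but is not confirmed by the abstract; the function names and the toy distance vectors are hypothetical and are not taken from the paper.

```python
import math


def angle_between(u, v):
    """Angle in radians between two equal-length distance vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def correlation_distance(u, v):
    """One minus the Pearson correlation of two vectors (assumed definition)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [a - mean_u for a in u]
    cv = [b - mean_v for b in v]
    cov = sum(a * b for a, b in zip(cu, cv))
    spread_u = math.sqrt(sum(a * a for a in cu))
    spread_v = math.sqrt(sum(b * b for b in cv))
    return 1.0 - cov / (spread_u * spread_v)


# Hypothetical inter-sequence distance vectors for two protein families.
family_a = [0.10, 0.20, 0.30]
family_b = [1.10, 1.20, 1.30]  # family_a shifted by a constant offset

print(angle_between(family_a, family_b))         # clearly non-zero
print(correlation_distance(family_a, family_b))  # approximately zero
```

In this toy case the two vectors rise and fall together, so the correlation distance is near zero, while the constant offset still produces a non-zero angle. This is one simple way the two measures can disagree.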
-
-
-
Annotating Viral Genomes - A Cannon is Needed to Kill Mosquitoes
The majority of viruses have a small genome. However, these small genomes often have complex gene features with transcriptional and translational exceptions, for instance, gene overlapping, alternative splicing, RNA editing, ribosomal slippage and stop codon read-through. These complex features and exceptions increase gene density and improve the gene coding efficiency of viral genomes. They also pose immense challenges to gene prediction algorithms. Most gene prediction programs for eukaryotic and prokaryotic genomes cannot detect or predict these exceptions correctly. It is critical to predict these complex features and exceptions with high precision and accuracy in order to interpret viral genomic data correctly. This paper describes the most commonly used programs for viral gene predictions, focusing on the ab initio and similarity-based gene prediction programs, including GeneMarkS, ZCURVE_V, FgenesV, Phylo-HMM, MLOGD, GATU, VirGen, FLAN, VIGOR and others. Viral genome complex features and the basic algorithms of the gene prediction programs are introduced briefly, with identification of advantages and disadvantages, followed by a list of application scopes and specific features. Gene prediction programs for bacteriophages and viral meta-genomic sequences are reviewed separately. The last section of this review presents the future directions and challenges for viral gene prediction program development.
-
-
-
Identification of Marker Genes for Cancer Based on Microarrays Using a Computational Biology Approach
Rapid advances in gene expression microarray technology have enabled the discovery of molecular markers used for cancer diagnosis, prognosis, and prediction. One computational challenge in using microarray data analysis to create cancer classifiers is how to deal effectively with microarray data, which consist of many attributes (p) and few instances (n). Gene selection and classifier construction are the two key issues in this topic. In this article, we review the major methods for computational identification of cancer marker genes based on microarray gene expression data. We conclude that simple methods should be preferred to complicated ones for their interpretability and applicability.
-
-
-
A Review of the Computational Methods for Identifying the Over-Annotated Genes and Missing Genes in Microbial Genomes
Authors: Jia-Feng Yu, Zhen-Zhen Guo, Xiao Sun and Ji-Hua Wang
More and more studies indicate that the problem of finding protein-coding genes in microbial genomes is far from solved, and annotation quality has been questioned continuously over the past several years. In this paper, we summarize the computational methods for identifying over-annotated genes and missing genes, and offer a perspective on future gene-finding work.
-
-
-
Prediction of Translation Initiation Site in Bacterial and Archaeal Genomes
Authors: Huaiqiu Zhu and Qi Wang
Driven by the rapid growth in the number of complete genome sequences, genome annotation now relies mostly on automatic methods. For computational annotation of bacterial and archaeal genomes, accurate prediction of translation initiation sites (TISs) is essential for locating the protein-coding regions of genes. TIS prediction has therefore been a challenge for a number of gene finders and TIS processors, leading to recent studies of TIS prediction or correction in prokaryotic genome annotation as well as of the mechanism of translation initiation. It is time for the research community to review the available mathematical models of prokaryotic TISs, and the resulting algorithms used in current TIS processors and in the TIS prediction modules of gene finders. The TIS models have improved along with knowledge of the mechanism of translation initiation, and several studies of this mechanism in prokaryotic genomes are summarized. Using a few published data sets that are widely used to evaluate TIS identification, the performance of the existing methods is assessed and discussed in this article. We also discuss the relation between the algorithms and the understanding of the prokaryotic translation initiation mechanism, which sheds light on state-of-the-art studies of TIS prediction in bacterial and archaeal genomes.
-
-
-
Prediction and Classification of ABC Transporters in Geobacter sulfurreducens PCA Using Computational Approaches
Authors: Ashok Selvaraj, Venil Sumantran, Nupoor Chowdhary and Gopal Ramesh Kumar
Geobacter sulfurreducens PCA plays an important role in electricity production and bioremediation, as it can reduce uranium to an insoluble form and uses organic compounds as electron donors. ATP-binding cassette (ABC) transporters are important because they regulate respiration and biofilm formation, which in turn affect the rate of electricity production and bioremediation. We therefore focused on identifying ABC transporters by functional genomic re-annotation, KEGG pathway analysis, and phylogenetic analysis of hypothetical proteins of G. sulfurreducens PCA. Our prediction is based on five 1-dimensional tools including BLAST, family, domain, orthologous groups and signature recognition search. We consider a gene function prediction to be high-confidence when three or more functional prediction tools indicate the same function for a given hypothetical gene. From our integrated re-annotation approach, we predicted eleven new ABC transporters and phylogenetically sub-classified these genes based on sequence similarity with known sub-classes of ABC transporters. These eleven new ABC transporters were also identified by KEGG pathway analysis. Overall, the eleven newly predicted ABC proteins can be sub-classified as five ABC transporter substrate-binding proteins, three ABC-2 type transporters, two permeases, and one phosphate transporter.
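The high-confidence rule in this abstract (three or more of five tools agreeing) amounts to a simple vote count. The Python sketch below is illustrative only: the function name, the tool keys, and the example predictions are hypothetical, and the authors' actual pipeline is not described in the abstract.

```python
from collections import Counter


def consensus_function(predictions, min_votes=3):
    """Return the function named by at least min_votes tools, else None.

    predictions maps a tool name to its predicted function, or to None
    when that tool produced no hit for the gene.
    """
    votes = Counter(f for f in predictions.values() if f is not None)
    if not votes:
        return None
    function, count = votes.most_common(1)[0]
    return function if count >= min_votes else None


# Hypothetical predictions for one hypothetical protein, one per tool type.
example = {
    "blast": "ABC transporter permease",
    "family": "ABC transporter permease",
    "domain": "ABC transporter permease",
    "orthologous_groups": "membrane protein",
    "signature": None,
}

print(consensus_function(example))  # ABC transporter permease
```

With five tools and a threshold of three, at most one function can reach the threshold, so ties at the top cannot change the result.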
-
-
-
A Survey on Computational Approaches to the Discovery of microRNA Genes
By Ki-Bong Kim
For quite a while, the main focus of molecular biology has been on DNA, as the carrier of the genetic code, with RNA viewed merely as an intermediary player. However, it has lately become obvious that RNA plays much more important roles in cellular regulatory mechanisms. The discovery of various new types of RNA has provided a further boost to RNA research. As a result, research into small regulatory RNA molecules, and in particular microRNA (miRNA), has experienced an exponential gain in attention. miRNAs are short non-coding RNAs that regulate gene expression at the post-transcriptional level by directly cleaving targeted mRNAs or repressing translation. They are now recognized as one of the key regulators of gene expression, involved in almost every aspect of a cell's life from cell differentiation to apoptosis. Since the discovery of the very first miRNAs, lin-4 and let-7, computational methods have been indispensable tools that complement experimental approaches to understanding the biology of miRNAs. Computational approaches for miRNA studies can be classified into two main categories: miRNA gene finding and miRNA target prediction. This review focuses on miRNA gene finding rather than miRNA target prediction, which has been thoroughly reviewed in [1-5]. First, this paper briefly introduces the biological features of miRNA genes and summarizes the basic principles of in silico prediction. Next, it provides a comprehensive survey of specific methods that have been proposed in the field, and concludes with an outlook and some remarks.
-
-
-
Unveiling Molecular Basis of Fertilisation in Scleractinian Corals Using Extensive Genomic Information
By Akira Iguchi
Many aspects of the reproductive biology of scleractinian corals remain unknown. External fertilisation during broadcast spawning events generates the potential for interspecific hybridisation, yet whilst efficient cross-fertilisation often occurs in vitro and species boundaries are blurred due to introgression, in nature hybridisation happens rarely even between sympatric and highly cross-fertile species. One potential explanation of the discrepancy between the observed and potential level of hybridisation is temporal partitioning. An essential first step towards resolving this apparent paradox is an understanding of the molecular basis of fertilisation in corals. Here I summarise those aspects of fertilisation mechanisms in some of the best-characterised animal systems (Vertebrata, Mollusca, Echinodermata) under the premise that these may provide insight into the mechanisms dictating fertilisation in spawning corals. This leads me to propose that interactions involving integrins, ADAM family proteins and modifiers/co-receptors may underlie gamete interactions in corals. The identification of fast-evolving genes such as ADAMs also promises to provide candidates for roles in coral fertilisation. Overall, in this review I describe how new genomic information can shed light on the molecular basis of coral fertilisation and how this will help progress our understanding of the reproductive systems in this keystone group of reef-building organisms.
-
-
-
DNA Physical Parameters Modulate Nucleosome Positioning in the Saccharomyces cerevisiae Genome
Authors: Wei Chen, Hao Lin and Pengmian Feng
Nucleosome positioning plays essential roles in various cellular processes. Although many efforts have been made in this area, the rules defining nucleosome positioning are still elusive. In the present study, DNA physical parameters derived from atomistic molecular dynamics simulations were introduced to analyze nucleosomal and linker DNA sequences. The distinct structural patterns of nucleosomal and linker sequences indicate that DNA physical parameters are suitable for describing nucleosomal DNA sequences and for revealing the physical mechanisms of nucleosome positioning. Further analysis of DNA flexibility around regulatory regions indicates that nucleosome positioning is closely correlated with sequence flexibility. These results demonstrate that DNA physical parameters are useful for in silico prediction of nucleosome positioning.
-
-
-
A Novel Average Measure Approach to the Identification of Native-Like Protein Structures Among Decoy Sets
Authors: Juan Li, Caiyun Fang and Huisheng Fang
Predicting a protein structure is a great challenge that has fascinated researchers in different disciplines for many years. The prediction process consists mainly of two steps. As the first step, the generation of predicted models, grows ever faster, the second step, quality estimation of the predicted models (i.e., identification of native-like structures among them), becomes more and more important. In this study, we developed a simple and effective approach to identify the native-like protein structures among a set of decoys. Three different average measures were used in our study: the average rmsd (armsd), the average alignment score (AAS) and MAXSUB. This approach was evaluated on the Park-Levitt decoy set. Comparison of model quality revealed that a significant correlation existed between these parameters. For example, the average measure could be effectively used to identify native-like protein models. The performance of both armsd and AAS was better than that of clustering. Since many other measures could be used to assess the similarity between protein structures, other analogous approaches might also be useful for the identification of native-like proteins. Finally, the data showed that the approach performed better than other servers in predicting the targets in CASP6, CASP7, CASP9 and CASP10.
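One plausible reading of the average rmsd (armsd) measure is that each decoy is scored by its mean pairwise RMSD to all other decoys, and the decoy with the lowest score is taken as the most native-like. The abstract does not confirm this definition, so the Python sketch below should be read as an assumption; the function names and the toy RMSD matrix are hypothetical.

```python
def average_rmsd_scores(pairwise_rmsd):
    """Mean RMSD of each decoy to every other decoy.

    pairwise_rmsd is a symmetric square matrix (list of lists) of
    precomputed RMSD values; the diagonal is ignored.
    """
    n = len(pairwise_rmsd)
    scores = []
    for i in range(n):
        others = [pairwise_rmsd[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def most_central_decoy(pairwise_rmsd):
    """Index of the decoy with the lowest average RMSD (assumed selection rule)."""
    scores = average_rmsd_scores(pairwise_rmsd)
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical pairwise RMSD values (in angstroms) for three decoys.
rmsd = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]

print(average_rmsd_scores(rmsd))  # [2.5, 2.0, 3.5]
print(most_central_decoy(rmsd))   # 1
```

The sketch takes the RMSD matrix as given. Computing RMSD from coordinates requires structural superposition, which is outside the scope of this illustration.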
-
-
-
Exploring the Regulation Mechanism of miRNAs Transcription Using Genomic and Epigenetic Functional Annotations
Authors: Lihua Xie, Chun Liu, Honghui Yang, Hua Lin and Shenghua Sun
Uncovering the transcriptional regulation mechanism of mammalian miRNAs is crucial for understanding the role of these tiny regulators in cellular processes. Using 1,030 genomic and epigenetic functional annotations, we compared the features of the regulatory regions of miRNAs and protein-coding genes. A support vector machine (SVM) was used to quantify the contribution of each annotation group or group combination in distinguishing miRNA regulatory regions. We observed fewer repetitive DNA elements and SNPs, but more intensive DNA methylation, in the miRNA regulatory regions. In contrast, there are more predicted CpG islands and higher H3K9me1 levels in the regulatory regions of protein-coding genes. This analysis indicated that epigenetic factors such as DNA methylation and specific histone marks are the most informative groups, suggesting that epigenetic events may play a more important role in miRNA transcription than previously thought. Furthermore, these results also revealed interactive effects among the various genomic and epigenetic factors involved in miRNA transcription.
-
