Current Bioinformatics - Volume 1, Issue 2, 2006
-
-
Recent Advances in RNA Secondary Structure Prediction with Pseudoknots
It has recently been recognized that pseudoknots play important roles in RNAs and that a considerable number of RNAs contain them. Recent studies on RNA secondary structure prediction have therefore focused on pseudoknots. Several algorithms based on dynamic programming have been developed. Although these algorithms guarantee an optimal solution, they suffer from high time complexity. Heuristic algorithms have therefore also been developed, some of which are expected to produce near-optimal solutions in reasonable CPU time. More recently, another approach based on comparative modeling has been proposed. Several practical programs and web-based servers have also been developed on the basis of these algorithms. The purpose of this review is to introduce the basic ideas behind important algorithms for RNA secondary structure prediction with pseudoknots and to reveal the relations among them.
-
-
-
Computational Analyses of Ancient Polyploidy
Authors: Kevin P. Byrne and Guillaume Blanc
Whole genome duplication has played a major role in the evolution of many eukaryotic lineages. Polyploidy has long been postulated as a powerful mechanism for evolutionary innovation, and recent analyses have provided convincing evidence that independent ancient genome duplications occurred in the ancestors of yeast, plants, vertebrates and fish. The growing availability of whole genome sequences has facilitated the detection and analysis of these polyploidizations. However, because polyploidy is often followed by massive gene loss and chromosomal rearrangements, identifying such events is not always easy. Here we review a wide array of increasingly sophisticated computational methods developed to identify the obscured traces of ancient polyploidy events in genomic sequences. These methods use a variety of analytical approaches, including comparative genomics, phylogenetics and molecular clock analyses. We also review recent research on the long-term evolution of genes and genomes duplicated by polyploidization. This has emerged as a fruitful field, utilizing genome-wide functional information and genomic sequence data to further our understanding of the impact of polyploidy on organismal biology and evolution.
-
-
-
Principles and Practices of Pathway Modelling
Authors: Karthik Raman, Preethi Rajagopalan and Nagasuma Chandra
The potential of systems-based approaches is increasingly being realised in drug discovery, metabolic engineering and related areas. Developments in high-throughput experimental techniques and the explosion of genomic data have fuelled progress in this area. Modelling and simulation of metabolic and regulatory pathways is an important step in systems analysis. In this review, we discuss the principles of pathway modelling, simulation techniques and current practices. A prerequisite for modelling and simulating metabolic pathways is an accurate description of the pathway landscape. Despite the availability of hundreds of annotated genome sequences, information about pathways is still largely incomplete. We highlight some of the methods for deriving pathway landscapes from biochemical literature and high-throughput experimental data. The conceptual framework for modelling, in terms of abstraction levels and schema for representation, is also presented. Next, we discuss several classes of techniques available for modelling and simulating systems formulated from pathway landscapes, viz. kinetic pathway modelling, interaction-based modelling and constraint-based modelling. The Systems Biology Markup Language as well as various pathway design and simulation tools are reviewed. The usefulness of various concepts and methodologies in areas such as drug discovery and metabolic engineering is illustrated with examples from the literature, with a note on future perspectives.
-
-
-
Is There a Real Bayesian Revolution in Pattern Recognition for Bioinformatics?
Recently, Bayesian statistical thinking has been regarded as a revolutionary force within genetics and bioinformatics. Novel computational algorithms have enabled the use of probability models of an unprecedented degree of complexity in many applications. Pattern recognition within bioinformatics is a multifaceted field which poses an enormous challenge for the Bayesian approach to data analysis. Advantages of this framework have been demonstrated for, e.g., de novo identification of gene regulatory binding motifs, identification of gene regulatory networks, and unsupervised classification of molecular marker data. However, as the complexity of data sets in bioinformatics is continuously increasing, it is likely that the conventional approaches to Bayesian computation will not yield feasible solutions in the future. Even currently, many large-scale problems are analyzed using traditional algorithmic solutions because of the extensive human and computing resources required by the Bayesian methods. The generic benefits of solid Bayesian modelling have been clearly demonstrated in the theoretical literature. Therefore, it would be ideal if Bayesian modelling and computational strategies evolved rapidly to meet the demands of users of the ever-growing amount of molecular information. Here we discuss potential courses for such an evolution, which could help to truly revolutionize statistical thinking in pattern recognition within bioinformatics.
-
-
-
Intervention in Probabilistic Gene Regulatory Networks
Authors: Aniruddha Datta, Ranadip Pal and Edward R. Dougherty
In recent years, there has been a considerable amount of interest in the area of Genomic Signal Processing, which is the engineering discipline that studies the processing of genomic signals. Since regulatory decisions within the cell utilize numerous inputs, analytical tools are necessary to model the multivariate influences on decision-making produced by complex genetic networks. Signal processing approaches such as detection, prediction and classification have been used in the recent past to construct genetic regulatory networks capable of modeling genetic behavior. To accommodate the large amount of uncertainty associated with this kind of modeling, many of the networks proposed are probabilistic. One of the objectives of network modeling is to use the network to design different intervention approaches for affecting the time evolution of the gene activity profile of the network. More specifically, one is interested in intervening to help the network avoid undesirable states such as those associated with a disease. This paper provides a tutorial survey of the intervention approaches developed so far in the literature for probabilistic gene networks (probabilistic Boolean networks) and outlines some of the open challenges that remain.
-
-
-
Accomplishments and Challenges in High Performance Computing for Computational Biology
Authors: Zhihua Du, Feng Lin and Bertil Schmidt
We review recent research and development in high performance computing (HPC) for computational biology and discuss the great challenges facing both biomedical scientists and IT professionals. During the last decades, research in the fields of molecular biology and biomedicine has provided the scientific community with a huge amount of data through sequencing, genome-wide annotation and gene expression profiling projects. The genetic databases have been growing exponentially, and sophisticated computer algorithms have been developed to cater for the needs of data mining, analysis and simulation. It is clear that the development of HPC technologies has become crucial for deploying software systems that tackle various bioinformatics problems. The goal of this article is to present the current research and our critical review of the construction of parallel and distributed computing systems, the design of multi-process algorithms, and the development of software systems for biocomputing tasks including sequence alignment, heuristic database searching, phylogenetic analysis and gene clustering. We also give a brief introduction to our work on the development of highly scalable and reproducible HPC algorithms and indicate the challenging problems in this context.
-
-
-
Mining Protein-Protein Interaction Data
Authors: Ryan J. Haasl and Jianwen Fang
The development of high-throughput technologies that expedite the discovery of interactions between proteins has made it possible to screen entire genomes and produce large protein-protein interaction (PPI) datasets. The availability of these datasets is now enabling researchers to perform PPI data mining activities of theoretical and practical importance, including prediction of novel PPIs and protein function, sub-cellular localization of proteins, and construction of reasonably realistic, proteome-wide PPI networks. Most newer methods of in silico PPI prediction hinge upon conserved sequence signatures discovered through the analysis of a large PPI dataset, although some methods attempt to improve predictive accuracy through the incorporation of additional biological information and/or multiple datasets. Though the protein interaction networks constructed to date do not provide a truly realistic picture of biological network mechanisms, they are functional in the sense that they have enabled researchers to test the reliability of high-throughput data, predict protein function, and localize proteins within the cell. All PPI data mining activities are constrained by the quantity and quality of the PPI data currently available. Consequently, the reliability of predictions based on PPI data is expected to increase as PPI databases increase in size and taxonomic range.
-
-
-
Computational and Statistical Methods to Explore the Various Dimensions of Protein Evolution
Predicting genes and gene regions undergoing adaptive evolution is one of the most important aims of geneticists and of newly emerging areas of investigation. As more genomes are sequenced and computational tools to detect selection are developed, the number of genes identified as being positively selected is overwhelming. Several statistical methods have been devised to test whether specific amino acid regions have undergone adaptive mutations at some stage during the protein's evolution. Despite the sensitivity of these methods in detecting selective constraints, they are still based on linear sequence alignments and therefore examine only one dimension of the protein's evolution. Few methods have been designed to detect intra-molecular co-evolution between amino acid sites. However, no tests are performed to determine the adaptive value of these co-evolutionary events. Conclusions independently derived from both types of methods are often ambiguous, since the evolution of protein sequences is most likely to be multifactorial. This review briefly discusses the advantages and disadvantages of the many different methods and computational tools for detecting adaptive evolution and co-evolution. Further, the potential that the combination of such methods has in providing more biologically meaningful results is highlighted.
-
-
-
Networks Everywhere? Some General Implications of an Emergent Metaphor
Authors: Maria C. Palumbo, Lorenzo Farina, Alfredo Colosimo, Kyaw Tun, Pawan K. Dhar and Alessandro Giuliani
The use of the term 'network' is more and more widespread in all fields of biology. It evokes a systemic approach to biological problems able to overcome the evident limitations of the strict reductionism of the past twenty years. The expectations produced by taking into consideration not only the single elements but also the intermingled 'web' of links connecting different parts of biological entities are huge. Nevertheless, we believe that a lack of awareness that networks, besides their biological 'likelihood', are modeling tools and not real entities could be detrimental to the exploitation of the full potential of this paradigm. In this mini review the basic concepts of network analysis are presented, together with the relationships linking the network approach to other more established modeling tools such as multivariate data analysis and differential equations. Some applications of network-based modeling of different biological phenomena are reported as well, and the specific advantages of adopting such strategies are stressed together with the inescapable limitations.
-
-
-
Towards a Phenotypic Semantic Web
The impact of the Internet on biology is undeniable. The next stage in the evolution of the Internet for biological and molecular resource discovery must be towards what has been described as a Semantic Web, where not only humans but also machines can make "biologically intelligent" decisions based on collections of authenticable assertions about biology and molecular sciences. This vision requires agreed common representations of data and metadata shared and processed by automated tools as well as by people. Ontologies have become an integral part of achieving this and have transformed biological resource management. In this review, we describe the necessary transition steps from the initial conception of the Internet to the realisation of the Semantic Web, using as an example its application in phenotypic information construction and delivery. We review the different parts of the Semantic Web, such as XML, metadata, RDF, OWL, digital signatures, ontologies and grids, whilst concentrating on how ontology is applied in biology and more specifically in phenotype annotation. Finally, we discuss how the Semantic Web will transform biological information management, retrieval and visualisation whilst ensuring the availability of high quality data of the correct type and format for the determination of model structures and biological systems.
-
-
-
Large Scale Protein Sequence Clustering - Not Solved But Solvable
By Antje Krause
Protein sequence clustering is one of the oldest problems addressed in the field of computational biology. Back in the 1960s, when the first protein sequence database was published as a printed version, Margaret Dayhoff defined the basic principles of this discipline with only a small number of sequences at hand. With up to a million sequences available in public databases nowadays and several well-known methods for automatically grouping proteins into biologically meaningful families, subfamilies and superfamilies, the problem seems to be satisfactorily solved. Nevertheless, apart from the problem of handling such a huge amount of data, several pitfalls have emerged since Dayhoff's time: databases fill up as fast as genomes are sequenced, and a great many of these sequences are fragmentary or disappear again when identified as transcripts of wrongly predicted genes or hypothetical products of pseudogenes. This article first reviews the different approaches developed during the last decades. These insights are then used to point out possible challenges waiting in the future.
-
-
-
Software Analysis of Two-Dimensional Electrophoretic Gels in Proteomic Experiments
Two-dimensional gel electrophoresis in combination with mass spectrometry constitutes the backbone of proteomic analysis. With the availability of powerful software tools addressing the specific needs of analyzing two-dimensional gels, several typical procedures have been elaborated. In the first part of this review, we describe and discuss the procedure for analyzing two-dimensional electrophoretic gels, consisting of (i) digitizing the gels, (ii) detecting and separating individual spots, (iii) background subtraction, (iv) creating a reference gel, (v) matching the spots to the reference gel, (vi) modifying the reference gel, (vii) normalizing the gel measurements for comparison, and (viii) calibrating for isoelectric point and molecular weight markers. In the next step, (ix) a database containing the measurement results is constructed and (x) data are compared by statistical and bioinformatic means. We compare the software currently available for performing these tasks in the light of recent benchmarking and standardization efforts. We also comment on the statistical means provided in the programs, including t-test statistics and ANOVA, and on additional software for comparing expression patterns in large gel datasets, including hierarchical clustering algorithms and self-organizing maps (SOMs).
-
