Current Genomics - Volume 10, Issue 6, 2009
-
-
Editorial [Hot topic: Genomic Signal Processing: Part 1 (Guest Editors: E.R. Dougherty, X. Cai, Y. Huang, S. Kim and R. Yamaguchi)]
Authors: E. R. Dougherty, X. Cai, Y. Huang, S. Kim and R. Yamaguchi
Genomic Signal Processing (GSP) has been defined as the analysis, processing, and use of genomic signals for gaining biological knowledge and the translation of that knowledge into systems-based applications, where by genomic signals we mean the measurable events, principally the production of mRNA and protein carried out within the cell. Owing to the defining role of DNA in the production of mRNA, the structural characterization of DNA is inevitably a part of GSP and, interestingly, signal processing methods are utilized in understanding DNA structure. A key goal of translational genomics is to discover families of genes or gene products that can be used to classify disease, thereby leading to molecular-based diagnosis and prognosis. A deeper goal is to characterize genomic and proteomic regulation, thereby leading to a functional understanding of disease and the development of systems-based medical solutions. GSP is growing in importance as an ever larger community is recognizing that accomplishing these goals requires various disciplines within or related to signal processing, including pattern recognition, prediction/estimation theory, information theory, dynamical systems, control theory, network modeling, and communication theory. In sum, systems biology and systems medicine demand deep understanding of systems theory. This inevitably entails the theory and methods of signal processing, which have been so successful in areas such as communications, and the related theory pertaining to the characterization and control of dynamical systems, without which one cannot even imagine our contemporary technological society. The purpose of this special issue is to bring some of the key developments in GSP to the wider genomics community.
Owing to its grounding in systems theory and stochastic processes, GSP often requires mathematics beyond the level of that studied in undergraduate electrical engineering, or even undergraduate mathematics and statistics, and, therefore, as originally published in the scientific literature, is not accessible to many researchers in biology and medicine. Because systems biology and systems medicine will, ipso facto, have to rely on mathematical systems theory, this dichotomy is a problem that will have to be addressed in the future, from both educational and research perspectives; nonetheless, in a review format it is possible to communicate many of the basic ideas without recourse to the kind of full rigorous mathematical analyses required in original research.
-
-
-
Performance of Feature Selection Methods
Authors: Edward R. Dougherty, Jianping Hua and Chao Sima
High-throughput biological technologies offer the promise of finding feature sets to serve as biomarkers for medical applications; however, the sheer number of potential features (genes, proteins, etc.) means that there needs to be massive feature selection, far greater than that envisioned in the classical literature. This paper considers performance analysis for feature-selection algorithms from two fundamental perspectives: How does the classification accuracy achieved with a selected feature set compare to the accuracy when the best feature set is used and what is the optimal number of features that should be used? The criteria manifest themselves in several issues that need to be considered when examining the efficacy of a feature-selection algorithm: (1) the correlation between the classifier errors for the selected feature set and the theoretically best feature set; (2) the regressions of the aforementioned errors upon one another; (3) the peaking phenomenon, that is, the effect of sample size on feature selection; and (4) the analysis of feature selection in the framework of high-dimensional models corresponding to high-throughput data.
-
-
-
Boolean Models of Genomic Regulatory Networks: Reduction Mappings, Inference, and External Control
Author: Ivan Ivanov
Computational modeling of genomic regulation has been an important focus of systems biology and genomic signal processing for the past several years. It holds the promise to uncover both the structure and dynamical properties of the complex gene, protein or metabolic networks responsible for cell functioning in various contexts and regimes. This, in turn, will lead to the development of optimal intervention strategies for prevention and control of disease. At the same time, constructing such computational models faces several challenges. High complexity is one of the major impediments to the practical application of the models. Thus, reducing the size/complexity of a model becomes a critical issue in problems such as model selection, construction of tractable subnetwork models, and control of its dynamical behavior. We focus on the reduction problem in the context of two specific models of genomic regulation: Boolean networks with perturbation (BNP) and probabilistic Boolean networks (PBN). We also compare and draw a parallel between the reduction problem and two other important problems of computational modeling of genomic networks: the problem of network inference and the problem of designing external control policies for intervention/altering the dynamics of the model.
-
-
-
Review of Peak Detection Algorithms in Liquid-Chromatography-Mass Spectrometry
Authors: Jianqiu Zhang, Elias Gonzalez, Travis Hestilow, William Haskins and Yufei Huang
In this review, we will discuss peak detection in Liquid-Chromatography-Mass Spectrometry (LC/MS) from a signal processing perspective. A brief introduction to LC/MS is followed by a description of the major processing steps in LC/MS. Specifically, the problem of peak detection is formulated and various peak detection algorithms are described and compared.
-
-
-
Hidden Markov Models and their Applications in Biological Sequence Analysis
Hidden Markov models (HMMs) have been extensively used in biological sequence analysis. In this paper, we give a tutorial review of HMMs and their applications in a variety of problems in molecular biology. We especially focus on three types of HMMs: the profile-HMMs, pair-HMMs, and context-sensitive HMMs. We show how these HMMs can be used to solve various sequence analysis problems, such as pairwise and multiple sequence alignments, gene annotation, classification, similarity search, and many others.
-
-
-
Inference of Gene Regulatory Networks Using Time-Series Data: A Survey
Authors: Chao Sima, Jianping Hua and Sungwon Jung
The advent of high-throughput technology such as microarrays has provided a platform for studying how different cellular components work together, thus creating enormous interest in mathematically modeling biological networks, particularly gene regulatory networks (GRNs). Of particular interest is modeling and inference on time-series data, which capture a more thorough picture of the system than non-temporal data do. We give an extensive review of methodologies that have been used on time-series data. Recognizing that validation is an inseparable part of the inference paradigm, we also present a discussion of the principles and challenges in performance evaluation of different methods. This survey gives a panoramic view of these topics, in anticipation that readers will be inspired to improve and/or expand the GRN inference and validation tool repository.
-
-
-
Clustering Algorithms: On Learning, Validation, Performance, and Applications to Genomics
Authors: Lori Dalton, Virginia Ballarin and Marcel Brun
The development of microarray technology has enabled scientists to measure the expression of thousands of genes simultaneously, resulting in a surge of interest in several disciplines throughout biology and medicine. While data clustering has been used for decades in image processing and pattern recognition, in recent years it has joined this wave of activity as a popular technique to analyze microarrays. To illustrate its application to genomics, clustering applied to genes from a set of microarray data groups together those genes whose expression levels exhibit similar behavior throughout the samples, and when applied to samples it offers the potential to discriminate pathologies based on their differential patterns of gene expression. Although clustering has now been used for many years in the context of gene expression microarrays, it has remained highly problematic. The choice of a clustering algorithm and validation index is not a trivial one, more so when applying them to high throughput biological or medical data. Factors to consider when choosing an algorithm include the nature of the application, the characteristics of the objects to be analyzed, the expected number and shape of the clusters, and the complexity of the problem versus computational power available. In some cases a very simple algorithm may be appropriate to tackle a problem, but many situations may require a more complex and powerful algorithm better suited for the job at hand. In this paper, we will cover the theoretical aspects of clustering, including error and learning, followed by an overview of popular clustering algorithms and classical validation indices. We also discuss the relative performance of these algorithms and indices and conclude with examples of the application of clustering to computational biology.