Volume 2, Issue 2

Current Bioinformatics - Volume 2, Issue 2, 2007

- Advances in Exploration of Machine Learning Methods for Predicting Functional Class and Interaction Profiles of Proteins and Peptides Irrespective of Sequence Homology
  
  Authors: Juan Cui, Lianyi Han, Honghuang Lin, Zhiqun Tang, Zhiliang Ji, Zhiwei Cao, Yixue Li and Yuzong Chen
  
  https://doi.org/10.2174/157489307780618222
  
  Various computational methods have been used for predicting protein function from clues contained in the protein sequence. A particular challenge is the functional prediction of proteins that show low or no sequence similarity to proteins of known function. Recently, machine learning methods have been explored for predicting the functional class of proteins from a variety of sequence-derived structural and physicochemical properties independent of sequence similarity, and they have shown promising potential for a broad spectrum of proteins, including those that show low or no similarity to other proteins. These methods can thus be explored as potential tools to complement similarity-based, clustering-based and structure-based methods for predicting protein function. This article reviews the strategies, algorithms, current progress, available software and web servers, and underlying difficulties in using machine learning methods for predicting the functional class of proteins and peptides, and protein-protein interactions. The reported prediction performance of these methods is also presented.

- A Decade of Computing to Traverse the Labyrinth of Protein Domains
  
  By Rajani R. Joshi
  
  https://doi.org/10.2174/157489307780618213
  
  Detection and characterization of the structural domains of a protein are crucial for determining its tertiary structure, elucidating its functions, and designing and producing its biologically active analogs. Identification of domain segments at the sequence level is also important in deciphering protein structural genomics and in evolutionary studies. The diversity of domain folds and sequences and the high structural flexibility of the inter-domain linker regions pose great challenges for determining multi-domain protein structures, even from X-ray crystallographic or NMR spectroscopic data or by homology modeling. The problems multiply in the absence of any such data or sequence homologies. Interestingly, though, identification of protein domains is a unique research problem where ab initio computational investigations supersede the experimental ones or offer better applications of the latter. Advances in bioinformatics and computational biology in post-genomic research have led to a plethora of approaches, algorithms and web servers for predicting protein domains using 3D coordinates, partial structural information including secondary structure, or only the primary sequence. Here we assess the state-of-the-art developments in the field. Trend-setting as well as widely used computational methods and web servers/databases are reviewed with a focus on their applicability, novelty and strength in mining the multiple features of sequence/structure that contribute to the formation, distinction and diversity of protein domains. Future possibilities of a unified system with optimal decision support are highlighted.

- Gene Set Enrichment Analysis (GSEA) for Interpreting Gene Expression Profiles
  
  Authors: Jing Shi and Michael G. Walker
  
  https://doi.org/10.2174/157489307780618231
  
  Gene set enrichment analysis (GSEA) is a statistical method to determine whether predefined sets of genes are differentially expressed in different phenotypes. Predefined gene sets may be genes in a known metabolic pathway, located in the same cytogenetic band, sharing the same Gene Ontology category, or any user-defined set. In microarray experiments where no single gene shows statistically significant differential expression between phenotypes, GSEA has identified significantly differentially expressed sets of genes, even where the average difference in expression between two phenotypes is only 20% for genes in the gene set. The gene set identified in the first GSEA study (oxidative phosphorylation genes differentially expressed in diabetic versus non-diabetic patients) was subsequently confirmed by independent laboratory studies published in the New England Journal of Medicine. Since the first paper on GSEA was published, many extensions and alternative methods have been described in the literature. In this paper, we describe the original GSEA algorithm, subsequent extensions and alternatives, results of some of the applications, some limitations of the methods and caveats for users, and possible future research directions. GSEA and related methods are complementary to conventional single-gene methods. Single-gene methods work best when individual genes have large effects and there is small variance within the phenotype. GSEA is likely to be more powerful than conventional single-gene methods for studying the large number of common diseases in which many genes each make subtle contributions. It is a tool that deserves to be in the toolbox of bioinformatics practitioners.
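
The core idea behind GSEA can be illustrated with a minimal sketch. It assumes the genes have already been ranked by their association with the phenotype and computes an unweighted, Kolmogorov-Smirnov-like running-sum enrichment score. The function name `enrichment_score` and the gene labels are hypothetical, and this is a simplification for illustration, not the algorithm as specified in the reviewed paper.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (illustrative sketch).

    ranked_genes: gene names ordered from most up-regulated to most
                  down-regulated between the two phenotypes.
    gene_set:     the predefined set of genes to test.
    """
    members = set(gene_set)
    n_hits = sum(1 for gene in ranked_genes if gene in members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_step = 1.0 / n_hits    # step up when a set member is encountered
    miss_step = 1.0 / n_miss   # step down otherwise

    running = 0.0
    best = 0.0                 # signed maximum deviation from zero
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: the set members sit at the top of the ranking.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranked, {"g1", "g2"})
bottom_score = enrichment_score(ranked, {"g5", "g6"})
```

A score near +1 or -1 indicates that the set members cluster at one end of the ranking. In practice the significance of such a score is typically assessed against a null distribution obtained by permutation, which this sketch omits.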

- Inference of Gene Regulatory Networks and its Validation
  
  By Fang-Xiang Wu
  
  https://doi.org/10.2174/157489307780618240
  
  Genes encode proteins, some of which in turn regulate other genes. Such interactions make up a gene regulatory network. Understanding and unraveling gene regulatory networks has proven very useful in disease diagnosis and genomic drug design. Because of the complexity of gene regulatory networks, a complete understanding of their dynamics is difficult to achieve through biological experiments alone, without computational aids. As a consequence, computational models for gene regulatory networks are indispensable. Recently, a wide variety of computational models have been proposed for inferring gene regulatory networks. This paper surveys some of the computational models for inferring large gene regulatory networks, in particular the Boolean network model, differential/difference equation models, and state-space models. Some advantages and disadvantages of these models are discussed. Some criteria for validating the inferred gene regulatory networks are also discussed from the bioinformatics perspective. Finally, several directions for future work on modeling gene regulatory networks are proposed.
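
As a rough illustration of the simplest model class named above, the following sketch simulates a small Boolean network with synchronous updates. The three genes `A`, `B`, `C` and their rules are hypothetical and chosen only to show the mechanics; they are not taken from the paper.

```python
def step(state, rules):
    """Synchronously update every gene from the current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trajectory(state, rules, max_steps=20):
    """Iterate until a state repeats; return the visited states in order."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state, rules)
    return seen


# Hypothetical three-gene negative feedback loop.
rules = {
    "A": lambda s: not s["C"],  # C represses A
    "B": lambda s: s["A"],      # A activates B
    "C": lambda s: s["B"],      # B activates C
}

start = {"A": True, "B": False, "C": False}
states = trajectory(start, rules)
```

Each gene is either on or off, and the next state depends only on the current one, so every trajectory eventually enters a repeating cycle or a fixed point. Inferring a network then amounts to finding rules consistent with observed state transitions, which this sketch does not attempt.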

- Spectral Estimation Techniques for DNA Sequence and Microarray Data Analysis
  
  Authors: Hong Yan and Tuan D. Pham
  
  https://doi.org/10.2174/157489307780618259
  
  Spectral estimation techniques are widely used in modern signal processing systems. Recently, they have found important applications to the analysis of DNA data. In this paper, we review parametric and non-parametric spectral estimation methods for DNA sequence and microarray data analysis. The discrete Fourier transform (DFT) is the most commonly used technique for spectral analysis of digital signals. It can reveal the gene locations in a DNA sequence. The DFT can also be used to detect repetitive elements in a DNA sequence. The DFT produces the so-called windowing or data truncation artifacts when it is applied to a short data segment. Parametric spectral estimation methods, such as the autoregressive (AR) model, overcome this problem and can be used to obtain a high-resolution spectrum of the input signal. In this paper, we demonstrate the advantages of the AR model for the identification of protein coding regions and the detection of DNA repeats. We also review DFT and AR models and other spectral estimation techniques for the analysis of microarray time series data.
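
One common way the DFT is used to flag candidate coding regions is the period-3 property: coding sequences tend to show a spectral peak at one third of the sampling frequency. The sketch below assumes that approach, which the abstract does not spell out, and evaluates the Fourier sum at that single frequency for the four base indicator sequences. The function name `period3_power` and the toy sequences are hypothetical.

```python
import cmath


def period3_power(seq):
    """Normalized spectral power at frequency 1/3 (illustrative sketch).

    Each base (A, C, G, T) is mapped to a 0/1 indicator sequence, the
    Fourier sum at frequency 1/3 is computed for each, and the squared
    magnitudes are added and divided by the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    total = 0.0
    for base in "ACGT":
        # Sum of exp(-2*pi*i*pos/3) over positions where this base occurs.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * pos / 3)
            for pos, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total / n


# Toy sequences: one with a strict period of 3, one with a period of 4.
periodic3 = period3_power("ATG" * 12)
periodic4 = period3_power("ACGT" * 9)
```

In practice this quantity would be computed in a sliding window along the sequence, and short windows are where the truncation artifacts mentioned above become a concern.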
