Volume 8, Issue 3

Current Bioinformatics - Volume 8, Issue 3, 2013

- Editorial (Hot Topic: Systematic Analysis of Biological Networks)
  
  Authors: Young-Rae Cho and Pietro H. Guzzi
  
  https://doi.org/10.2174/1574893611308030001

- Detection of Protein Complexes Using Hierarchical Link Clustering and Core-Attachment Structure
  
  Authors: Yinhai Liu, Chengjie Sun, Yang Yu, Lei Lin and Xiaolong Wang
  
  https://doi.org/10.2174/1574893611308030002
  
  Identifying protein complexes from protein-protein interaction (PPI) networks is an important issue in proteomics and bioinformatics, and various computational methods have been developed to solve it. In this paper, an approach called Hierarchical Link Clustering and Core-Attachment (HLC-CA) is proposed to detect protein complexes by integrating an HLC algorithm with the inherent core-attachment structure of protein complexes. Compared with other methods, HLC-CA has a low time complexity and few parameters to tune. HLC-CA comprises four steps. First, an HLC algorithm is used to obtain candidate clusters. Second, a density threshold is applied to filter the clusters in order to identify complex cores. Third, attachments are recruited to each core using a closeness measure. Finally, the cores chosen in the second step and their corresponding attachments are combined to form protein complexes. Evaluation results show that the proposed HLC-CA method outperforms most state-of-the-art methods.
  

- Metabolic Network Analysis: Current Status and Way Forward
  
  Authors: Shweta Kolhi and Ashok S. Kolaskar
  
  https://doi.org/10.2174/1574893611308030003
  
  One of the fundamental aims of life science is to gain insights into the functioning of an organism at the systems level. The generation and systematic storage of enormous amounts of biological data in the post-genomic era have made systems-level studies a reality. For such studies, metabolic pathway data inferred from the intricate interactions among genes, enzymes and proteins are best suited and are used extensively, as they represent dynamic interactions in an organism. Consequently, research in comparative metabolomics and metabolic network analysis is advancing rapidly. Although efforts to analyze the metabolome have increased in recent years, our knowledge of its design principles is very limited. Various methodologies to compare and align metabolic pathways have been put forth and are discussed in this review. Further, graph-theoretic approaches have been undertaken with the aim of unveiling the universal laws governing complex metabolic networks. New algorithms that negate the abstraction of earlier studies are the need of the hour. One such approach, termed “metabolic categorization”, which helps in understanding the functionality of each metabolic pathway at the systems level, is discussed in this review. Finally, extension of linguistic approaches from the genome and proteome to the metabolome is suggested in order to simplify the understanding of a living system.
  

- Protein Modules Detection Based on Subcellular Information
  
  Authors: Yang Yu, Lei Lin, Chengjie Sun, Xiaolong Wang and Xuan Wang
  
  https://doi.org/10.2174/1574893611308030004
  
  Detecting protein modules from protein-protein interaction networks is a hot topic in biological information processing. In this paper, we present a ranking strategy for deriving protein complexes that combines subcellular information with the topological information of the network. First, we obtain candidate clusters from the protein-protein interaction network using the competing methods and rank these clusters by link density calculated from the localization matrix. Second, in comparison with the four original methods, the experimental results demonstrate that our ranking strategy improves the performance of all four and is robust to all the similarity scores. Finally, integrating protein co-localization information can reduce the percentage of false positives, especially for protein complexes extracted from the protein-protein interaction network alone. Furthermore, detailed comparison with functional annotations illustrates and confirms the usefulness of the spatial information, and indicates that this strategy is helpful for finding functional modules.
  

- Mining of Network Markers for Brain Tumor from Transcriptome and Interactome Data
  
  By Jongkwang Kim
  
  https://doi.org/10.2174/1574893611308030005
  
  Glioblastoma multiforme (GBM; grade IV astrocytoma) is the most common but lethal form of brain cancer. The median survival time of GBM patients is only 15 months. Only a few predictive markers have been reported for prognosis and treatment. This study integrates gene expression and protein-protein interaction (PPI) data to search for pathways that are differentially regulated between long-term and short-term survivors of GBM. A novel objective function for greedy search was introduced to search for 47 significantly and differentially expressed sub-networks (SDES), or pathways. The resultant putative pathways (involving 156 genes) were tested for enrichment of known GBM cancer genes as well as GO terms related to “biological process.” Integrating the gene expression profiles of GBM patients with a PPI network improves the recall rate of known GBM driver genes and shows better GO enrichment than the conventional gene-set approach, which is based solely on expression data.
  

- On the Discovery of Cellular Subsystems in Gene Correlation Networks Using Measures of Centrality
  
  Authors: Kathryn M. Dempsey and Hesham H. Ali
  
  https://doi.org/10.2174/1574893611308030006
  
  Innovative models for analyzing high-throughput biological data are becoming highly significant in the post-genomic era. Correlation networks are rapidly becoming powerful models for representing various types of biological relationships, especially for extracting knowledge from gene expression data. Analyses using other popular network models in biology have revealed that structures within a graph model, such as high-degree nodes and cliques, often correspond to cellular functions. Correlation networks, which can be used to measure the relationships between patterns of gene expression, are capable of representing entire-genome expression assays. In this study we build correlation networks from gene expression datasets available in the public domain; once built, we are able to identify graph-theoretic structures (critical nodes and dense subgraphs) and use measures of centrality to infer the biological impact of these structures within the network. We go on to validate the link between network components (such as critical nodes and degrees) and the biological function of the model by exploring the biological properties of a set of nodes with high centrality measures in the correlation network. In addition, we use network integration to identify essential genes in an integrated correlation network obtained as the union of networks of mice of different age groups. By examining clusters connected by highly central nodes in this integrated network, we were able to find a set of essential genes and identify several cellular subsystems that point towards aging-related mechanisms. The results provide clear evidence that correlation networks are a powerful tool for analyzing temporal biological data and consequently make use of the wealth of gene expression assays currently available.
  

- Systematic Analysis of Interactomes in Sequence Properties Space
  
  By Maria Persico
  
  https://doi.org/10.2174/1574893611308030007
  
  A number of representations of protein networks have been reported. Further, while the existence of multiple types of interactomes and relationships between proteins has been accepted and discussed extensively, these concepts and hypotheses have not yet been extensively explored using machine learning frameworks for protein interaction prediction in a multi-class setting. Essentially, this is due to two reasons: missing values in features, and the heterogeneity and not always clear annotation of protein interaction data. This has motivated the attempt to build a set of universal features attributable to any set of protein pairs, generating a universal feature space where evolutionary constraints show their effects and play a central role. We have called this space the sequence properties space and the features generating it the derived features. We have probed an integrated version of the sequence properties space for its ability to properly represent the different kinds of available interactomes.
  

- Improving Functional Modules Discovery by Enriching Interaction Networks with Gene Profiles
  
  Authors: Saeed Salem, Rami Alroobi, Shadi Banitaan, Loqmane Seridi, Ibrahim Aljarah and James Brewer
  
  https://doi.org/10.2174/1574893611308030008
  
  Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on yeast and human interaction datasets, together with gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.
  

- Predicting False Positives of Protein-Protein Interaction Data by Semantic Similarity Measures
  
  Authors: George Montanez and Young-Rae Cho
  
  https://doi.org/10.2174/1574893611308030009
  
  Recent technical advances in identifying protein-protein interactions (PPIs) have generated genome-wide interaction data, collectively referred to as the interactome. These interaction data give insight into the underlying mechanisms of biological processes. However, the PPI data determined by experimental and computational methods include an extremely large number of false positives, which are not confirmed to occur in vivo. Filtering PPI data is thus a critical preprocessing step to improve analysis accuracy. This article proposes integrating Gene Ontology (GO) data to assess the reliability of PPIs. We evaluate the performance of various semantic similarity measures in terms of functional consistency. Protein pairs with high semantic similarity are considered highly likely to share common functions and are therefore more likely to interact. We also propose a combined semantic similarity method for predicting false positive PPIs. The experimental results show that the combined hybrid method performs better than the individual semantic similarity classifiers. The proposed classifier predicted that 58.6% of the S. cerevisiae PPIs from the BioGRID database are false positives.
  

- Semantic Similarities as Discriminative Features of Protein Complexes
  
  Authors: Pietro Hiram Guzzi, Marianna Milano, Pierangelo Veltri and Mario Cannataro
  
  https://doi.org/10.2174/1574893611308030010
  
  Biological data about genes, proteins and biologically relevant molecules that are stored in databases may be associated with biological information (knowledge) such as experiments, properties and functions, response to drugs, etc. Such knowledge is formally structured into ontologies, which provide the best formalism for organizing and storing knowledge. In the biological field, Gene Ontology (GO) provides both a categorization of annotating terms and a source of annotation for genes and proteins. Consequently, it is possible to introduce novel methodologies of analysis that are based on the use of ontologies. Recently, semantic similarities, i.e. the calculation of the similarity of two or more proteins starting from their annotations, have attracted growing interest. For instance, semantic measures have been used for the prediction of protein complexes. Despite the importance of this research, some problems remain unsolved: the assessment of semantic measures with respect to biological features, as well as an in-depth study of the impact of the chosen measure on the results obtained. This paper focuses on the use of semantic similarity measures in the protein complex prediction pipeline. To this end, we investigated whether there is a bias among different measures, and whether proteins that participate in the same complex show higher semantic similarity. Results confirm that proteins belonging to the same complex have higher average semantic similarity values than the proteome average. This confirms a possible use of semantic similarity measures within protein complex prediction algorithms and suggests a way to choose the best one among them.
  

- Fractal Analysis of Epithelial-Connective Tissue Interface in Basal Cell Carcinoma of the Skin
  
  Authors: Giorgio Bianciardi, Clelia Miracco, Stefano Lazzi and Pietro Luzi
  
  https://doi.org/10.2174/1574893611308030011
  
  This paper investigates the use of computerized fractal analysis for objective characterization of the complexity of the epithelial-connective tissue interface in basal cell carcinoma and the ability of the technique to quantitatively discriminate among different diagnostic categories. Tumor boundaries were extracted by means of image analysis. The fractal dimension was calculated using the box-counting method. The results showed that the shape of the boundaries between epithelium and stroma is significantly more complex in infiltrative high-risk tumors than in circumscribed low-risk ones (p<0.001), with 100% correct classifications. This study shows that computerized fractal analysis of the epithelial-connective tissue interface in basal cell carcinomas can provide an accurate, quantitative, inexpensive technique to help in tumor diagnosis.
  

- An In Silico Identification of Human Promoters: A Soft Computing Based Approach
  
  Authors: Sutapa Datta and Subhasis Mukhopadhyay
  
  https://doi.org/10.2174/1574893611308030012
  
  The promoter region of a eukaryotic gene sequence is very important, as it helps us to understand the mechanism of transcription regulation. Identifying this region is a complex problem because the signature for identification turns out to be fuzzy. Several in silico methods are available for identifying the promoter region, but scope for new methods still exists. Reasonable prediction of promoter sequences (which can be tested by comparison with wet-lab data) from a mixed database of promoters and non-promoters is thus a challenge that any new method has to face. In this communication we propose a composite method that clusters known promoter and non-promoter sequences into their respective clusters based on their relative distances, and then classifies the maximum similarity scores obtained between a group of new sequences and the clusters, to predict the true promoters among the new set of sequences. The in silico experiment is carried out on different databases that we constructed from the available primary sequence databanks to demonstrate the advantage of the proposed approach.
  

- Hidden Markov Model for Splicing Junction Sites Identification in DNA Sequences
  
  Authors: Srabanti Maji and Deepak Garg
  
  https://doi.org/10.2174/1574893611308030013
  
  Identification of coding sequences in genomic DNA is the major step in gene identification. In eukaryotic organisms, gene structure consists of a promoter, introns, a start codon, exons, a stop codon, etc., and identifying it requires accurate labeling of these segments. A splice site is the ‘separation’ between an exon and an intron; its prediction accuracy is generally lower than 90%, even though the sequences adjacent to splice sites are highly conserved. Because the accuracy of splice site recognition is not yet satisfactory, much attention has been paid to improving prediction accuracy, and improving the algorithms used is essential. In this manuscript, a Hidden Markov Model (HMM) based splice site predictor is developed and trained using the Modified Expectation Maximization (MEM) algorithm. A 12-fold cross-validation technique is also applied to check the reproducibility of the results and to further increase the prediction accuracy. The proposed system achieves an accuracy of 98% for true donor sites and 93% for true acceptor sites in standard DNA (nucleotide) sequences.
  

- A New Integration-Centric Algorithm of Identifying Essential Proteins Based on Topology Structure of Protein-Protein Interaction Network and Complex Information
  
  Authors: Jiawei Luo and Ling Ma
  
  https://doi.org/10.2174/1574893611308030014
  
  Essential proteins are necessary for the survival and development of an organism. Many computational approaches have been proposed for predicting essential proteins based on protein-protein interaction (PPI) networks. In this paper, we propose a new centrality algorithm for identifying essential proteins, named the CSC algorithm. The CSC algorithm integrates the topological characteristics of the PPI network with the in-degree of proteins in complexes. We use the CSC algorithm to identify the essential proteins in the PPI network of Saccharomyces cerevisiae. The results show that the ratio of essential proteins identified by the CSC algorithm is higher than that of ten other centrality methods: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC) and PeC. In particular, the identification accuracy of the CSC algorithm is more than 40% higher than that of the six classic centrality measures (DC, BC, CC, SC, EC, IC).
  

