Current Bioinformatics - Volume 12, Issue 1, 2017
-
A Systems Biology Perspective on Molecular Cytogenetics
Authors: Henry H.Q. Heng and Sarah Regan
Background: With the rapid progress of various large-scale sequencing and -omics technologies, molecular cytogenetics is entering a new era: systems-integrated cytogenetics. Since the genome context rather than gene content defines the genome system by providing system inheritance, and genomic inheritance is less precise than expected, the new cytogenetics will play an important role in advancing systems biology. Objective: In this perspective, we briefly review some of the progress and limitations of gene- or pathway-focused system approaches, including bioinformatics, and call for using the genome theory to integrate molecular cytogenetics and systems biology. Method: By highlighting some recent developments in cytogenetics/cytogenomics and systems biology, we try to synthesize the newly emerging field of systems-integrated cytogenetics. Results: We explain why cytogenetics/cytogenomics needs a systems biology perspective, and what the major challenges and future directions are for the field of cytogenetics. Conclusion: Future cytogenetics needs a new genome-based conceptual framework such as the genome theory. The integration of cytogenetics/cytogenomics with systems biology will benefit both fields.
-
First Molecular Cytogenetic Characterization of Murine Malignant Mesothelioma Cell Line AE17 and In Silico Translation to the Human Genome
Background: Mesothelial cells can undergo malignant transformation, e.g., due to previous asbestos exposure, and form a clinically aggressive tumor called malignant mesothelioma (MM). There is extensive ongoing research in MM with the goal of identifying prognostic factors and new therapeutic targets. Accordingly, well-studied model systems, such as cell lines, are urgently needed. Murine MM cell line models were established in the 1990s; however, they have not yet been genetically characterized in any detail. Objective: To provide the first genetic characterization of the murine MM cell line AE17, translate the findings to the human genome, and determine the human MM subtype for which AE17 is suited as a model. Method: AE17 was studied at the chromosomal level by molecular cytogenetics and array comparative genomic hybridization. Results and Conclusion: AE17 has not yet tetraploidized and has a basic karyotype of 40 chromosomes with only three balanced inversions, one balanced translocation and five chromosomes with simple to complex rearrangements, leading in the end to partial chromosomal deletions. Besides one stemline, three additional subclones were observed. The detected imbalances and observed chromosomal breakpoints were translated to the human genome by bioinformatics-based in silico translation. The obtained data suggest that AE17 is a well-suited cell line model for MM, with (cyto)genetic changes characteristic of the sarcomatoid form of MM. Furthermore, the genes ESR2 and BAK1 seem to be activated in one of the subclones of AE17, which could also be of interest for future studies. Overall, the present data could only be obtained through bioinformatics-based in silico analyses, both to handle the microarray data and for browser-based translation of the murine genome to the human genome.
-
Neurogenomic Pathway of Autism Spectrum Disorders: Linking Germline and Somatic Mutations to Genetic-Environmental Interactions
Authors: Svetlana G. Vorsanova, Yuri B. Yurov and Ivan Y. Iourov
Bioinformatic approaches have been extensively applied in genetic and genomic studies of autism spectrum disorders (ASD). However, since this disorder has a distinct albeit complex genetic etiology, these studies have generated data sets requiring additional empirical and theoretical evaluations of molecular and cellular pathways. Additionally, genetic-environmental interactions in ASD are poorly understood. To prioritize genomic variations, molecular/cellular processes and environmental factors, systems biology approaches should be applied. Here, we present a molecular cytogenomic and somatic genomic view on a possible "ASD pathway" to address genetic-environmental interactions in ASD. Taking into account the relevance of these considerations for brain diseases as a whole, we propose a multi-hit hypothesis to explain the complex nature of interactions between germline mutations, somatic genomic variations (i.e., aneuploidy, genome and chromosome instability) and environment in neuropsychiatric disorders. Using different bioinformatic methods for gene prioritization and analyzing candidate processes for brain dysfunction, it becomes possible to place personalized genomic data in a systems (neuro)biology context. Multidimensional omics data are therefore a target for advanced bioinformatics studies, which can clarify the biological mechanisms of specific genetic changes in ASD and enhance the potential for new therapeutic concepts.
-
Network-Based Classification of Molecular Cytogenetic Data
Authors: Yuri B. Yurov, Svetlana G. Vorsanova and Ivan Y. Iourov
With the developments in molecular cytogenetics, it has become evident that correct interpretation of molecular cytogenetic data requires the application of bioinformatics. Furthermore, in silico analysis of chromosome structural and functional variability has been shown to increase the potential of a molecular cytogenetic study. Using systems biology approaches to process data on genome variations or chromosome abnormalities, one can gain further insights into molecular and cellular processes in health and disease. A key approach for in silico (bioinformatic) molecular cytogenetics might be the network-based classification of data obtained by uncovering genomic changes at the chromosomal (subchromosomal) level. This technology provides interpretation of genomic imbalances by prioritizing the genes and processes involved in the phenotype of a genetic disease. Here, we discuss network-based classification of cytogenetic data in the light of uncovering genetic mechanisms of human diseases in the post-genomic era. Additionally, omics technologies are addressed in the context of chromosome biology. Accordingly, bioinformatic evaluation of genome rearrangements or chromosome imbalances using genome, transcriptome, proteome (interactome) and metabolome databases is viewed as an important tool for current molecular cytogenetics. Taking into account that bioinformatics has only recently been introduced in molecular cytogenetics, we discuss new opportunities offered by in silico analyses for chromosome biology and medical cytogenetics.
-
LincRNAs: Systemic Computational Identification and Functional Exploration
Authors: Hanyang Hu, Kitchener D. Wilson, Shan Zhong and Chunjiang He
Background: The mammalian genome is pervasively transcribed and produces a large number of non-coding products compared to protein-coding genes. One novel sub-class of non-coding RNAs, long intergenic non-coding RNAs (lincRNAs), has been identified in mammalian genomes and is thought to play multiple roles in gene regulation and other cellular processes, and even in human disease. Objective: Here, we describe the most up-to-date computational and experimental methods for identifying genome-wide mammalian lincRNAs from multiple high-throughput sequencing data sets, as well as the subsequent large-scale functional prediction and verification methods for lincRNAs. Furthermore, we discuss several novel approaches that could be useful for lincRNA research in the future. Conclusion: We provide a global view of methods for identifying lincRNAs and of procedures for further functional research on those lincRNAs.
-
Structural Key Genes: Differentiating Lung Squamous Cell Carcinomas from Adenocarcinomas
Authors: Yansen Su, Zheng Zhang and Linqiang Pan
Background: Adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two most common subtypes of non-small cell lung carcinoma (NSCLC), and their treatments differ considerably. Traditional morphological procedures cannot effectively distinguish AC from SCC because their cells are morphologically similar. Objective: It is necessary to identify the genes that can effectively discriminate AC from SCC at the molecular level. Method: In this work, we apply the context likelihood of relatedness (CLR) algorithm to gene expression values to infer AC and SCC networks, respectively. We calculate the values of four centrality measures (the average degree, the average clustering coefficient, the average betweenness and the average coritivity) on both AC and SCC networks. The structural key genes are defined as the genes that contribute most to the topological changes between the two gene networks. Results: We find that the values of the average degree and the average coritivity of AC networks are much smaller than those of SCC networks. The degree and the coritivity are therefore considered effective measures for selecting structural key genes. We obtain 18 structural key genes, five of which have previously been identified as markers to distinguish between AC and SCC. Conclusion: Our results show that the structural key genes found by these effective measures may be used to distinguish the subtypes of NSCLC. The current method could be extended to other complex diseases for distinguishing subtypes and detecting molecular targets for targeted therapy.
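Two of the four network measures named in this abstract, the average degree and the average clustering coefficient, can be illustrated on a toy graph. The following is a minimal sketch using only the Python standard library; the gene names and edges are invented for illustration, and nothing here reproduces the authors' CLR-inferred networks or their coritivity measure.

```python
from itertools import combinations


def average_degree(adj):
    """Mean number of neighbours per node in an undirected graph."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention: nodes with fewer than two neighbours score 0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


# Hypothetical toy network: a triangle g1-g2-g3 plus a pendant node g4.
toy_network = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}

print(average_degree(toy_network))
print(average_clustering(toy_network))
```

Comparing such averages between two networks built on the same gene set is one simple way to see how topology differs between conditions.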
-
Using Quadratic Discriminant Analysis to Predict Protein Secondary Structure Based on Chemical Shifts
Authors: Li Z. Yuan, Feng Yong E, Zhao Wei and Kou G. Shan
Background: Prediction of protein three-dimensional structure is one of the most important and active topics in bioinformatics. Predicting the secondary structure of a protein from its amino acid sequence is an important step towards predicting its three-dimensional structure. Many approaches have been proposed for the prediction of protein secondary structure and have yielded improved results. However, these algorithms were primarily based on features of the amino acid sequences. Objective: In this paper, we introduce a new model for predicting the secondary structure of proteins. Method: We used chemical shifts as a novel feature and combined them with the quadratic discriminant analysis method to predict the secondary structure of proteins. Results: A three-state overall prediction accuracy of 85.7% was obtained in the ten-fold cross-validated test, and the accuracies for alpha helices, beta strands and coils reached 95.2%, 83.7% and 77.8%, respectively. Moreover, to determine the importance of the chemical shifts of the six nuclei, we left out one nucleus at a time and used the remaining five nuclei as features. The results showed that the chemical shift of each nucleus plays a different role in the prediction of protein secondary structure, and the maximum overall accuracy reached 87.3% (Q3) when using C, Cα, Cβ, Hα and N as features. Conclusion: Our model outperformed other state-of-the-art methods in terms of predictive accuracy. Our results showed that quadratic discriminant analysis using chemical shifts as features is indeed a good choice for predicting protein secondary structure.
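The three-state overall accuracy (Q3) reported here is the fraction of residues whose predicted state matches the observed one. The sketch below shows that calculation together with a per-state accuracy. It assumes the usual H/E/C (helix/strand/coil) encoding and defines per-state accuracy as the share of observed residues of a state that were predicted correctly; both are assumptions, since the abstract does not spell them out, and the example strings are invented.

```python
def q3_accuracy(observed, predicted):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def per_state_accuracy(observed, predicted, states="HEC"):
    """For each state, the share of observed residues predicted correctly.

    Returns None for a state that never occurs in `observed`.
    """
    result = {}
    for state in states:
        total = sum(1 for o in observed if o == state)
        hits = sum(
            1 for o, p in zip(observed, predicted) if o == state and p == state
        )
        result[state] = hits / total if total else None
    return result


# Hypothetical ten-residue example: one strand residue is mispredicted as coil.
observed = "HHHHEEEECC"
predicted = "HHHHEEECCC"

print(q3_accuracy(observed, predicted))
print(per_state_accuracy(observed, predicted))
```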
-
Improved Algorithm for the Location of CpG Islands in Genomic Sequences Using Discrete Wavelet Transforms
Authors: Inbamalar Tharcis Mariapushpam and Sivakumar Rajagopal
Background: Genomic sequences can be expressed as strings over an alphabet and are therefore discrete in nature. Digital techniques for analyzing genetic problems are thus needed. Objective: The main aim is to use digital signal processing techniques for the detection of CpG islands. Method: A method to detect CpG islands using wavelet filtering is proposed. A modified Electron Ion Interaction Potential mapping is proposed for numeric conversion. The signal is restricted in frequency by a band-pass filter and then wavelet filtered. CpG islands produce larger-magnitude coefficients in the wavelet domain. Results: The proposed method has been tested on genomes of Homo sapiens, Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Escherichia coli, zebrafish, Arabidopsis thaliana and Mus musculus, downloaded from the National Center for Biotechnology Information (NCBI) database. Standard performance metrics have been evaluated, and the values obtained are sensitivity 84.68%, specificity 85.6%, accuracy 82.22% and correlation coefficient 63.31%. Conclusion: Comparing the evaluation metrics obtained with those of methods in the literature, the wavelet transformation method is found to perform better. The area under the receiver operating characteristic curve has also been evaluated and is 0.8705, which is larger than that of the methods in the literature. Hence, it can be concluded that the proposed method is efficient in detecting CpG islands.
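The first two steps of this kind of pipeline, mapping nucleotides to numbers and taking a discrete wavelet transform, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the commonly cited standard EIIP values rather than the authors' modified mapping (which the abstract does not give), a single-level Haar transform as a stand-in for whichever wavelet the authors chose, and it omits the band-pass filtering step. The function names and the input sequence are hypothetical.

```python
import math

# Standard EIIP values per nucleotide (assumption: the paper uses a
# *modified* mapping whose values are not stated in the abstract).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_signal(sequence):
    """Convert a DNA string into a numeric signal via the EIIP table."""
    return [EIIP[base] for base in sequence.upper()]


def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


# Hypothetical short sequence, chosen only to exercise the functions.
signal = eiip_signal("ACGTCGCGCGAT")
approx, detail = haar_dwt(signal)
print([round(abs(d), 4) for d in detail])
```

In a real detector, the magnitudes of such coefficients would be thresholded over sliding windows; the threshold and window size are design choices not covered by this sketch.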
-
An Effective Method for Identifying Functional Modules in Dynamic PPI Networks
Authors: Jiawei Luo and Chengchen Liu
Background: Identifying functional modules (FM) in Protein-Protein Interaction (PPI) networks is essential for understanding the organization and evolution of cellular systems. Most current functional module discovery algorithms focus only on the static PPI network. However, PPI networks are dynamic over time and vary under different conditions. Objective: Discovering functional modules in dynamic PPI networks (DPN) is therefore crucial. In this paper, a functional module is defined as the union of a time-line of evolutionary step-modules. A novel StableCore and Adaptive Incremental Algorithm (SCAIA) is developed to discover functional modules in DPN. Method: SCAIA first detects static step-modules of the first subnetwork and adaptively updates the modular structure of the other subnetworks, and then identifies functional modules and their evolutionary trends based on the extracted step-modules of each subnetwork. Results: Extensive results show that SCAIA achieves very satisfactory Precision, F-measure and P-value results among the seven functional module discovery algorithms compared in this study. Conclusion: SCAIA performs significantly better than seven methods at discovering accurate and stable functional modules. SCAIA can also track the evolutionary process of functional modules over time, providing insights into the underlying behavior of functional modules for future biological studies.
-
GPCRTOP v.1.0: One-Step Web Server for Both Predicting Helical Transmembrane Segments and Identifying G Protein-Coupled Receptors
Authors: Babak Sokouti, Farshad Rezvan and Siavoush Dastmalchi
Background: G protein-coupled receptors (GPCRs) are a large superfamily of membrane proteins, and because of the difficulties in experimentally determining their structures, computational approaches are essential. Objective: GPCRTOP v.1.0 is an HMM-based web server developed for predicting helical transmembrane (TM) segments and identifying GPCRs based on amino acid distribution patterns. The performance of the method was evaluated in comparison with other general TM prediction methodologies. Methods: 49093 unannotated human protein sequences were retrieved from TrEMBL-SwissProt. The InterPro database was used to find the GPCR sequences in common with those predicted by GPCRTOP v.1.0. The sequences not in common were analysed with ten well-known TM predictors. Results: 199 sequences were predicted as GPCRs by GPCRTOP v.1.0, whereas there were 182 GPCR sequences in the InterPro database. Among these sequences, 104 were identified as GPCRs by both GPCRTOP v.1.0 and the InterPro database. The remaining sequences were then predicted by general TM predictors, whose results showed 11.1% more agreement with GPCRTOP v.1.0 than with the InterPro database. Conclusion: GPCRTOP v.1.0 is useful for identifying GPCRs and determining their topologies, with an overall accuracy of ~99%. Here, we also announce the web availability of GPCRTOP v.1.0 (http://gpcrtop.tbzmed.ac.ir/services.aspx) and describe its prediction features, which include protein type (i.e., GPCR or non-GPCR), number of TM segments, and the topology of the predicted GPCR.
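The counts in the Results imply how many sequences each source flagged on its own. The short sketch below works through that arithmetic from the three reported numbers (199, 182 and 104). The helper name is hypothetical, and the Jaccard index it returns is added here purely as an illustrative overlap measure; it is not a statistic reported in the abstract.

```python
def overlap_summary(n_a, n_b, n_both):
    """Derive exclusive counts and overlap from two set sizes and their intersection."""
    if n_both > min(n_a, n_b):
        raise ValueError("intersection cannot exceed either set size")
    union = n_a + n_b - n_both
    return {
        "only_a": n_a - n_both,     # flagged by the first source only
        "only_b": n_b - n_both,     # flagged by the second source only
        "union": union,             # flagged by at least one source
        "jaccard": n_both / union,  # illustrative overlap measure
    }


# Counts as reported: GPCRTOP v.1.0 = 199, InterPro = 182, both = 104.
summary = overlap_summary(199, 182, 104)
print(summary)
```

With these inputs, 95 sequences are predicted only by GPCRTOP v.1.0 and 78 appear only in InterPro, so 173 sequences in total are the ones left for the general TM predictors to arbitrate.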
