Current Bioinformatics - Volume 7, Issue 3, 2012
-
-
Towards an Ontology to Support Semantics Enabled Diagnostic Decision Support Systems
Healthcare has played a major role in the Semantic Web (SW) field, given the knowledge representation possibilities that SW offers. A large number of ontologies are now available for several healthcare domains (genetics, proteins, cellular components, anatomy, and specific diseases, among others). However, in some cases the definition and population of these ontologies are not sufficient for use in concrete domains. In this paper we present the design of a set of ontologies for direct use in diagnostic decision support systems. We have designed a modular ontology architecture in which a main (root) ontology defines the main relations found in this domain. A set of subsumed ontologies has also been designed following some OBO Foundry principles and using SNOMED-CT terminology as the main interoperability component. These ontologies have also been designed to be as lightweight as possible. The evaluation of the designed ontology is based on a set of quantitative aspects that aim to show the main principles to follow when designing ontologies for the domain of differential diagnosis.
-
-
-
Towards a Metadata Model for Mass-Spectrometry Based Clinical Proteomics
Authors: John Springer, Fan Zhang, Peter Hussey, Charles Buck, Fred Regnier and Jake Chen
Recent proteomics studies of clinical samples have generated substantial interest. Aided by advances in analytical chemistry and bioinformatics, clinical proteomics has become a driving force behind molecular biomarker development. However, it is still difficult to manage and interpret large amounts of clinical proteomics data due to data integration challenges. The lack of practical metadata representation standards has prevented sharing and interpretation of mass spectrometry experimental results derived from different experimental conditions or different proteomics labs, and ultimately this absence has resulted in missed opportunities for proteomic biomarker discovery. Therefore, in this paper, we describe methods for deploying Semantic Web technologies to design an ontology using OWL for clinical proteomics information and to manage such information using various mechanisms, such as CPAS. We developed a practical proteomics experimental metadata model using Semantic Web technologies and demonstrated the manner in which this model can be integrated with current proteomics data analysis software systems. We demonstrated the manner in which systems employing the metadata model can begin to enable inter-laboratory sharing and analysis of clinical proteomics data. We also discussed the manner in which these tools and techniques have aided in proteomic biomarker discovery studies. Our work reflects an approach to adopt a Cancer Biomedical Informatics Grid (caBIG) compliant software system through the use of an ontology-based metadata model. This effort is the first step in a bigger initiative to move toward an ontology-based approach that enables a standards-driven approach to large-scale inter-laboratory proteomics data integration and analyses with the overarching goal of the discovery of proteomic biomarkers.
-
-
-
Publishing Orthology and Diseases Information in the Linked Open Data Cloud
The Linked Data initiative offers a straightforward method to publish structured data on the World Wide Web and link it to other data, resulting in a worldwide network of semantically codified data known as the Linked Open Data cloud. The size of the Linked Open Data cloud, i.e. the amount of data published using Linked Data principles, is growing exponentially, including life sciences data. However, key information for biological research is still missing in the Linked Open Data cloud. For example, the relation between orthologous genes and genetic diseases is absent, even though such information can be used for hypothesis generation regarding human diseases. The OGOLOD system, an extension of the OGO Knowledge Base, publishes ortholog/disease information using Linked Data. This gives scientists the ability to query the structured information in connection with other Linked Data and to discover new information related to orthologs and human diseases in the cloud.
-
-
-
SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting
Authors: Christopher D. Pierce, David Booth, Chimezie Ogbuji, Chris Deaton, Eugene Blackstone and Doug Lenat
Semantic Web technologies offer the potential to revolutionize management of health care data by increasing interoperability and reusability while reducing the need for redundant data collection and storage. From 1998 through 2010, Cleveland Clinic sponsored a project designed to explore and develop this potential. The product of this effort, SemanticDB, is a suite of software tools and knowledge resources built to facilitate the collection, storage and use of the diverse data needed to conduct clinical research and health care quality reporting. SemanticDB consists of three main components: 1) a content repository driven by a meta-model that facilitates collection and integration of data in an XML format and automatically converts the data to RDF; 2) an inference-mediated, natural language query interface designed to identify patients who meet complex inclusion and exclusion criteria; and 3) a data production pipeline that uses inference to generate customized views of the repository content for statistical analysis and reporting. Since 2008, this system has been used by the Cleveland Clinic's Heart and Vascular Institute to support numerous clinical investigations, and in 2009 Cleveland Clinic was certified to submit data produced in this manner to national quality monitoring databases sponsored by the Society of Thoracic Surgeons and the American College of Cardiology.
-
-
-
DartWiki: A Semantic Wiki for Ontology-Based Knowledge Integration in the Biomedical Domain
Authors: Tong Yu, Huajun Chen, Jinhua Mi, Peiqin Gu, Ting Wu and Jeff Z. Pan
Semantic Web languages and technologies can be used for the annotation, classification, and organization of knowledge assets and digital artifacts based on biomedical ontologies. In this paper, we present a semantic wiki, named DartWiki, to build an ontology-based digital encyclopedia for the biomedical domain. DartWiki provides a Web-based interface for accessing knowledge artifacts in both per-artifact and per-concept modes. In the per-artifact mode, users can access these artifacts and annotate them with both short texts and logical statements in terms of domain ontologies. In the per-concept mode, users can navigate a graph of concepts, and review and edit the synthesized page about a selected concept, which contains meaningful information about the concept as well as its related concepts and artifacts. Smooth transitions between the two modes are achieved through semantic links. As a use case of DartWiki, we provide an open platform for the management and maintenance of digital artifacts in Integrated Medicine. This system provides medical practitioners with relevant and trustworthy knowledge artifacts, as well as the means to input artifacts, to clarify their meaning, and to check and improve their quality, which encourages the inclusion and participation of users and effectively creates an online community around knowledge sharing.
-
-
-
A Filter Based Feature Selection Algorithm Using Null Space of Covariance Matrix for DNA Microarray Gene Expression Data
Authors: Alok Sharma, Seiya Imoto and Satoru Miyano
We propose a new filter-based feature selection algorithm for classification based on DNA microarray gene expression data. It utilizes the null space of the covariance matrix for feature selection. The algorithm can perform bulk reduction of features (genes) while maintaining the quality information in the reduced subset of features for discriminative purposes. Thus, it can be used as a pre-processing step for other feature selection algorithms. The algorithm does not assume statistical independence among the features. The algorithm shows promising classification accuracy when compared with other existing techniques on several DNA microarray gene expression datasets.
-
-
-
A Novel Method of Sequence Similarity Evaluation in N-dimensional Sequence Space
Authors: Andrzej Kasperski and Renata Kasperska
The aim of this work is to establish a universal method of searching for similarities between sequences in an n-dimensional sequence space. The presented idea extends the original Dot-Matrix and semihomology methods with the possibility of performing analyses in an n-dimensional sequence space and indicates a method of similarity evaluation. The main novelty of the implemented dotPicker program is that it allows searches for similarities in an n-dimensional sequence space. Sets of identity fragments, which represent given protein families, have been obtained using this program. An approach to evaluating the obtained identity fragments is proposed and its utilization is presented. Moreover, the potential of the dotPicker program is shown, especially when analyzing and identifying previously unknown similarities in protein families.
-
-
-
Advantages of a Pareto-Based Genetic Algorithm to Solve the Gene Synthetic Design Problem
Authors: Paulo Gaspar and Jose Luis Oliveira
Codon usage, codon context, rare codons, nucleotide repetition and mRNA destabilizing sequences are but a few of the many factors that influence the efficiency of protein synthesis. Therefore, gene redesign for heterologous expression is a multi-objective optimization problem and the factors that need to be considered are often conflicting. Evolutionary approaches have already been shown to be able to evolve a sequence under the forces of specific constraints. However, it is unclear what the advantages of a slower algorithm such as a genetic algorithm (GA) are when compared with other, faster algorithms in the gene redesign context. Here, a solution using genetic algorithms along with a Pareto archive is used for the gene synthetic redesign problem. The different redesign parameters are merged using an adapted genetic algorithm strategy. From the created model, the best possible synonymous gene sequence is generated. This allows tackling the gene redesign problem by exploring the large search space of possible synonymous sequences. It is then shown that genetic algorithms have several advantages over other heuristics in the gene redesign problem. One example is the ability to return the best solutions constituting the main part of the Pareto front, even in non-convex or non-continuous spaces. This allows researchers to select, among the optimal solutions, the synonymous genes that best suit their purpose, instead of accepting a single solution that might represent an unwanted trade-off between the objectives.
-
-
-
Overview of microRNA Target Analysis Tools
Authors: Panteleimon Zotos, Maria G. Roubelakis, Nicholas P. Anagnou and Sophia Kossida
MicroRNA (miRNA) target prediction plays an important role in studying post-transcriptional regulation by miRNAs. Numerous target prediction tools have been utilized in several research approaches for in silico analysis. This type of analysis provides the initial step for further experimental validation in biological systems in order to complete the target validation. In this review, we summarize the computational tools based on a single method for miRNA target prediction (single algorithm tools) and also the comparative target prediction tools utilized in several research approaches for in silico analysis. Comparative target prediction tools have provided a novel methodology for miRNA target prediction by reducing the number of false positives. Such tools combine results from multiple single algorithm tools, thereby reducing false positive results and providing a more accurate prediction. The main goal of this review is to summarize the available literature on miRNA target prediction algorithms and tools in an extensive manner.
-
-
-
Genome-Wide Analysis of AP2/ERF Transcription Factor Family in Zea Mays
Authors: Mei-Liang Zhou, Yi-Xiong Tang and Yan-Min Wu
Maize (Zea mays ssp. mays L.) is an important crop as well as an important model organism for fundamental research into the inheritance and functions of genes, the mechanistic relation between cytological crossovers and recombination, and the origin of the nucleolus. AP2/ERF is a large family of transcription factors in plants, encoding transcriptional regulators with a variety of functions involved in developmental and physiological processes. Here, starting from the Zea mays database, we identified 292 AP2/ERF genes by an in silico cloning method using the AP2/ERF conserved domain amino acid sequence of Arabidopsis thaliana as a probe. Based on the number of AP2/ERF domains and the function of the genes, these AP2/ERF genes from maize were classified into four subfamilies named AP2, DREB, ERF and RAV. The genome distribution of maize AP2/ERF genes strongly supports the hypothesis that genome-wide contributed to the expansion of the AP2/ERF gene family. Bioinformatics analysis suggests that maize AP2/ERF proteins can potentially participate in a variety of stress responses, endowing them with the capacity to regulate a multitude of transcriptional programs. In addition, similar expression patterns suggest functional conservation between some maize AP2/ERF genes and their close Arabidopsis homologs.
-
-
-
Challenges from Clustering Analysis to Knowledge Discovery in Molecular Biomechanics
By Loh Wei Ping
Throughout endless experimental work, short records of dynamic molecular data are generated from time to time. Biomechanics data mining and knowledge discovery have become an important study area for turning the abundance of generated raw data into pieces of information. In data mining, researchers often encounter challenging issues and constraints, ranging from the nature of the collected microarray data and the clustering algorithms developed to informative discovery for rhythmic data decision-making processes. This article presents a review of the commonly practiced clustering techniques in molecular biomechanical systems, towards better applications in bioengineering research. It highlights the constraints and challenges encountered in temporal molecular bioengineering mechanisms. The findings revealed that molecular data are commonly analyzed based on data mining computation and mathematical applications to link both developmental stage interfaces and the mechanical principles of living organisms. In this area, mathematical analyses are extensively carried out to investigate dynamic microarray data using clustering techniques. The main goal is to extract informative knowledge. Therefore, in order to derive collective patterns and reliable information from microarray data, there is a need to consider effects from the nature of the data, the clustering algorithms and the knowledge discovery processes, which require substantial understanding of biological systems.