Volume 7, Issue 3

Current Bioinformatics - Volume 7, Issue 3, 2012

- Editorial: [Hot Topic: Semantic Web for Current Healthcare and Bioinformatics]
  
  Authors: Huajun Chen and Guotong Xie
  
  https://doi.org/10.2174/157489312802460767

- Towards an Ontology to Support Semantics Enabled Diagnostic Decision Support Systems
  
  Authors: Alejandro Rodriguez-Gonzalez, Gandhi Hernandez-Chan, Ricardo Colomo-Palacios, Juan Miguel Gomez-Berbis, Angel Garcia-Crespo, Giner Alor-Hernandez and Rafael Valencia-Garcia
  
  https://doi.org/10.2174/157489312802460721
  
  Healthcare has played a major role in the Semantic Web (SW) field because of the knowledge representation possibilities that SW offers. A large number of ontologies are now available for several healthcare domains (genetics, proteins, cellular components, anatomy, and specific diseases, among others). However, in some cases the definition and population of these ontologies are not sufficient for use in concrete domains. In this paper we present the design of a set of ontologies for direct use in diagnostic decision support systems. We have designed a modular ontology architecture in which a main (root) ontology defines the main relations found in this domain. A set of subsumed ontologies has also been designed following some OBO Foundry principles and using SNOMED-CT terminology as the main interoperability component. These ontologies have also been designed to be as lightweight as possible. The evaluation of the designed ontology is based on a set of quantitative aspects that aim to show the main principles to follow when designing ontologies for the domain of differential diagnosis.
  
- Towards a Metadata Model for Mass-Spectrometry Based Clinical Proteomics
  
  Authors: John Springer, Fan Zhang, Peter Hussey, Charles Buck, Fred Regnier and Jake Chen
  
  https://doi.org/10.2174/157489312802460785
  
  Recent proteomics studies of clinical samples have generated substantial interest. Aided by advances in analytical chemistry and bioinformatics, clinical proteomics has become a driving force behind molecular biomarker development. However, it is still difficult to manage and interpret large amounts of clinical proteomics data due to data integration challenges. The lack of practical metadata representation standards has prevented sharing and interpretation of mass spectrometry experimental results derived from different experimental conditions or different proteomics labs, and ultimately this absence has resulted in missed opportunities for proteomic biomarker discovery. Therefore, in this paper, we describe methods for deploying Semantic Web technologies to design an ontology using OWL for clinical proteomics information and to manage such information using various mechanisms, such as CPAS. We developed a practical proteomics experimental metadata model using Semantic Web technologies and demonstrated the manner in which this model can be integrated with current proteomics data analysis software systems. We demonstrated the manner in which systems employing the metadata model can begin to enable inter-laboratory sharing and analysis of clinical proteomics data. We also discussed the manner in which these tools and techniques have aided in proteomic biomarker discovery studies. Our work reflects an approach to adopt a Cancer Biomedical Informatics Grid (caBIG) compliant software system through the use of an ontology-based metadata model. This effort is the first step in a bigger initiative to move toward an ontology-based approach that enables a standards-driven approach to large-scale inter-laboratory proteomics data integration and analyses with the overarching goal of the discovery of proteomic biomarkers.
  
- Publishing Orthology and Diseases Information in the Linked Open Data Cloud
  
  Authors: Jose A. Minarro-Gimenez, Mikel Egana-Aranguren, Boris Villazon-Terrazas and Jesualdo T. Fernandez-Breis
  
  https://doi.org/10.2174/157489312802460811
  
  The Linked Data initiative offers a straightforward method to publish structured data on the World Wide Web and link it to other data, resulting in a worldwide network of semantically codified data known as the Linked Open Data cloud. The size of the Linked Open Data cloud, i.e. the amount of data published using Linked Data principles, is growing exponentially, and this includes life sciences data. However, key information for biological research is still missing from the Linked Open Data cloud. For example, the relation between orthologous genes and genetic diseases is absent, even though such information can be used for hypothesis generation regarding human diseases. The OGOLOD system, an extension of the OGO Knowledge Base, publishes ortholog and disease information using Linked Data. This gives scientists the ability to query the structured information in connection with other Linked Data and to discover new information related to orthologs and human diseases in the cloud.
  
- SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting
  
  Authors: Christopher D. Pierce, David Booth, Chimezie Ogbuji, Chris Deaton, Eugene Blackstone and Doug Lenat
  
  https://doi.org/10.2174/157489312802460730
  
  Semantic Web technologies offer the potential to revolutionize management of health care data by increasing interoperability and reusability while reducing the need for redundant data collection and storage. From 1998 through 2010, Cleveland Clinic sponsored a project designed to explore and develop this potential. The product of this effort, SemanticDB, is a suite of software tools and knowledge resources built to facilitate the collection, storage and use of the diverse data needed to conduct clinical research and health care quality reporting. SemanticDB consists of three main components: 1) a content repository driven by a meta-model that facilitates collection and integration of data in an XML format and automatically converts the data to RDF; 2) an inference-mediated, natural language query interface designed to identify patients who meet complex inclusion and exclusion criteria; and 3) a data production pipeline that uses inference to generate customized views of the repository content for statistical analysis and reporting. Since 2008, this system has been used by the Cleveland Clinic's Heart and Vascular Institute to support numerous clinical investigations, and in 2009 Cleveland Clinic was certified to submit data produced in this manner to national quality monitoring databases sponsored by the Society of Thoracic Surgeons and the American College of Cardiology.
  
- DartWiki: A Semantic Wiki for Ontology-Based Knowledge Integration in the Biomedical Domain
  
  Authors: Tong Yu, Huajun Chen, Jinhua Mi, Peiqin Gu, Ting Wu and Jeff Z. Pan
  
  https://doi.org/10.2174/157489312802460758
  
  Semantic Web languages and technologies can be used for the annotation, classification, and organization of knowledge assets and digital artifacts based on biomedical ontologies. In this paper, we present a semantic wiki, named DartWiki, for building an ontology-based digital encyclopedia for the biomedical domain. DartWiki provides a Web-based interface for accessing knowledge artifacts in both per-artifact and per-concept modes. In the per-artifact mode, users can access these artifacts and annotate them with both short texts and logical statements in terms of domain ontologies. In the per-concept mode, users can navigate a graph of concepts, and review and edit the synthesized page about a selected concept, which contains meaningful information about the concept as well as its related concepts and artifacts. Smooth transitions between the two modes are achieved through semantic links. As a use case of DartWiki, we provide an open platform for the management and maintenance of digital artifacts in Integrated Medicine. This system provides medical practitioners with relevant and trustworthy knowledge artifacts, along with the means to input artifacts, to clarify their meaning, and to check and improve their quality, which encourages the inclusion and participation of users and effectively creates an online community around knowledge sharing.
  
- A Filter Based Feature Selection Algorithm Using Null Space of Covariance Matrix for DNA Microarray Gene Expression Data
  
  Authors: Alok Sharma, Seiya Imoto and Satoru Miyano
  
  https://doi.org/10.2174/157489312802460802
  
  We propose a new filter-based feature selection algorithm for classification based on DNA microarray gene expression data. It utilizes the null space of the covariance matrix for feature selection. The algorithm can perform bulk reduction of features (genes) while retaining, in the reduced subset of features, the information needed for discrimination. Thus, it can be used as a pre-processing step for other feature selection algorithms. The algorithm does not assume statistical independence among the features. The algorithm shows promising classification accuracy when compared with other existing techniques on several DNA microarray gene expression datasets.
  
- A Novel Method of Sequence Similarity Evaluation in N-dimensional Sequence Space
  
  Authors: Andrzej Kasperski and Renata Kasperska
  
  https://doi.org/10.2174/157489312802460749
  
  The aim of this work is to establish a universal method of searching for similarities between sequences in an n-dimensional sequence space. The presented idea extends the original Dot-Matrix and semihomology methods with the possibility of performing analyses in an n-dimensional sequence space, and it indicates a method of similarity evaluation. The main novelty of the implemented dotPicker program is that it allows searches for similarities in an n-dimensional sequence space. Sets of identity fragments, which represent given protein families, have been obtained using this program. An approach to evaluating the obtained identity fragments is proposed and its use is presented. Moreover, the potential of the dotPicker program is shown, especially for analyzing and identifying previously unknown similarities in protein families.
  
- Advantages of a Pareto-Based Genetic Algorithm to Solve the Gene Synthetic Design Problem
  
  Authors: Paulo Gaspar and Jose Luis Oliveira
  
  https://doi.org/10.2174/157489312802460712
  
  Codon usage, codon context, rare codons, nucleotide repetition and mRNA destabilizing sequences are but a few of the many factors that influence the efficiency of protein synthesis. Therefore, gene redesign for heterologous expression is a multi-objective optimization problem, and the factors that need to be considered are often conflicting. Evolutionary approaches have already been shown to be able to evolve a sequence under the forces of specific constraints. However, it is unclear what advantages a slower algorithm such as a genetic algorithm (GA) has over faster algorithms in the gene redesign context. Here, a solution using genetic algorithms along with a Pareto archive is applied to the gene synthetic redesign problem. The different redesign parameters are merged using an adapted genetic algorithm strategy. From the created model, the best possible synonymous gene sequence is generated. This allows the gene redesign problem to be tackled by exploring the large search space of possible synonymous sequences. It is then shown that genetic algorithms have several advantages over other heuristics in the gene redesign problem, for instance the ability to return the best solutions constituting the main part of the Pareto front, even in non-convex or non-continuous spaces. This allows a researcher to select synonymous genes among the optimal solutions to best suit their purpose, instead of accepting a single solution that might represent an unwanted trade-off between the objectives.
  
- Overview of microRNA Target Analysis Tools
  
  Authors: Panteleimon Zotos, Maria G. Roubelakis, Nicholas P. Anagnou and Sophia Kossida
  
  https://doi.org/10.2174/157489312802460820
  
  microRNA (miRNA) target prediction plays an important role in studying post-transcriptional regulation by miRNAs. Numerous target prediction tools have been utilized in several research approaches for in silico analysis. This type of analysis provides the initial step for further experimental validation in biological systems in order to complete the target validation. In this review, we summarize the computational tools based on a single method for miRNA target prediction (single algorithm tools) and also the comparative target prediction tools utilized in several research approaches for in silico analysis. Comparative target prediction tools have provided a novel methodology for miRNA target prediction by reducing the number of false positives. Such tools combine results from multiple single algorithm tools, thereby reducing false positive results and providing a more accurate prediction. The main goal of this review is to summarize the available literature on miRNA target prediction algorithms and tools in an extensive manner.
  
- Genome-Wide Analysis of AP2/ERF Transcription Factor Family in Zea Mays
  
  Authors: Mei-Liang Zhou, Yi-Xiong Tang and Yan-Min Wu
  
  https://doi.org/10.2174/157489312802460776
  
  Maize (Zea mays ssp. mays L.) is an important crop as well as an important model organism for fundamental research into the inheritance and functions of genes, the mechanistic relation between cytological crossovers and recombination, and the origin of the nucleolus. AP2/ERF is a large family of transcription factors in plants, encoding transcriptional regulators with a variety of functions involved in developmental and physiological processes. Here, starting from a Zea mays database, we identified 292 AP2/ERF genes by an in silico cloning method using the AP2/ERF conserved domain amino acid sequence of Arabidopsis thaliana as a probe. Based on the number of AP2/ERF domains and the function of the genes, the AP2/ERF genes from maize were classified into four subfamilies named AP2, DREB, ERF and RAV. The genome distribution of maize AP2/ERF genes strongly supports the hypothesis that genome-wide duplication contributed to the expansion of the AP2/ERF gene family. Bioinformatics analysis suggests that maize AP2/ERF proteins can potentially participate in a variety of stress responses, endowing them with the capacity to regulate a multitude of transcriptional programs. In addition, similar expression patterns suggest functional conservation between some maize AP2/ERF genes and their close Arabidopsis homologs.
  
- Challenges from Clustering Analysis to Knowledge Discovery in Molecular Biomechanics
  
  Author: Loh Wei Ping
  
  https://doi.org/10.2174/157489312802460794
  
  Experimental work continually generates short records of dynamic molecular data. Biomechanics data mining and knowledge discovery have become an important study area for turning the abundance of generated raw data into pieces of information. In data mining, researchers often encounter challenging issues and constraints, ranging from the nature of the collected microarray data and the clustering algorithms developed to informative discovery for rhythmic data decision-making processes. This article reviews the clustering techniques commonly practiced in molecular biomechanical systems, with a view to better applications in bioengineering research. It highlights the constraints and challenges encountered in temporal molecular bioengineering mechanisms. The findings reveal that molecular data are commonly analyzed using data mining computation and mathematical applications to link both the interfaces of developmental stages and the mechanical principles of living organisms. In this area, mathematical analyses are extensively carried out to investigate dynamic microarray data using clustering techniques. The main goal is to extract informative knowledge. Therefore, in order to derive collective patterns and reliable information from microarray data, there is a need to consider the effects of the nature of the data, the clustering algorithms and the knowledge discovery processes, which require a substantial understanding of biological systems.
  
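The Dot-Matrix method that the Kasperski and Kasperska abstract builds on can be illustrated with a minimal two-sequence sketch. This is not the dotPicker program or its n-dimensional extension, which the abstract does not specify; the function names, the `min_len` threshold and the example sequences are assumptions made for illustration only.

```python
def dot_matrix(seq_a, seq_b):
    """Classic dot matrix: cell [i][j] is True when seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def identity_fragments(seq_a, seq_b, min_len=3):
    """Return (start_a, start_b, length) for each maximal diagonal run of
    identical residues that is at least min_len long (min_len is an assumed
    threshold, not a value taken from the abstract)."""
    matrix = dot_matrix(seq_a, seq_b)
    fragments = []
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            if not matrix[i][j]:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if i > 0 and j > 0 and matrix[i - 1][j - 1]:
                continue
            length = 0
            while (i + length < len(seq_a)
                   and j + length < len(seq_b)
                   and matrix[i + length][j + length]):
                length += 1
            if length >= min_len:
                fragments.append((i, j, length))
    return fragments


# Made-up peptide fragments sharing the run "KTAY".
fragments = identity_fragments("MKTAYIAK", "GGKTAYQQ")
print(fragments)
```

Each reported tuple marks one diagonal stretch of identical residues; extending the idea to more than two sequences would require a higher-dimensional matrix, which this sketch does not attempt.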
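The Pareto archive mentioned in the Gaspar and Oliveira abstract can be sketched in a few lines. This is a generic non-dominated archive under the assumption that every objective is to be maximized; it is not the authors' implementation, and the objective names and scores are hypothetical.

```python
from typing import List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective and strictly
    better on at least one (higher is assumed to be better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def update_archive(archive: List[Scores], candidate: Scores) -> List[Scores]:
    """Offer a candidate to the archive, keeping only non-dominated entries."""
    if any(kept == candidate or dominates(kept, candidate) for kept in archive):
        return archive
    survivors = [kept for kept in archive if not dominates(candidate, kept)]
    return survivors + [candidate]


# Hypothetical (codon_usage_score, repeat_avoidance_score) pairs, one per
# candidate synonymous sequence.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9), (0.9, 0.1)]
archive: List[Scores] = []
for scores in candidates:
    archive = update_archive(archive, scores)
print(archive)
```

The archive ends up holding several trade-off solutions rather than one winner, which is the property the abstract highlights: a researcher can choose among them instead of accepting a single compromise.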
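The way comparative miRNA target prediction tools combine single algorithm tools, as described in the Zotos et al. abstract, can be sketched as simple voting. The tool names, gene identifiers and the `min_tools` threshold below are placeholders; real comparative tools may weight or score predictions differently.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return the targets predicted by at least min_tools of the tools.

    predictions maps a tool name to the collection of target genes it
    predicts for one miRNA.
    """
    votes = Counter()
    for targets in predictions.values():
        votes.update(set(targets))  # each tool votes at most once per gene
    return sorted(gene for gene, count in votes.items() if count >= min_tools)


# Placeholder predictions from three hypothetical single algorithm tools.
predictions = {
    "tool_a": {"GENE1", "GENE2", "GENE3"},
    "tool_b": {"GENE2", "GENE3", "GENE4"},
    "tool_c": {"GENE3", "GENE5"},
}
consensus = consensus_targets(predictions, min_tools=2)
print(consensus)
```

Raising `min_tools` trades sensitivity for fewer false positives, since a target must then be supported by more independent predictions.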
