Current Bioinformatics - Volume 13, Issue 3, 2018
-
-
Gene Selection Using High Dimensional Gene Expression Data: An Appraisal
Authors: Abhishek Bhola and Shailendra Singh
Microarray technology allows us to study the expression levels of thousands of genes across different experimental conditions in a single experiment. The gene expression data produced by microarray technology is high dimensional, which makes downstream analysis very challenging. Gene selection is an essential process that mitigates the dimensionality problem by removing irrelevant and unwanted genes from gene expression data. A variety of gene selection techniques are available in the literature and are widely used to find the most informative and significant genes in a given gene expression dataset. This paper reviews different aspects of gene selection and other research issues that arise when analyzing gene expression data. The article provides a brief overview of gene selection for dimensionality reduction in gene expression data.
-
-
-
A Joint Probabilistic Model in DNA Sequences
By Huili Liu
Background: Most existing methods for comparing and analyzing DNA sequences use multiple sequence alignment (MSA) algorithms. However, the computation time required for MSA is usually very long, which makes it impractical to analyze a large group of long DNA sequences. Objective: Here we propose a novel computational method to quickly characterize and compare DNA sequences. Method: We construct a new 2-dimensional (2D) graphical representation of DNA sequences based on the mathematical concept of joint probability. Each dinucleotide is assigned the product of the signed probabilities of its two nucleotides, which is totally independent of the choice of the species studied. Results: We perform similarity/dissimilarity analyses on three real DNA data sets: the first exon of the beta-globin gene of eleven animal species, the ribulose bisphosphate carboxylase small chain (rbcS) gene of eleven species of flowering plants, and the mitochondrial genome sequences of eleven mammal species. Conclusion: Our results coincide with existing biological analyses in the literature. We also compare our approach with an MSA algorithm and find that our approach is much quicker and more effective.
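The idea of scoring each dinucleotide by the product of two signed nucleotide probabilities can be illustrated with a minimal Python sketch. The abstract does not define the signs or how the probabilities are estimated, so everything specific here is an assumption: the `SIGN` table (purines positive, pyrimidines negative), the estimation of probabilities from the sequence's own base frequencies, and the cumulative-sum curve are all hypothetical choices, not the author's published method.

```python
from collections import Counter

# Hypothetical sign convention (an assumption, not taken from the paper):
# purines (A, G) are positive, pyrimidines (C, T) are negative.
SIGN = {"A": 1, "G": 1, "C": -1, "T": -1}


def signed_probabilities(seq):
    """Return sign * relative frequency for each base of an ACGT-only sequence."""
    counts = Counter(seq)
    n = len(seq)
    return {base: SIGN[base] * counts[base] / n for base in "ACGT"}


def dinucleotide_curve(seq):
    """Build a 2D curve: x is the dinucleotide position, y is the running
    sum of the products of the signed probabilities of adjacent bases."""
    p = signed_probabilities(seq)
    y = 0.0
    points = []
    for i in range(len(seq) - 1):
        y += p[seq[i]] * p[seq[i + 1]]  # joint term for dinucleotide seq[i:i+2]
        points.append((i + 1, y))
    return points


if __name__ == "__main__":
    for x, y in dinucleotide_curve("ACGT"):
        print(x, y)
```

Two sequences could then be compared by a distance between their curves, which avoids an alignment step; the choice of distance is likewise left open here.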
-
-
-
An Error Correction and DeNovo Assembly Approach for Nanopore Reads Using Short Reads
Authors: Mehdi Kchouk and Mourad Elloumi
Background: Error correction is an important task in the analysis and manipulation of NGS data. Its purpose is to facilitate data analysis for large projects such as de novo assembly. Here we present a new hybrid algorithm for error correction of long reads using short reads. Our algorithm can be flexibly adapted to different types of errors. We then perform a de novo assembly of the corrected long reads. Objective: We present MiRCA (MinIon Reads Correction Algorithm), a hybrid approach based on sequence alignment that detects and corrects errors in MinION long reads using Illumina short reads. Methods: Our approach operates in four steps. First, we perform quality control and data cleaning. Second, we form contigs from the short reads in the pre-error-correction step. Third, we align the pre-assembled contigs to the long reads and use this alignment to correct erroneous long reads. Finally, we assemble the corrected long reads. Results: Mapping results for the S. cerevisiae W303 and E. coli genomes show that our error correction approach produces high-quality long reads with a mapping rate of ~99% to the reference genome in reasonable time. For de novo assembly, the corrected long reads give a good assembly in a short running time compared to other error correction tools. Conclusion: MiRCA is a new hybrid approach that detects and corrects errors. It uses an alignment-based approach with pre-assembled short reads as a reference to correct nanopore long reads. The experimental evaluation of the corrected long reads against the reference genomes of S. cerevisiae and E. coli shows that MiRCA achieves the best error correction among the existing related works compared.
-
-
-
Drug and Nondrug Classification Based on Deep Learning with Various Feature Selection Strategies
Authors: Long Yu, Xia Sun, Shengwei Tian, Xinyu Shi and Yilin Yan
Background: In the past decades, a number of methods have been proposed for the classification of molecular data. Supervised machine learning methods such as linear discriminant analysis and decision trees have been used to predict structural properties of molecules. Furthermore, logistic regression, Bayesian networks and artificial neural networks have been used to distinguish drugs from non-drugs. However, most of them cannot hierarchically extract deep features. Objective: The features extracted by a model based on stacked auto-encoders (SAEs) are useful for the classification of molecules. Method: In this study, the model combines a deep learning architecture with a softmax classifier. First, the molecular data were preprocessed using feature selection strategies. Second, the applicability of stacked auto-encoders was verified by information-based molecular classification. Then, another classification method based on multi-dimensional features was proposed. Finally, we proposed a new deep learning model that achieves a higher classification accuracy. Results: The auto-encoder (AE) model described above was used to classify the molecular data, with SAEs as the corresponding deep architecture. We combined the SAEs and softmax by taking the output of the last SAE as the input of softmax. That is, drugs and nondrugs are classified using the salient features learned by the SAEs. Conclusion: Experimental results show that the performance of the classifiers in this deep learning-based model is competitive. In addition, the proposed joint multi-dimensional deep neural network is a breakthrough for future research. It also shows the potential of deep learning-based methods for accurate drug and nondrug classification.
-
-
-
In-silico Evidences of Regulatory Roles of WT1 Transcription Factor Binding Sites on the Intervening Sequences of the Human Bcl-2 Gene
Background: Intervening sequences (introns) have significant effects on genomic regulation and molecular evolution, so they deserve deeper analysis to better understand their possible regulatory roles. Objective and Method: Accordingly, intron 2 (In-2) of the human B-cell lymphoma 2 (hBcl2) gene was analyzed, in view of the size of In-2 and the critical role of the gene in cellular homeostasis, using in-silico approaches to identify In-2 transcription factor binding (In2-TFBs) motifs. Results: Our analysis revealed 966 motifs of 118 different TFBs types scattered throughout both strands of the complete sequence of the gene, in particular on In-2, with a significant pattern of distribution and repetition. The distribution pattern of these motifs revealed that most of them accumulated in narrow regions of In-2, far from the splice sites. Moreover, except for WT1-TFBs, Gfi-1-TFBs and GAGA-TFBs, all other motifs were sporadic, with irregular and random distribution. Among these motifs, WT1-TFBs showed the highest frequencies; they were situated in four neighboring regions of In-2, in a close linear relationship to Sp1-TFBs. Furthermore, the sequence logos of the WT1-TFBs showed that they ranged in size from 22 to 45 bp and were enriched in G and T nucleotides. Meanwhile, the binding affinity of WT1-TF to WT1-TFBs differed significantly from that to other sequences of the gene used as negative controls. Conclusion: In general, these data provide supporting evidence for the existence of regulatory regions in the intronic sequences of the hBcl2 gene, especially in In-2, and also represent new targets for WT1-TF which might contribute to hBcl2 regulation and the apoptosis process.
-
-
-
A Therapeutic Paradigm to Appraise the Competence of Chitosan Oligosaccharide Lactate Targeting Monoamine Oxidase-A and P-Glycoprotein to Contest Depression by Channeling the Blood Brain Barrier
Authors: A. A. Margret and Ganesh Kumar Arumugam
Background: Depression is a serious mental ailment that is considered a global threat with increasing prevalence. Developing drugs against depression is an immense challenge: the condition has multifaceted causes and a poorly understood pathology, and the physiological blood brain barrier impedes access to the target site. Monoamine oxidase inhibitors are used as antidepressants, with monoamine oxidase-A (MAO-A) regarded as a potential drug target against neurological ailments. Consequently, there is a strong need to formulate a brain-targeted drug that can relieve depression and cross this physiological barrier to restore mental health. Objective: This study provides both in silico and in vitro analyses of the potential of chitosan oligosaccharide lactate as an efficient MAO-A inhibitor that can restrain the efflux transporter permeability glycoprotein (P-gp). Method: The activity of chitosan oligosaccharide lactate was evaluated by a molecular docking assay, which was substantiated by a cell line study. Results: The molecular docking assay against the two target proteins (MAO-A, P-gp) gave minimum binding affinity energy values of -9.341 and -7.326 kcal/mol. The activity of P-glycoprotein was evaluated by cell line studies, which corroborated the inhibitory potential of chitosan oligosaccharide lactate through an increase in the rhodamine transport assay relative to the control. Conclusion: Chitosan oligosaccharides, derived from the chemical hydrolysis of chitosan, prove to be an efficient drug carrier, with an ADMET score indicating low solubility, and avoid untargeted discharge of the drug. The study indicates a two-fold therapeutic efficacy of the polymer as a proficient antidepressant that can cross the blood brain barrier.
-
-
-
Prediction of New Bacterial Type III Secreted Effectors with a Recursive Hidden Markov Model Profile-Alignment Strategy
Authors: Zhirong Guo, Xi Cheng, Xinjie Hui, Xingsheng Shu, Aaron P. White, Yueming Hu and Yejun Wang
Background: Identifying new bacterial type III secreted effectors is a major computational challenge. At least a dozen machine learning algorithms have been developed, but so far they have achieved only limited success. Sequence similarity appears important to biologists but is frequently neglected by developers of effector prediction algorithms, although this strategy achieved great success in the field a decade ago. Objective: The study aimed to develop a sequence similarity based effector prediction tool. Method: In this study, we propose a recursive sequence alignment strategy with Hidden Markov Models to comprehensively find homologs of known YopJ/P full-length proteins, effector domains and N-terminal signal sequences. Results: Using this method, we identified 155 different YopJ/P-family effectors and 59 proteins with YopJ/P N-terminal signal sequences from 27 genera and more than 70 species. Among these genera, we also identified one type III secretion system (T3SS) from Uliginosibacterium and two T3SSs from Rhizobacter for the first time. Higher conservation of effector domains, N-terminal fusion of signal sequences to other effectors, and the exchange of N-terminal signal sequences between different effector proteins were frequently observed for YopJ/P-family proteins. This made it feasible to identify new effectors by screening separately for similarity to the N-terminal signal peptides and to the effector domains of known effectors. This method can also be applied to search for homologs of other known T3SS effectors. Conclusion: A new sequence alignment based method was developed, which could effectively facilitate the identification of new T3SS effectors.
-
-
-
SARELI: Sequence Alignment by Radial Evaluation of Local Interactions
Authors: Ricardo Ortega, Arturo Chavoya, Cuauhtemoc Lopez-Martin and Luis Delaye
Background: A robust guide tree is necessary as a first step for the multiple sequence alignment of proteins. The guide tree is normally generated using an initial distance matrix based on a particular distance metric. Objective: A new tool for generating guide trees for multiple protein sequence alignment is presented. Method: The initial distance matrix for the progressive alignment algorithm is computed with a novel metric termed Radial Distance, which estimates the variation around symbols in two sequences; after the initial distance matrix is generated, a guide tree is created using the neighbor joining algorithm. The guide trees generated with our tool were then fed independently into MUSCLE and Clustal Omega (as these methods can accept external guide trees) to produce the final alignments. Results: The alignments obtained with our approach were compared with those from MUSCLE and Clustal Omega (with their original guide trees) on the BAliBASE, SABRE, and PREFAB protein sequence databases. To score the alignments, we computed the sum of pairs score and the column score against the reference alignments of the protein benchmark databases used. The alignments produced using the guide trees generated by SARELI obtained statistically higher sum of pairs and column scores than those using the original guide trees from MUSCLE and Clustal Omega on the SABRE and PREFAB databases. Conclusion: Our proposed approach can generate guide trees that can be used by established multiple sequence alignment methods for proteins.
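The two evaluation measures named in the abstract, the sum of pairs score and the column score, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' evaluation code: it scores every reference column (benchmark suites typically restrict scoring to annotated core regions), it assumes `-` is the only gap character, and the function names are hypothetical.

```python
from itertools import combinations


def column_indices(alignment):
    """Convert an alignment (equal-length gapped strings) into columns of
    residue indices: one tuple per column, with None where a row has a gap."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for row, ch in enumerate(col):
            if ch == "-":
                entry.append(None)
            else:
                entry.append(counters[row])
                counters[row] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Set of residue pairs (row_i, index_i, row_j, index_j) sharing a column."""
    pairs = set()
    for col in columns:
        for (i, a), (j, b) in combinations(enumerate(col), 2):
            if a is not None and b is not None:
                pairs.add((i, a, j, b))
    return pairs


def sum_of_pairs_score(candidate, reference):
    """Fraction of reference residue pairs that the candidate also aligns."""
    ref = aligned_pairs(column_indices(reference))
    cand = aligned_pairs(column_indices(candidate))
    return len(ref & cand) / len(ref)


def column_score(candidate, reference):
    """Fraction of reference columns reproduced exactly in the candidate."""
    ref_cols = column_indices(reference)
    cand_cols = set(column_indices(candidate))
    return sum(col in cand_cols for col in ref_cols) / len(ref_cols)


if __name__ == "__main__":
    reference = ["AC-GT", "ACAGT"]
    candidate = ["ACG-T", "ACAGT"]
    print(sum_of_pairs_score(candidate, reference))  # 0.75
    print(column_score(candidate, reference))        # 0.6
```

In the toy example, the candidate misplaces one gap, so it recovers three of the four reference residue pairs and three of the five reference columns. Rows must appear in the same order in both alignments.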
-
-
-
Analysis of the Relative Movements Between EGFR and Drug Inhibitors Based on Molecular Dynamics Simulation
Authors: Lijiang Chen, Bin Zou, Victor H. F. Lee and Hong Yan
Background: Mutation of EGFR is one of the most important drivers of non-small cell lung cancer. Many selective therapies target specific EGFR mutations. For example, gefitinib is a commonly used front line drug for exon 19 deletions and the L858R mutation. New irreversible inhibitors, such as WZ4002, CO-1686, and AZD9291, have been developed to overcome drug resistance caused by the acquired T790M mutation. Objective: In this study, a novel method is proposed to calculate the movement intensities of drug inhibitors relative to EGFR based on molecular dynamics (MD) simulation, in order to relate these movements to the drug resistance of gefitinib, WZ4002, CO-1686, and AZD9291. Method: The 4 × 33 complexes of four inhibitors (gefitinib, WZ4002, CO-1686 and AZD9291) with 32 common EGFR mutations as well as the wild type are analyzed. First, each EGFR-inhibitor complex is fixed to the EGFR backbone. Then each inhibitor is treated as a rigid body. Two kinds of relative movement intensities between EGFR and the drug inhibitor are obtained by calculating the attitude parameters of the rigid body. Results: First, in most cases, the irreversible inhibitors (WZ4002, CO-1686 and AZD9291) were observed to be more stable than the reversible gefitinib, which supports the effectiveness of our method. Second, a high correlation was obtained between clinical effects and the relative movement intensities. In particular, for patients' response level, the correlation P-value was 0.0462 in the best case. Conclusion: Our method represents an important contribution to molecular dynamics analysis of drug inhibitors. The analysis results for WZ4002, CO-1686 and AZD9291 are useful for drug selection for patients with specific EGFR mutations.
-
-
-
Conformational Hotspots of Dengue Virus NS5 RdRp
Authors: Fawad Khan, Ashfaq Ahmad, Abid Ali, Syed S. Ali and Tayyab Ur Rehman
Introduction: Dengue virus is among the most widespread mosquito-borne human pathogens, with 5 different serotypes. The 10.7 kb viral RNA genome encodes 3 structural and 7 non-structural proteins. The Dengue virus NS5, a non-structural protein and the most conserved in the genome, plays a vital role in the virus replication machinery. The structure of the C-terminal RNA-dependent RNA polymerase (RdRp) domain of NS5 has been solved experimentally in the canonical right-handed conformation, which comprises 3 sub-domains, namely finger, palm and thumb. The presence of different structural characteristics indicates that RdRp adopts various conformational strategies to fulfill its functional modes. Methodology: To understand the molecular switches and signaling patterns that govern the conformational and functional features of the NS5 RdRp domain, long-range dynamics by normal mode analysis, coupled with comparative structure analysis and in silico docking approaches, were performed. Results: Our findings indicate that the palm and finger are the active, flexible sub-domains, whereas the C-terminus region of motif B influences signal transmittance and substrate binding. The different motifs of RdRp are trivial in direct conformational transition, except for two C-terminal residues of motif B (L608 and T611), which modulate path signals. The signalling path indicates that dynamic clusters regulate the RdRp allosteric pathway, where α10 and the α20β6 loop of the finger and thumb sub-domains act as terminals in both directions. Besides, the catalytic site, α16 connects and relays conformational signals to α12 through the β5α19 loop. Conclusion: The occurrence of motif B in four dynamic clusters (1, 2, 6 and 7) strengthens our notion that the motifs are trivial in direct conformational transition, while motif B retains modulation of the major conformational signals.
-
-
-
Profiling of Heat-Responsive microRNAs in Creeping Bentgrass (Agrostis stolonifera L.)
Authors: Haizhen Liu, Jian Li, Yongkun Chen, Yan Xu and Jichen Xu
Background: MicroRNAs (miRNAs) are a class of non-coding single-stranded small RNAs that play important roles in the response to abiotic stress. High temperature is one of the major adversities that plants suffer during their growth and development. Objective: This research was intended to profile the miRNAs in creeping bentgrass in order to characterize possible heat resistance mechanisms. Method: Bentgrass samples grown under normal conditions and under 42 °C high temperature treatment were analyzed by high-throughput small RNA sequencing. We identified known and novel miRNAs, predicted their target genes, and used RT-qPCR to confirm the sequencing results. Results: High-throughput small RNA sequencing showed 182 conserved miRNA sequences from 42 miRNA families and 10 novel miRNAs in creeping bentgrass leaf samples, mostly 20 and 21 nucleotides in length. Bioinformatics analysis indicated that 84 conserved and 4 novel miRNAs were either up-regulated (51) or down-regulated (37) by heat. Gene Ontology enrichment and Kyoto Encyclopedia of Genes and Genomes pathway analysis of the miRNAs' target genes indicated that they were mostly involved in plant hormone signal transduction, plant-pathogen interaction and polyether lipid metabolism. Fifteen randomly selected heat-responsive miRNAs were examined by real-time quantitative PCR and showed fluctuating expression patterns during heat treatment. Conclusion: The present study provides insight into the miRNA-mediated gene regulation network under high temperature stress and is of significant value for stress-oriented genetic engineering.
