Current Bioinformatics - Current Issue
Volume 20, Issue 7, 2025
A Novel Method for Mining Regulatory sRNAs Related to Rice Resistance Against Blast Fungus from Multi-Omics Data
Authors: Jianhua Sheng, Enshuang Zhao, Yuheng Zhu, Yinfei Dai, Borui Zhang, Qingming Qin and Hao Zhang
Background: Rice, a major global staple, faces yield losses from infection by the rice blast fungus. Chemical control methods are common, but their environmental and economic costs are growing concerns, and traditional biological experiments are inefficient for exploring resistance genes. Understanding the interaction between rice and the rice blast fungus is therefore urgent and important.
Objective: This study aims to use multi-omics data to uncover key elements in rice's defense against the rice blast fungus Magnaporthe oryzae. We built a detailed, multi-layered heterogeneous interaction network and combined graph embedding features with a cross-layer random walk algorithm to identify crucial resistance factors. This could inform strategies for enhancing disease resistance in rice.
Methods: We integrated genomics, transcriptomics, and proteomics data on Magnaporthe oryzae infecting rice. These multi-omics data were used to construct a multi-layer heterogeneous network. An advanced graph embedding algorithm (BINE) provided rich vector representations of the network nodes. A multi-layer network walking algorithm was then used to analyze the network and identify key regulatory small RNAs (sRNAs) in rice.
Results: Node similarity rankings allowed us to identify significant regulatory sRNAs in rice that are integral to disease resistance. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses further revealed their roles in biological processes and key metabolic pathways. Our integrative method identified these crucial elements precisely and efficiently, offering a valuable systems biology tool.
Conclusion: By integrating multi-omics data with computational analysis, this study reveals key regulatory sRNAs in rice's disease resistance mechanism. These findings enhance our understanding of rice disease resistance and provide genetic resources for breeding disease-resistant rice. Despite limitations in sRNA functional interpretation, this research demonstrates the power of applying multi-omics data to complex biological problems.
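The abstract above describes ranking nodes by walking a multi-layer heterogeneous network. As a rough illustration of that general idea only, the sketch below runs a plain random walk with restart over a tiny made-up network of sRNA, gene, and protein nodes. It is a generic stand-in, not the study's BINE embedding or its cross-layer walking algorithm; the node names, edges, and the `restart` value are all hypothetical.

```python
def build_undirected(edges):
    """Turn a list of (a, b) pairs into an adjacency dict of sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def random_walk_with_restart(adj, seed, restart=0.3, iters=200):
    """Visiting probabilities for a walker that jumps back to `seed`
    with probability `restart` at every step."""
    nodes = list(adj)
    prob = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[seed] = restart  # restart mass (prob always sums to 1)
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # A node without neighbours sends its mass back to the seed.
                nxt[seed] += (1.0 - restart) * prob[u]
                continue
            share = (1.0 - restart) * prob[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        prob = nxt
    return prob


# Hypothetical toy network: the prefixes mark the layer a node belongs to.
EDGES = [
    ("sRNA:s1", "gene:g1"),
    ("sRNA:s2", "gene:g2"),
    ("gene:g1", "protein:p1"),
    ("gene:g2", "protein:p2"),
    ("protein:p1", "protein:p2"),
]

network = build_undirected(EDGES)
scores = random_walk_with_restart(network, seed="gene:g1")

# Rank only the sRNA layer by proximity to the seed gene.
srna_ranking = sorted(
    (n for n in scores if n.startswith("sRNA:")),
    key=scores.get,
    reverse=True,
)
print(srna_ranking)
```

In this toy network the sRNA attached directly to the seed gene ranks first, which is the kind of proximity-based ordering a node similarity ranking relies on.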
Comparison between Ribosomal Assembly and Machine Learning Tools for Microbial Identification of Organisms with Different Characteristics
Background: Genome assembly tools are used to reconstruct genomic sequences from raw sequencing data, which are then used to identify the organisms present in a metagenomic sample.
Methodology: More recently, machine learning approaches have been applied to a variety of bioinformatics problems, and in this paper we explore their use for organism identification. We start by evaluating several commonly used metagenomic assembly tools, including PhyloFlash, MEGAHIT, MetaSPAdes, Kraken2, Mothur, UniCycler, and PathRacer, and compare them against state-of-the-art deep learning classification approaches, represented by DNABERT and DeLUCS, on two synthetic mock community datasets.
Results: Our analysis focuses on determining whether ensembling metagenome assembly tools with machine learning tools has the potential to improve identification performance relative to using the tools individually.
Conclusion: We find that this is indeed the case, and we analyze how effective potential tool ensembling is for organisms with different characteristics (based on factors such as repetitiveness, genome size, and GC content).
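The abstract does not say how the tools are ensembled. As one simple possibility, the sketch below combines per-tool organism calls by majority vote and abstains on ties. The tool labels (`assembler_a`, `classifier_c`, and so on) and the calls themselves are hypothetical placeholders, not outputs of the tools named above.

```python
from collections import Counter


def majority_vote(calls):
    """Combine per-tool organism calls into one label.

    `calls` maps a tool name to its predicted organism, or to None when the
    tool made no call. Returns None when nothing was called or the top two
    labels are tied.
    """
    votes = Counter(label for label in calls.values() if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: abstain rather than pick arbitrarily
    return ranked[0][0]


# Hypothetical calls for a single sample.
sample_calls = {
    "assembler_a": "Escherichia coli",
    "assembler_b": "Escherichia coli",
    "classifier_c": "Bacillus subtilis",
}
print(majority_vote(sample_calls))
```

Abstaining on ties keeps the ensemble from silently favoring whichever tool happens to be listed first.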
MBPathNCP: A Metabolic Pathway Prediction Model for Chemicals and Enzymes Based on Network Consistency Projection
Background: Metabolic pathways are important biological pathways in living organisms because they produce the energy needed to maintain vital activities. Although the main parts of metabolic pathways have been uncovered through great efforts in recent years, they remain incomplete. Undetected chemical reactions hinder a better understanding of pathway mechanisms. Predicting the metabolic pathways in which a chemical or enzyme can participate is the first step toward removing this hindrance.
Objective: This study aimed to design an effective computational method to predict the metabolic pathways of chemicals and enzymes.
Methods: A new computational model, called MBPathNCP, was proposed to predict the metabolic pathways of chemicals and enzymes. The kernels for chemicals/enzymes and pathways were constructed using the interactions of chemicals and proteins and the validated associations between chemicals/enzymes and pathways. Network consistency projection was applied to the kernels and the association adjacency matrix to yield an association score for each pair of chemical/enzyme and pathway.
Results: Cross-validation results showed that the model performs well. Further tests indicated that the overall architecture is reasonable and that the model is superior when negative samples greatly outnumber positive samples.
Conclusion: The proposed model, MBPathNCP, was efficient in predicting the metabolic pathways of chemicals and enzymes and is a potentially useful tool for investigating the metabolic pathway system.
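To make the projection step more concrete, here is a minimal sketch of one common formulation of network consistency projection from the association-prediction literature. It assumes a binary association matrix and two similarity (kernel) matrices; the abstract does not give MBPathNCP's exact kernels or normalization, so the function name `ncp_scores` and all toy values below are illustrative assumptions.

```python
import math


def norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def column(matrix, j):
    return [row[j] for row in matrix]


def ncp_scores(assoc, sim_rows, sim_cols):
    """Score every (row entity, column entity) pair.

    assoc    : n x m binary association matrix (e.g. chemicals x pathways)
    sim_rows : n x n similarity kernel for the row entities
    sim_cols : m x m similarity kernel for the column entities
    """
    n, m = len(assoc), len(assoc[0])
    scores = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            a_col, a_row = column(assoc, j), assoc[i]
            s_col = column(sim_cols, j)
            # Project the row entity's similarity profile onto column j.
            row_proj = dot(sim_rows[i], a_col) / norm(a_col) if norm(a_col) else 0.0
            # Project row i's known associations onto the column similarity.
            col_proj = dot(a_row, s_col) / norm(a_row) if norm(a_row) else 0.0
            denom = norm(sim_rows[i]) + norm(s_col)
            scores[i][j] = (row_proj + col_proj) / denom if denom else 0.0
    return scores


# Toy data (assumed): 3 chemicals x 2 pathways.
ASSOC = [[1, 0], [1, 0], [0, 1]]
SIM_CHEM = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
SIM_PATH = [[1.0, 0.3], [0.3, 1.0]]

scores = ncp_scores(ASSOC, SIM_CHEM, SIM_PATH)
for row in scores:
    print([round(s, 3) for s in row])
```

Pairs whose neighbors in both similarity spaces already share an association receive higher scores, which is what lets the method rank unobserved pairs.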
Evaluating the Reliability of Machine Learning Predictors in m6A-SNP Association Analysis: A Comparative Study Using m6A-QTL Data
Authors: Zhongzheng Mao and Zhen Wei
Introduction: N6-Methyladenosine (m6A) plays a crucial role in determining the fate of RNA after transcription. Understanding the downstream functions of individual m6A sites is of critical interest in epitranscriptomics. In published studies, two main approaches have been used to decipher the specific impact of m6A sites on gene expression and on diseases and traits: m6A quantitative trait loci (m6A-QTL) and in-silico mutation prediction by machine learning (ML) models. However, earlier works still lack independent validation of the performance of ML-based methods.
Methods: In this study, we use m6A-QTL as ground truth to evaluate the outcomes of in-silico mutation models. We benchmark against m6A-QTL both newly trained machine learning models using genomic or sequence features and the existing model inference results published in in-silico mutation-dependent databases.
Results: We found that the consistency between in-silico mutation and m6A-QTL is weak, regardless of the ML algorithms and predictive features used. The trend was similar across multiple published databases based on in-silico mutation, including RMDisease2, m6AVar, and RMVar.
Conclusion: These results emphasize the importance of critical empirical evaluation of ML models in future SNP-m6A association studies and suggest the need for more high-quality m6A-QTL experiments to guide model development.
MGCN-PolyA: An Integrated Computational Framework for Predicting Poly(A) Signals with Multiscale-gated Convolutional Networks
Authors: Jujuan Zhuang, Wanquan Gao, Xinru Huang and Guoyan Chen
Background: The accurate recognition of the polyadenylation signal (PAS) from DNA sequences is essential for understanding gene transcriptional regulation. A variety of machine learning-based computational methods have been developed to predict PAS in recent years; however, their performance and generalization ability are unsatisfactory. Better computational approaches for PAS prediction are highly desirable.
Methods: In this work, we developed an integrated framework, MGCN-PolyA, for PAS prediction across four species: Homo sapiens, Bos taurus, Mus musculus, and Drosophila melanogaster. MGCN-PolyA benefits from the diversity of its feature engineering and the effectiveness of its model architecture. We combined features from different perspectives, such as word embedding, one-hot encoding, k-mer frequency, and Enhanced Nucleic Acid Composition (ENAC), which complement each other and provide rich and comprehensive information for model learning. In terms of model architecture, MGCN-PolyA uses a two-channel multi-scale gated convolutional network to learn high-level feature representations at different scales, and then combines them with the statistical features to predict PAS using a random forest algorithm. These designs not only speed up network training but also improve generalization ability.
Results: Benchmarking experiments on independent test datasets demonstrate that MGCN-PolyA outperforms other state-of-the-art algorithms in identifying PAS. MGCN-PolyA has the highest accuracy on all test datasets, and its excellent performance in cross-species validation also demonstrates the robustness of the model.
Conclusion: Extracting features from different perspectives is important for PAS recognition, and integrating DNNs with shallow machine learning algorithms can improve model performance.
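Three of the feature types named in the Methods are simple to compute directly from a sequence. The sketch below shows generic versions of one-hot encoding, k-mer frequency, and ENAC (nucleotide composition inside a sliding window). The window size, the value of k, and the example sequence are assumptions rather than settings from the paper, and the word-embedding features and the gated convolutional network are not covered.

```python
from itertools import product

ALPHABET = "ACGT"


def one_hot(seq):
    """Four indicator values per base, in A, C, G, T order."""
    return [1 if base == ref else 0 for base in seq for ref in ALPHABET]


def kmer_frequency(seq, k):
    """Relative frequency of every k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers with ambiguous bases such as N
            counts[kmer] += 1
    return [counts[kmer] / total for kmer in kmers]


def enac(seq, window=5):
    """Nucleotide composition inside each sliding window along the sequence."""
    feats = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        feats.extend(chunk.count(base) / window for base in ALPHABET)
    return feats


# Hypothetical example sequence.
seq = "AACGTT"
features = one_hot(seq) + kmer_frequency(seq, 2) + enac(seq, window=5)
print(len(features))
```

Concatenating the three vectors gives one fixed-length statistical feature vector per sequence, the kind of input a random forest can consume.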
A TTN-Associated Immunoprognostic Model Based on LASSO-Cox Regression for Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma
Authors: Tianjin Dai, Peng Chen, Jun Zhang and Bing Wang
Background: TTN mutations are the most common genetic mutations found in cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), and they have been shown to affect its progression and prognosis. TTN mutations may also regulate the immune phenotype of CESC, which could impact its prognosis. Previous studies have demonstrated that CESC patients with TTN mutations had a significantly higher overall survival rate than those with wild-type TTN. However, the impact of TTN mutations on the immune microenvironment of CESC has not been thoroughly investigated.
Methods: This paper examines TTN mutation status and RNA expression in the CESC dataset from TCGA. Two gene features were identified to predict the prognosis of CESC. A CESC Immune Prognosis Model (CIPM) was then developed using LASSO-Cox regression analysis of immune-related genes differentially expressed between TTN wild-type (TTN-WT) and TTN-mutant (TTN-MUT) CESC samples.
Results: The results showed that TTN mutations weaken the immune response in CESC. Of the 152 genes associated with the immune response, 21 displayed different expression levels depending on the presence or absence of TTN mutations.
Conclusion: The study suggests that TTN mutations have an impact on the immune response in CESC. The CIPM was introduced and validated in 232 CESC patients to distinguish between high- and low-risk patients with an unsatisfactory prognosis, regardless of various clinical features.
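A LASSO-Cox prognostic model is typically applied as a linear risk score: each retained gene's expression is weighted by its fitted coefficient, and patients are split into risk groups at a cutoff. The sketch below shows only that scoring and stratification step, assuming a median cutoff. The gene names, coefficients, and expression values are invented for illustration and are not the CIPM's.

```python
from statistics import median

# Hypothetical coefficients for two retained genes (not from the paper).
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25}


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over retained genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients):
    """Score every patient and label those above the median as high risk."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return scores, groups


# Hypothetical expression values for three patients.
PATIENTS = {
    "p1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "p2": {"GENE_A": 0.5, "GENE_B": 2.0},
    "p3": {"GENE_A": 1.0, "GENE_B": 1.0},
}

scores, groups = stratify(PATIENTS, COEFFICIENTS)
print(scores)
print(groups)
```

Fitting the coefficients themselves requires penalized Cox regression on survival data, which this sketch deliberately leaves out.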
