
Current Bioinformatics - Volume 20, Issue 6, 2025


- Recent Progress of Deep Learning Methods for RBP Binding Sites Prediction on circRNA
  
  Authors: Zhengfeng Wang, Xiujuan Lei, Yuchen Zhang, Fang-Xiang Wu and Yi Pan
  
  https://doi.org/10.2174/0115748936308564240712053215
  
  The interaction between circular RNA (circRNA) and RNA-binding proteins (RBPs) plays an important biological role in the occurrence and development of various diseases. High-throughput experimental methods such as CLIP-seq can effectively analyze this interaction, but they are inefficient and expensive, and each experiment captures the binding sites of only one specific RBP on circRNA in a selected cell environment. These experiments also rely on downstream data analysis to understand the mechanisms behind many biological structures and physiological processes, and the rapid growth in the dimensionality and production speed of experimental data poses challenges to traditional analysis methods. In recent years, deep learning has made great progress in genomics and transcriptomics, and several deep learning algorithms for predicting RBP binding sites on circRNA have emerged. In this paper, we briefly introduce the biological background of circRNA-RBP interaction; present the relevant deep learning techniques, including the problem formulation, data sources, sequence encoding, deep learning models, and overall workflow of RBP binding site prediction on circRNA; and analyze current deep learning methods in depth. Finally, we discuss open problems in current research and directions for future work. We hope this review helps researchers without a background in deep learning or biology quickly understand RBP binding site prediction on circRNA.
  

- Multinomial Logistic Regression with Adaptive Regularization for Cancer Subtype Classification via Multi-omics Data
  
  Authors: Yingdi Wu, Fuzhen Cao and Juntao Li
  
  https://doi.org/10.2174/0115748936308171240605075531
  
  Background
  Integrating multi-omics data for cancer classification brings complementary biological insights but also raises challenges in data integration, gene grouping, and adaptive weight construction.
  Objective
  This paper aims to address these challenges in cancer subtype classification and gene screening based on multi-omics data.
  Methods
  Multinomial logistic regression with adaptive regularization (MLRAR) was proposed by integrating DNA methylation, gene mutation, and RNA-seq information. A data preprocessing strategy that effectively utilizes multi-omics information was presented, and the local maximum quasi-clique merging (lmQCM) algorithm was implemented to group genes. Biological pathway information was utilized to evaluate the significance of gene groups, while the significance of each gene within a group was evaluated by integrating mutation information, information theory, and methylation information.
  Results
  Compared to MRlasso, MRGL, MSGL, MROGL, AMRSOGL, and AGLRMR, the proposed method improved subtype classification accuracy for breast cancer by 2.6%, 2.9%, 3.5%, 2.3%, 2.0%, and 1.8%, respectively. For ovarian cancer, MLRAR improved accuracy by 8.2%, 5.0%, 6.8%, 5.2%, 12.7%, and 6.3%, respectively.
  Conclusion
  The proposed method can effectively deal with data integration, gene grouping, and adaptive weight construction.
  

- GenRepAI: Utilizing Artificial Intelligence to Identify Repeats in Genomic Suffix Trees
  
  By Freeson Kaniwa
  
  https://doi.org/10.2174/0115748936303435240702112205
  
  Background
  The human genome is densely populated with repetitive DNA sequences that play crucial roles in genomic function and structure but are also implicated in over 40 human diseases. Identifying and characterizing these repeats is a significant computational challenge because the size and complexity of the genome overwhelm traditional algorithms.
  Methods
  To address these challenges, we propose GenRepAI, a deep learning framework to navigate and analyze genomic suffix trees. GenRepAI employs supervised machine learning classifiers trained on labeled datasets of repeat annotations and unsupervised anomaly detection to identify novel repeat sequences. The models are trained using convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and vision transformers to classify and annotate repeats within the human genome.
  Results
  GenRepAI is designed to comprehensively profile repeats that underlie various neurological diseases, allowing researchers to identify pathogenic expansions. The framework will integrate into existing genomic analysis pipelines, with the capability to screen patient genomes and highlight potential causal variants for further validation.
  Conclusion
  GenRepAI is set to become a foundational tool in genomics, leveraging artificial intelligence to enhance the characterization of repetitive sequences. It promises significant advancements in the molecular diagnosis of repeat expansion disorders and contributes to a deeper understanding of genomic structure and function, with broad applications in personalized medicine.
  

- CNRBind: Small Molecule-RNA Binding Sites Recognition via Site Significant from Nucleotide and Complex Network Information
  
  Authors: Lichao Zhang, Kang Xiao, Xueting Wang and Liang Kong
  
  https://doi.org/10.2174/0115748936296412240625111040
  
  Background
  Small molecule-RNA binding sites play a significant role in developing drugs for disease treatment. However, it is a challenge to propose accurate computational tools for identifying these binding sites.
  Methods
  In this study, an accurate prediction model named CNRBind was constructed by extracting site significant information from nucleotide and complex networks. We designed complex networks and calculated three topological structural parameters according to RNA tertiary structure. Acknowledging nucleotide interdependence, a sliding window was selected to integrate the influence of adjacent sites. Finally, the model was constructed using a random forest classifier.
  Results
  Compared to other computational tools, CNRBind was competitive and had excellent discriminative ability for metal ion-binding site prediction. Furthermore, statistical analysis revealed significant differences between CNRBind and existing methods. Additionally, CNRBind is a promising predictor in cases where an experimental tertiary structure is unavailable.
  Conclusion
  These results show that CNRBind is effective because of the proposed site significant information encoding strategy. The approach provides a reasonable supplement for biological research. The dataset and source code can be accessed at: https://github.com/Kangxiaoneuq/CNRBind.
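
The sliding-window step described in the Methods above can be illustrated with a minimal sketch. It is not taken from the CNRBind code: the function name `window_features`, the zero-padding at the ends of the sequence, and the toy feature values are all assumptions made for illustration. Each site's feature vector is concatenated with those of its neighbours, so that a downstream classifier (such as the random forest mentioned above) sees adjacent sites together.

```python
from typing import List


def window_features(site_features: List[List[float]], half_width: int) -> List[List[float]]:
    """Concatenate each site's features with those of its neighbours.

    Positions that fall outside the sequence are zero-padded
    (an assumption of this sketch, not a detail from the abstract).
    """
    n_sites = len(site_features)
    n_features = len(site_features[0])
    padding = [0.0] * n_features
    windows = []
    for i in range(n_sites):
        row: List[float] = []
        # Walk the window from the leftmost to the rightmost neighbour.
        for j in range(i - half_width, i + half_width + 1):
            row.extend(site_features[j] if 0 <= j < n_sites else padding)
        windows.append(row)
    return windows


# Three nucleotides, each with two hypothetical per-site features.
features = [[1.0, 0.1], [2.0, 0.2], [3.0, 0.3]]
windows = window_features(features, half_width=1)
```

With `half_width=1`, every site yields a vector three times the length of the per-site features; the window size itself would need to be tuned on real data.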
  

- Prediction of miRNA-disease Associations by Deep Matrix Decomposition Method based on Fused Similarity Information
  
  Authors: Xia Chen, Qiang Qu, Xiang Zhang, Hao Nie, Xiuxiu Chao, Weihao Ou, Haowen Chen and Xiangzheng Fu
  
  https://doi.org/10.2174/0115748936300759240712061707
  
  Aim
  MicroRNAs (miRNAs), pivotal regulators in various biological processes, are closely linked to human diseases. This study aims to propose a computational model, SIDMF, for predicting miRNA-disease associations.
  Background
  Computational methods have proven efficient in predicting miRNA-disease associations, leveraging functional similarity and network-based inference. Machine learning techniques, including support vector machines, semi-supervised algorithms, and deep learning models, have gained prominence in this domain.
  Objective
  To develop a computational model that integrates disease semantic similarity and miRNA functional similarity within a deep matrix factorization framework to accurately predict potential associations between miRNAs and diseases.
  Methods
  SIDMF, introduced in this study, integrates disease semantic similarity and miRNA functional similarity within a deep matrix factorization framework. Through the reconstruction of the miRNA-disease association matrix, SIDMF predicts potential associations between miRNAs and diseases.
  Results
  The performance of SIDMF was evaluated using global Leave-One-Out Cross-Validation (LOOCV) and local LOOCV, achieving high Area Under the Curve (AUC) values of 0.9536 and 0.9404, respectively. Comparative analysis against other methods demonstrated the superior performance of SIDMF. Case studies on breast cancer, esophageal cancer, and prostate cancer further validated SIDMF's predictive accuracy, with a substantial percentage of the top 50 predicted miRNAs confirmed in relevant databases.
  Conclusion
  SIDMF emerges as a promising computational model for predicting potential associations between miRNAs and diseases. Its robust performance in global and local evaluations, along with successful case studies, underscores its potential contributions to disease prevention, diagnosis, and treatment.
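
The AUC values reported in the Results above measure how often a true association is ranked above a non-association. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code: the function name `auc` and the toy scores and labels are assumptions made for illustration.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via its pairwise definition.

    Returns the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four miRNA-disease pairs;
# label 1 marks a known association, 0 an unverified pair.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The quadratic loop keeps the definition visible; on large association matrices a rank-based computation would be the usual choice.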
  

- TCM@MPXV: A Resource for Treating Monkeypox Patients in Traditional Chinese Medicine
  
  Authors: Xin Zhang, Feiran Zhou, Pinglu Zhang, Quan Zou and Ying Zhang
  
  https://doi.org/10.2174/0115748936299878240723044438
  
  Introduction
  Traditional Chinese Medicine (TCM) has been extensively employed in the treatment of Monkeypox Virus (MPXV) infections, and it has historically played a significant role in combating contagious pox-like viral diseases in China.
  Methods
  Various traditional Chinese medicine (TCM) therapies have been recommended for patients with monkeypox virus (MPXV). However, as far as we know, there is no comprehensive database dedicated to preserving and coordinating TCM remedies for combating MPXV. To address this gap, we introduce TCM@MPXV, a carefully curated repository of research materials focusing on formulations with anti-MPXV properties. Importantly, TCM@MPXV extends its scope beyond herbal remedies, encompassing mineral-based medicines as well.
  Results
  The current version of TCM@MPXV (1) documents over 42 types of TCM herbs, with more than 27 unique herbs; (2) records over 285 bioactive compounds within these herbs; (3) provides a user-friendly web server for the docking, analysis, and visualization of 2D or 3D molecular structures; and (4) provides 3D structures of druggable MPXV proteins.
  Conclusion
  To summarize, TCM@MPXV presents a user-friendly and effective platform for recording, querying, and viewing anti-MPXV TCM resources and will contribute to the development and explanation of novel anti-MPXV mechanisms of action to aid in the ongoing battle against monkeypox. TCM@MPXV is accessible for academic use at http://101.34.238.132:5000/.
  

- A Parallel Implementation for Large-Scale TSR-based 3D Structural Comparisons of Protein and Amino Acid
  
  Authors: Feng Chen, Tarikul I. Milon, Poorya Khajouie, Antoinette Myers and Wu Xu
  
  https://doi.org/10.2174/0115748936306625240724102438
  
  Background
  Proteins play a vital role in sustaining life, requiring the formation of specific 3D structures to manifest their essential biological functions. Structure comparison techniques are benefiting from the ever-expanding repositories of the Protein Data Bank. The development of computational tools for protein and amino acid 3D structural comparisons plays an important role in understanding protein functions. The Triangular Spatial Relationship (TSR)-based method was developed for this purpose.
  Methods
  A parallelization strategy and actual implementation on high-performance clusters using the distributed and shared memory programming model, along with the utilization of multi-core CPU and many-core GPU accelerators, were developed. 3D structures of proteins and amino acids are represented by an integer vector in the TSR-based method. This parallelization strategy is designed for the TSR-based method for large-scale 3D structural comparisons of proteins and amino acids in this study. It can also be adapted to other applications where a vector type of data structure is used.
  Results
  Because the TSR-based method represents protein and amino acid structures as vectors, the comparison algorithm is well suited to parallelization on large-scale supercomputers. Performance studies on representative datasets were conducted to demonstrate the efficiency of the parallelization strategy. It allows comparisons of large 3D protein or amino acid structure datasets to finish within a reasonable amount of time.
  Conclusion
  The case studies, by taking advantage of this parallelization code, demonstrate that applying either mirror image or feature selection in the TSR-based algorithms improves the classifications of protein and amino acid 3D structures. The TSR keys have the advantage of performing structure-based BLAST searches. The parallelization code could be used as a reference for similar future studies.
  

- Corrigendum to: An Exploratory Review on Recent Computational Approaches Devised for MiRNA Disease Association Prediction
  
  Authors: S. Sujamol, E.R. Vimina and U. Krishnakumar
  
  https://doi.org/10.2174/1574893620999250214162859

Most Cited

- A Review of Ensemble Methods in Bioinformatics
  
  Authors: Pengyi Yang, Yee Hwa Yang, Bing B. Zhou and Albert Y. Zomaya
- Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis
  
  Authors: Masahiro Sugimoto, Masato Kawakami, Martin Robert, Tomoyoshi Soga and Masaru Tomita
- Distance-based Support Vector Machine to Predict DNA N6-methyladenine Modification
  
  Authors: Haoyu Zhang, Quan Zou, Ying Ju, Chenggang Song and Dong Chen
- A Review on the Recent Developments of Sequence-based Protein Feature Extraction Methods
  
  Authors: Jun Zhang and Bin Liu
- Molecular Genetic Markers: Discovery, Applications, Data Storage and Visualisation
  
  Authors: Chris Duran, Nikki Appleby, David Edwards and Jacqueline Batley
- A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization
  
  Authors: Wuritu Yang, Xiao-Juan Zhu, Jian Huang, Hui Ding and Hao Lin
- Cancer Diagnosis Through IsomiR Expression with Machine Learning Method
  
  Authors: Zhijun Liao, Dapeng Li, Xinrui Wang, Lisheng Li and Quan Zou
- Relevance of Molecular Docking Studies in Drug Designing
  
  Authors: Ritu Jakhar, Mehak Dangi, Alka Khichi and Anil K. Chhillar
- The Advances and Challenges of Deep Learning Application in Biological Big Data Processing
  
  Authors: Li Peng, Manman Peng, Bo Liao, Guohua Huang, Weibiao Li and Dingfeng Xie
- Gene Expression Profile Classification: A Review
  
  Authors: Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan