Current Bioinformatics - Current Issue
Volume 20, Issue 10, 2025
-
-
Artificial Intelligence in Diabetes Mellitus Prediction: Advancements and Challenges - A Review
Authors: Rohit Awasthi, Anjali Mahavar, Shraddha Shah, Darshana Patel, Mukti Patel, Drashti Shah and Ashish Patel
Poor dietary habits and a lack of awareness are contributing to a rapid global increase in the number of people with diabetes. A framework that can accurately predict diabetes across large numbers of patients from clinical details is therefore needed. Artificial intelligence (AI) is a rapidly evolving field, and its application to diabetes, a worldwide pandemic, has the potential to revolutionize how this chronic condition is diagnosed and forecast. AI-based algorithms have been developed to support predictive models for the risk of developing diabetes or its complications. In this review, we discuss AI-based diabetes prediction. To date, AI-based prediction of new-onset diabetes has not outperformed traditional, statistically based risk stratification models. Despite this, it is anticipated that in the near future, large quantities of well-organized data and abundant processing power will optimize AI's predictive capabilities, greatly enhancing the accuracy of diabetes prediction models.
-
-
-
NEXT-GEN Medicine: Designing Drugs to Fit Patient Profiles
Authors: Raj Kamal, Diksha, Priyanka Paul, Ankit Awasthi and Amandeep Singh
Personalized medicine, with its focus on tailoring drug formulations to individual patient profiles, has made significant strides in healthcare. The integration of genomics, biomarkers, nanotechnology, 3D printing, and real-time monitoring provides a comprehensive approach to optimizing drug therapies on an individual basis. This review highlights recent advancements in personalized medicine and its applications in various diseases, such as cancer, cardiovascular diseases, diabetes mellitus, and neurodegenerative diseases, and explores how these technologies are integrated in the field. As these technologies continue to evolve, we are entering an era of truly personalized medicine that promises improved treatment outcomes, reduced adverse effects, and a more patient-centric approach to healthcare.
-
-
-
Novel Design for Multi-Epitope Vaccines of COVID-19 and Critical In silico Assessment Steps
Authors: Tian Lan, Shuquan Su, Pengyao Ping and Jinyan Li
Introduction: The coronavirus disease COVID-19, caused by the SARS-CoV-2 virus, was declared a global pandemic in March 2020. The virus has mutated into several widely spread variants, such as Alpha, Beta, Gamma, Delta, and Omicron, and continues to mutate unpredictably.
Methods: A Multi-Epitope Vaccine (MEV) is a type of recombinant vaccine whose sequence contains multiple epitopes, and it is considered an effective way to fight infectious diseases. Previous in silico approaches to MEV construction have been constrained by their inability to predict molecular conformations accurately, which leads to inaccurate property evaluations. In this work, we designed a novel MEV for the future prevention of COVID-19 or similar diseases. We set strict thresholds when screening epitope candidates in order to construct a highly effective MEV, and we used the latest ColabFold (a modified version of AlphaFold2) to predict accurate tertiary structures of the MEV.
Results: We studied epitopes from the main proteins of SARS-CoV-2 (i.e., the envelope, membrane, nucleocapsid, and spike proteins) that can provoke immune responses from B-cells, helper T-cells (Th), and cytotoxic T-cells (CTL), and then combined them through amino acid linkers to construct the MEV. We evaluated the vaccine in terms of its physicochemical properties, population coverage, safety, secondary and tertiary structure, docking-based immune response, and capacity to elicit an immune response.
Conclusion: These in silico assessments demonstrate that our proposed vaccine can elicit effective immune responses, is safe to use, and has high population coverage.
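
The step of combining epitopes through amino acid linkers can be illustrated with a minimal sketch. The abstract does not state which linkers or epitopes were used, so the function name `assemble_mev`, the linker strings, and the placeholder epitopes below are all assumptions chosen for illustration only.

```python
# Hypothetical sketch: join epitope groups into one construct using linkers.
# Linker strings and epitope sequences are placeholders, not from the paper.

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def assemble_mev(groups, group_linker):
    """Build one amino acid string from epitope groups.

    groups: list of (linker, [epitope, ...]); epitopes inside a group are
            joined by that group's linker.
    group_linker: string placed between consecutive groups.
    """
    blocks = []
    for linker, epitopes in groups:
        for epitope in epitopes:
            if not epitope or set(epitope) - VALID_RESIDUES:
                raise ValueError(f"invalid epitope sequence: {epitope!r}")
        blocks.append(linker.join(epitopes))
    return group_linker.join(blocks)


# Placeholder input: two groups of two epitopes each.
construct = assemble_mev(
    [("KK", ["AAAA", "CCCC"]), ("AAY", ["DDDD", "EEEE"])],
    group_linker="GPGPG",
)
print(construct)
```

Keeping the linker as a per-group parameter makes it easy to try different linker choices for B-cell, Th, and CTL epitopes without changing the assembly logic.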
-
-
-
Predicting Molecular Subtypes of Breast Cancer Using Gene Expression Profiling and Random Forest Classifier
Background: Breast cancer (BC) is one of the main causes of cancer-related mortality in women. There are four molecular subtypes of this malignancy, and the efficacy of adjuvant therapy differs between them. Gene expression profiles provide valuable information for patients whose prognosis is not clear from clinical markers and immunohistochemistry.
Objective: In this study, we aim to predict the molecular subtypes of BC from a gene expression dataset of BC patients and normal samples using six well-known ensemble machine-learning techniques.
Methods: Two microarray datasets, GSE45827 and GSE140494, were downloaded from the Gene Expression Omnibus (GEO) database. These datasets comprise 21 normal tissue samples alongside a cohort of primary invasive breast cancer samples (57 basal, 36 HER2, 56 Luminal A, and 66 Luminal B). We used the AdaBoost, Random Forest (RF), Artificial Neural Network (ANN), Naïve Bayes (NB), Classification and Regression Tree (CART), and Linear Discriminant Analysis (LDA) classifiers.
Results: The RF and NB classifiers outperform the other models in predicting the BC subtype. The RF shows superior performance, with accuracy ranging between 0.89 and 1.0, compared with its competitor NB, which has an average accuracy of 0.91. Our approach perfectly discriminates unaffected (normal) cases from carcinoma; in this task, the RF provides perfect prediction with zero errors. Additionally, we used PCA, DHWT low-frequency, and DHWT high-frequency components to perform dimensionality reduction on the numerous gene expression values. With this data reduction, the LDA achieves up to a 95% improvement in performance. Moreover, feature selection yielded the best performance, recorded by the RF with a classification accuracy of 98%.
Conclusion: Overall, we provide a successful framework that leads to shorter computation times and smaller ML models, which is especially useful where memory and time restrictions are crucial.
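
The DHWT-based reduction mentioned in the results can be sketched as follows, assuming DHWT refers to the discrete Haar wavelet transform (the abstract does not expand the acronym). The function name `haar_step` and the padding rule for odd-length input are assumptions; this is one level of the transform, not the authors' pipeline.

```python
import math


def haar_step(values):
    """One level of the discrete Haar wavelet transform.

    Returns (low, high): the low-frequency (scaled pairwise sums) and
    high-frequency (scaled pairwise differences) coefficients. Each output
    has half the length of the input.
    """
    values = list(values)
    if len(values) % 2:
        values.append(values[-1])  # assumption: pad odd input by repeating the last value
    scale = math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return low, high


# Toy expression vector for one sample (made-up numbers).
low, high = haar_step([4.0, 2.0, 6.0, 6.0])
print(low, high)
```

Feeding only `low` or only `high` to a classifier halves the number of features per level; applying `haar_step` again to `low` would reduce it further.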
-
-
-
scADCA: An Anomaly Detection-Based scRNA-seq Dataset Cell Type Annotation Method for Identifying Novel Cells
Authors: Yongle Shi, Yibing Ma, Xiang Chen and Jie Gao
Background: With the rapid evolution of single-cell RNA sequencing (scRNA-seq) technology, the study of cellular heterogeneity in complex tissues has reached an unprecedented resolution. One critical task in this setting is cell-type annotation. However, challenges persist, particularly in annotating novel cell types.
Objective: Current methods rely heavily on well-annotated reference data, using correlation comparisons to determine cell types. However, identifying novel cells remains unstable due to the inherent complexity and heterogeneity of scRNA-seq data and cell types. To address this problem, we propose scADCA, a method based on anomaly detection, for identifying novel cell types and annotating the entire dataset.
Methods: Convolutional modules and fully connected networks are integrated into an autoencoder, which is trained on the reference dataset to obtain reconstruction errors. A threshold based on these errors distinguishes novel cells from known cells in the query dataset. After the novel cells are identified, a multinomial logistic regression model annotates the rest of the dataset.
Results: Using a simulation dataset, three real scRNA-seq pancreatic datasets, and a real scRNA-seq lung cancer cell line dataset, we compare scADCA with six other cell-type annotation methods, demonstrating competitive performance in terms of distinguished accuracy, full accuracy, F1-score, and confusion matrix.
Conclusion: The scADCA method can be further improved and expanded to achieve better performance in cell type annotation, which helps improve the accuracy and reliability of cytology research and promotes the development of single-cell omics.
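
The thresholding step, in which reconstruction errors separate novel from known cells, can be sketched in a few lines. The abstract does not say how the threshold is derived, so the nearest-rank quantile rule, the default of 0.95, and the function names below are assumptions; the error values are made up.

```python
import math


def error_threshold(reference_errors, quantile=0.95):
    """Choose a cutoff from reference reconstruction errors.

    Uses the nearest-rank quantile (an assumed rule, not the paper's).
    """
    if not reference_errors:
        raise ValueError("reference_errors must not be empty")
    ordered = sorted(reference_errors)
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1]


def flag_novel(query_errors, threshold):
    """Mark a query cell as novel when its error exceeds the threshold."""
    return [error > threshold for error in query_errors]


# Made-up reconstruction errors for reference and query cells.
threshold = error_threshold([0.10, 0.20, 0.15, 0.30, 0.25])
print(threshold, flag_novel([0.20, 0.90], threshold))
```

Cells flagged `False` would then be passed to the downstream classifier for annotation, while flagged cells are reported as novel.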
-
-
-
A Method of Enhancing Heterogeneous Graph Representation for Predicting the Associations between lncRNAs and Diseases
Authors: Dengju Yao, Yuehu Wu and Xiaojuan Zhan
Background: Long non-coding RNAs (lncRNAs) are a class of longer RNA strands that lack protein-coding ability. Although they are not translated into proteins, studies have shown that they play essential regulatory roles in cells, regulating gene expression and cellular processes. However, determining the associations between lncRNAs and diseases through biological experiments is both costly and inefficient. Therefore, there is an urgent need for convenient and fast computational methods that predict lncRNA-disease associations (LDAs) more efficiently.
Objective: Predicting disease-associated lncRNAs can help explore the mechanisms of action of lncRNAs in diseases, which is crucial for early intervention and treatment.
Methods: In this paper, we propose an enhanced heterogeneous graph representation method for predicting LDAs, named GCGALDA. GCGALDA first obtains the topological structure features of nodes through a biased random walk. The neighbors of each node are then weighted using an attention mechanism to further mine the semantic associations between nodes in the graph. Next, a graph convolution network (GCN) transfers the neighborhood features of a node to the central node and combines them with the node's own features, so that the final node representation contains both structural information and semantic association information. Finally, the association score between an lncRNA and a disease is obtained by a multilayer perceptron (MLP).
Results: Experiments show that GCGALDA outperforms other advanced models in prediction accuracy on openly accessible databases. In addition, case studies on several human diseases further confirm the predictive ability of GCGALDA.
Conclusion: The proposed GCGALDA model extracts multi-perspective features, such as topology, semantic association, and node attributes, obtains high-quality heterogeneous graph node representations, and effectively improves the performance of LDA prediction.
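
The biased random walk used to capture topological features can be illustrated with a small sketch. The abstract does not specify the bias, so this example assumes a node2vec-style scheme with return parameter `p` and in-out parameter `q`; the function name, the parameters, and the toy lncRNA-disease graph are all hypothetical.

```python
import random


def biased_walk(graph, start, length, p=1.0, q=1.0, seed=0):
    """Generate one second-order biased random walk (node2vec-style, assumed).

    graph: dict mapping each node to a set of neighbouring nodes.
    p: larger values make returning to the previous node less likely.
    q: larger values keep the walk closer to the previous node's neighbourhood.
    """
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(graph.get(current, ()))
        if not neighbours:
            break  # dead end: stop early
        if len(walk) == 1:
            walk.append(rng.choice(neighbours))  # first step is unbiased
            continue
        previous = walk[-2]
        weights = []
        for node in neighbours:
            if node == previous:
                weights.append(1.0 / p)   # step back to the previous node
            elif node in graph.get(previous, ()):
                weights.append(1.0)       # stay near the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Toy bipartite graph: lncRNAs (l1, l2) linked to diseases (d1, d2).
toy_graph = {"l1": {"d1", "d2"}, "l2": {"d2"}, "d1": {"l1"}, "d2": {"l1", "l2"}}
print(biased_walk(toy_graph, "l1", length=5, p=2.0, q=0.5))
```

Walks like these are typically fed to an embedding model to produce the structural node features; the attention and GCN stages described in the abstract are not covered by this sketch.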
-
-
-
GVNNVAE: A Novel Microbe-Drug Association Prediction Model based on an Improved Graph Neural Network and the Variational Auto-Encoder
Authors: Yiming Chen, Zhen Zhang, Xin Liu, Bin Zeng and Lei Wang
Microorganisms play a crucial role in human health and disease. Identifying potential microbe-drug associations is essential for drug discovery and clinical treatment. In this manuscript, we propose a novel prediction model named GVNNVAE, which combines an improved Graph Neural Network (GNN) and the Variational Auto-Encoder (VAE) to infer potential microbe-drug associations. In GVNNVAE, we first establish a heterogeneous microbe-drug network N by integrating multiple similarity metrics of microbes, drugs, and diseases. We then use the improved GNN and the VAE to extract topological and attribute representations, respectively, for the nodes in N. Finally, predicted scores for potential microbe-drug associations are calculated by combining the original attributes of microbes and drugs with these two kinds of newly obtained representations. To evaluate the prediction performance of GVNNVAE, intensive experiments were carried out, and comparative results showed that GVNNVAE achieved an AUC value of 0.9688, outperforming existing state-of-the-art methods. Case studies of known microbes and drugs also confirmed the effectiveness of GVNNVAE, highlighting its potential for predicting latent microbe-drug associations.
-
-
-
A Non-invasive Cell-free DNA Diagnosis Method for Hepatocellular Carcinoma Based on Deep Learning
Authors: Xueyi Li, Wei Zhang and Zhi-Ping Liu
Introduction/Objective: Hepatocellular Carcinoma (HCC) is a major disease that seriously threatens human health. Early screening can significantly improve the five-year survival rate of HCC patients. Cell-free DNA (cfDNA), as a potential carrier of cancer signals in body fluids, can be used for early cancer detection. However, current cfDNA-based methods for early HCC detection require deep sequencing data, which limits their use in routine disease screening. We propose a foundational DNA language model, called CLHCC, that analyzes DNA sequences and methylation patterns to detect HCC at low sequencing depths.
Methods: CLHCC randomly selects 1500 DNA fragments from HCC-specific differentially methylated regions identified by cd-score. The model then one-hot encodes these DNA fragments and feeds the data into a CNN combined with an LSTM neural network for classification.
Results: We tested CLHCC on 2139 target-BS data samples, achieving an accuracy of 84.59% (precision: 83.44%, recall: 81%) under 10-fold cross-validation. This performance is better than that of DNA language models built using a CNN or an LSTM alone.
Conclusion: Our study applies deep learning to analyze DNA sequences in specific methylation regions without the need for complex alignment processes. This provides new theoretical and practical guidance for clinical applications and holds promise for non-invasive early HCC screening via cfDNA.
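
The one-hot encoding of DNA fragments mentioned in the methods can be sketched as below. The column order A, C, G, T, the all-zero row for ambiguous bases such as N, and the function name `one_hot` are assumptions; the abstract does not describe how methylation state is encoded, so this sketch covers the sequence only.

```python
BASES = "ACGT"  # assumed column order


def one_hot(fragment):
    """Encode a DNA string as a list of 4-element rows.

    Each row has a single 1 in the column of its base. Bases outside
    A/C/G/T (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in fragment.upper()]


# A short made-up fragment; lowercase input is accepted.
for row in one_hot("ACgN"):
    print(row)
```

A batch of such matrices, one per fragment, has the fixed row width that convolutional and recurrent layers expect as input.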