Current Bioinformatics - Volume 18, Issue 6, 2023
-
-
Biomarkers Identification of Hepatocellular Carcinoma Based on Multiomics Data Integration and Graph-embedded Deep Neural Network
Authors: Chaokun Yan, Mengyuan Li, Zhihao Suo, Jun Zhang, Jianlin Wang, Ge Zhang, Wenjuan Liang and Huimin Luo
Background: Hepatocellular carcinoma (HCC) is a malignancy with a high mortality rate, and identifying relevant biomarkers of HCC is helpful for early diagnosis and patient care. Although high-dimensional omics data contain intrinsic biomedical information about HCC, integrating and analyzing them effectively to find promising biomarkers of HCC remains an important and difficult problem.
Methods: We present a novel biomarker identification approach, named GEDNN, based on multi-omics data and a graph-embedded deep neural network. To achieve a more comprehensive understanding of HCC, we first collected and normalized three types of HCC-related data: DNA methylation, copy number variation (CNV), and gene expression. ANOVA was adopted to filter out redundant genes. We then measured the connectivity between gene pairs by their Pearson correlation coefficient and constructed a gene graph. Next, a graph-embedded feedforward neural network (DFN) and back-propagation in a convolutional neural network (CNN) were combined to analyze the three types of omics data integratively and obtain importance scores for gene biomarkers.
Results: Extensive experimental results showed that the biomarkers screened by the proposed method were effective in classifying and predicting HCC. Gene analysis further showed that the biomarkers screened by our method were strongly associated with the development of HCC.
Conclusion: In this paper, we propose the GEDNN method to assess the importance of genes for more accurate identification of cancer biomarkers, which facilitates the effective classification of cancers. The proposed method is applied to multi-omics data of HCC, including RNASeq, DNAMeth and CNV, and considers the complementary information between different types of data. We construct a gene graph from Pearson correlation coefficients as additional information for the DFN, thus reducing the importance scores of redundant genes. In addition, the proposed method incorporates back-propagation in the CNN to further obtain the importance of features.
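The gene-graph step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy expression values, the use of the absolute correlation, and the 0.8 cut-off are all assumptions made for illustration, since the abstract does not state how correlations are turned into edges.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return cov / (sx * sy)

def build_gene_graph(expr, threshold=0.8):
    """Connect two genes when |r| >= threshold (threshold is an assumed value)."""
    genes = sorted(expr)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                edges.add((g, h))
    return edges

# Hypothetical expression profiles: one list of sample values per gene.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
toy_edges = build_gene_graph(toy_expr)
```

In this toy input only GENE_A and GENE_B are linked, because their profiles are perfectly correlated while GENE_C correlates weakly with both.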
-
-
-
Detection of Stage-wise Biomarkers in Lung Adenocarcinoma Using Multiplex Analysis
Authors: Athira K, Sunil Kumar P V, Manju M and Gopakumar G
Introduction: Lung cancer is the leading cancer in terms of morbidity and mortality rate, and its prevalence has been steadily increasing worldwide in recent years. An integrated study is necessary to analyse the cascading interrelationships between molecular cell components at multiple levels that result in hidden biological events in cancer.
Methods: Multiplex network modeling is a methodology that can serve as an integrative method for dealing with diverse interactions. Here, we have employed a multiplex framework to model the lung adenocarcinoma (LUAD) network by incorporating co-expression correlations, methylation relations, and protein physical binding interactions as network layers. Hub nodes identified from the multiplex network using centrality measures, including degree, eigenvector, and random walk with a random jump technique, are considered biomarker genes. These stage-wise biomarker genes identified for LUAD are investigated using GO enrichment analysis, pathway analysis, and literature evidence to determine their significance in tumor progression.
Results: The study has identified a set of stage-specific biomarkers in LUAD. The 31 genes identified from the results of multiple centrality analysis can be targeted as novel diagnostic biomarkers in LUAD. Multiple signaling pathways identified here may be considered potential targets of interest.
Conclusion: Based on the analysis results, patients may be identified by their stage of cancer progression, which can aid in treatment decision-making.
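As a rough illustration of ranking hub nodes in a multiplex network, the sketch below sums each gene's degree over all layers and returns the top-ranked genes. It covers only the degree measure, not the eigenvector or random-walk measures named in the abstract; the layer names, gene labels, and the simple summation across layers are assumptions and not the authors' exact procedure.

```python
from collections import defaultdict

def multiplex_degree(layers):
    """Sum each node's degree over all layers of an undirected multiplex network."""
    degree = defaultdict(int)
    for edges in layers.values():
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
    return dict(degree)

def top_hubs(layers, k):
    """Return the k nodes with the highest summed degree (ties broken by name)."""
    degree = multiplex_degree(layers)
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return ranked[:k]

# Hypothetical three-layer network; each layer is a list of undirected edges.
toy_layers = {
    "coexpression": [("G1", "G2"), ("G1", "G3")],
    "methylation": [("G1", "G2"), ("G2", "G4")],
    "ppi": [("G1", "G4")],
}
toy_hubs = top_hubs(toy_layers, 2)
```

A gene that is well connected in several layers, such as G1 here, rises to the top even if it is not the single best-connected node in any one layer.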
-
-
-
Exploring the Hepatotoxicity of Drugs through Machine Learning and Network Toxicological Methods
Authors: Tiantian Tang, Xiaofeng Gan, Li Zhou, Kexue Pu, Hong Wang, Weina Dai, Bo Zhou, Lingyun Mo and Yonghong Zhang
Background: Predicting the drug-induced liver injury (DILI) of chemicals is still a key adverse drug reaction (ADR) problem that urgently needs to be solved in drug development. Developing a novel method with good predictive capability and strong mechanism interpretation remains a focus in exploring DILI.
Objective: With the help of systems biology and network analysis techniques, a class of descriptors that can reflect the influence of drug targets in the pathogenesis of DILI is established. A machine learning model with good predictive capability and strong mechanism interpretation is then developed between these descriptors and DILI toxicity.
Methods: After overlapping the DILI disease module and the drug-target network, we developed novel descriptors according to the number of drug genes with different network overlapped distance parameters. The hepatotoxicity of drugs is predicted based on these novel descriptors and the classical molecular descriptors. The DILI mechanisms of drugs are then interpreted using the important network topological descriptors in the prediction model.
Results: First, we collected targets of drugs and DILI-related genes and developed 5 NT parameters (S, Nds=0, Nds=1, Nds=2, and N2) based on their relationship with a DILI disease module. Hepatotoxicity prediction models were then established between these NT parameters, combined with molecular descriptors, and drugs through machine learning algorithms. We found that the NT parameters, especially Nds=2 and N2, contributed significantly to the model developed from these descriptors within the applicability domain (training set: ACC = 0.71, AUC = 0.76; external set: ACC = 0.79, AUC = 0.83). Then, the DILI mechanisms of acetaminophen (APAP) and gefitinib were explored based on their risk genes related to ds=2. Five APAP targets regulate, within two steps, 26 DILI risk genes involved in the regulation of cell death, whereas gefitinib regulates, within two steps, the risk genes CLDN1, EIF2B4, and AMIGO1, which fall in the biological process of response to oxygen-containing compound. This indicates that different drugs may induce liver injury by regulating different biological functions.
Conclusion: A novel method based on network strategies and machine learning algorithms successfully explored the DILI of drugs. The NT parameters showed advantages in illustrating the DILI mechanism of chemicals according to the relationships between the drug targets and the DILI risk genes in the human interactome. The method can provide novel candidate molecular descriptors for predicting other ADRs or even ADME/T activity.
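One plausible reading of the distance-based descriptors (Nds=0, Nds=1, Nds=2) is a count of drug targets lying at shortest-path distance 0, 1, or 2 from the DILI disease module. The sketch below implements that reading with a multi-source breadth-first search. The interpretation itself, the function names, and the toy interactome are assumptions; the abstract does not define the parameters precisely, and S and N2 are not covered.

```python
from collections import deque

def distance_to_module(graph, module):
    """Multi-source BFS: shortest distance from each reachable node to the module."""
    dist = {g: 0 for g in module if g in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def count_targets_by_distance(graph, module, targets, max_d=2):
    """Count drug targets at distance 0..max_d from the disease module."""
    dist = distance_to_module(graph, module)
    counts = {d: 0 for d in range(max_d + 1)}
    for t in targets:
        d = dist.get(t)
        if d is not None and d <= max_d:
            counts[d] += 1
    return counts

# Hypothetical interactome: a chain A - B - C - D - E as an adjacency list.
toy_graph = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"],
}
toy_counts = count_targets_by_distance(
    toy_graph, module={"A"}, targets=["A", "B", "C", "E"]
)
```

In the toy chain, target E sits four steps from the module and is therefore excluded from all three counts.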
-
-
-
Graph Convolutional Neural Network with Multi-Layer Attention Mechanism for Predicting Potential Microbe-Disease Associations
Authors: Lei Wang, Xiaoyu Yang, Linai Kuang, Zhen Zhang, Bin Zeng and Zhiping Chen
Background: Human microbial communities play an important role in some human physiological processes. However, identifying microbe-disease associations through biological experiments is costly and time-consuming. Hence, computational models that infer latent associations between microbes and diseases are valuable.
Aims: In this manuscript, we aim to design a computational model based on the Graph Convolutional Neural Network with Multi-layer Attention mechanism, called GCNMA, to infer latent microbe-disease associations.
Objective: This study aims to propose a novel computational model based on the Graph Convolutional Neural Network with Multi-layer Attention mechanism, called GCNMA, to detect potential microbe-disease associations.
Methods: In GCNMA, the known microbe-disease association network was first integrated with the microbe-microbe similarity network and the disease-disease similarity network into a heterogeneous network. Subsequently, the graph convolutional neural network was used to extract embedding features of each layer for microbes and diseases respectively. Thereafter, the embedding features of each layer were fused by the multi-layer attention mechanism derived from the graph convolutional neural network, and a bilinear decoder was applied to the fused features to infer possible associations between microbes and diseases.
Results: To evaluate the predictive ability of GCNMA, extensive experiments were conducted and the results were compared with those of eight state-of-the-art methods. Under both 2-fold and 5-fold cross-validation, GCNMA achieved satisfactory prediction performance on two databases, HMDAD and Disbiome. Moreover, case studies on three common diseases (asthma, type 2 diabetes, and inflammatory bowel disease) also verified the effectiveness of GCNMA.
Conclusion: GCNMA outperformed 8 state-of-the-art competitive methods on the benchmarks of both HMDAD and Disbiome.
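The bilinear decoder mentioned in the Methods can be sketched as scoring a microbe-disease pair by z_m^T W z_d and squashing the result with a sigmoid. This is a generic form of a bilinear decoder rather than the authors' exact formulation; the embeddings, the weight matrix, and the sigmoid output are illustrative assumptions.

```python
import math

def bilinear_score(z_m, W, z_d):
    """Association score sigmoid(z_m^T W z_d) for one microbe-disease pair."""
    # W z_d: one dot product per row of the weight matrix.
    wz = [sum(w * d for w, d in zip(row, z_d)) for row in W]
    raw = sum(m * x for m, x in zip(z_m, wz))
    return 1.0 / (1.0 + math.exp(-raw))

# Hypothetical 2-dimensional embeddings and an identity weight matrix.
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_score = bilinear_score([1.0, 0.0], identity, [0.0, 1.0])
```

With an identity weight matrix the raw score reduces to a dot product, so orthogonal embeddings give a score of exactly 0.5; a learned W would let the model weight interactions between embedding dimensions.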
-
-
-
piRSNP: A Database of piRNA-related SNPs and their Effects on Cancer-related piRNA Functions
Authors: Yajun Liu, Aimin Li, Yingda Zhu, Xinchao Pang, Xinhong Hei, Guo Xie and Fang-Xiang Wu
Background: PIWI-interacting RNAs (piRNAs) are small non-coding RNAs that interact with PIWI proteins and play a vital role in safeguarding the genome. Single nucleotide polymorphisms (SNPs) are widely distributed, information-rich variations that are associated with diseases. Various studies have shown that SNPs on piRNAs are related to diseases.
Objective: To create a comprehensive resource on piRNA-related SNPs, we developed a publicly available online database, piRSNP.
Methods: We systematically identified SNPs on human and mouse piRNAs. piRSNP contains 42,967,522 SNPs on 10,773,081 human piRNAs and 29,262,185 SNPs on 16,957,706 mouse piRNAs.
Results: 7,446 SNPs on 519 cancer-related piRNAs and their flanks are investigated. The impacts of 2,512 variations of cancer-related piRNAs on piRNA-mRNA interactions are analyzed.
Conclusion: All these data, together with piRNA expression profiles of 12 cancer types in both tumor and pericarcinomatous tissues, are compiled into piRSNP. piRSNP characterizes human and mouse piRNA-related SNPs comprehensively and could help researchers investigate piRNA functions. Database URL is http://www.ibiomedical.net/piRSNP/.
-
-
-
A Skin Cancer Detector Based on Transfer Learning and Feature Fusion
Authors: Hongguo Cai, Norriza Brinti Hussin, Huihong Lan and Hong Li
Background: Advanced artificial intelligence technologies have developed rapidly and have been applied in many types of applications, especially in the medical field. Cancer is one of the biggest problems in medical sciences. If cancer can be detected and treated early, the possibility of a cure is greatly increased. Malignant skin cancer is one of the cancers with the highest mortality rate, and it cannot be diagnosed in time through doctors' experience alone. Artificial intelligence algorithms can be employed to detect skin cancer at an early stage, for example by determining from skin damage or spots whether a patient has skin cancer.
Objective: We use the real HAM10000 image dataset to analyze and predict skin cancer.
Methods: (1) We introduce a lightweight attention module to discover the relationships between features, and we fine-tune the pre-trained model (i.e., ResNet-50) on the HAM10000 dataset to extract the hidden high-level features from the images; (2) we integrate these high-level features with generic statistical features, and use the SMOTE oversampling technique to augment samples from the minority classes; and (3) we input the augmented samples into the XGBoost model for training and prediction.
Results: The experimental results show that the accuracy, sensitivity, and specificity of the proposed SkinDet (Skin cancer detector based on transfer learning and feature fusion) model reached 98.24%, 97.84%, and 98.13%, respectively. The proposed model has stronger classification capability for the minority classes, such as dermatofibroma and actinic keratoses.
Conclusion: SkinDet contains a lightweight attention module and can extract the hidden high-level features of the images by fine-tuning the pre-trained model on the skin cancer dataset. In particular, SkinDet integrates high-level features with statistical features and augments samples of the minority classes. Importantly, SkinDet can be applied to classify samples into minority classes.
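The SMOTE oversampling step named in the Methods creates a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows only that core idea in plain Python; it is not the library implementation the authors presumably used, and the function name, `k` value, and toy feature vectors are assumptions.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # The k nearest other minority samples by squared Euclidean distance.
    neighbours = sorted(
        (p for p in minority if p is not base),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )[:k]
    other = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, other)]

# Hypothetical minority-class feature vectors.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_synthetic = smote_sample(toy_minority, rng=random.Random(42))
```

Because each synthetic point lies on a segment joining two real minority samples, oversampling stays inside the region the minority class already occupies rather than duplicating existing points.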
-
-
-
ADSVAE: An Adaptive Density-aware Spectral Clustering Method for Multi-omics Data Based on Variational Autoencoder
Authors: Jianping Zhao, Qi Guan, Chunhou Zheng and Qingqing Cao
Introduction: The discovery of tumor subtypes helps to explore tumor pathogenesis, determine the operability of clinical treatment, and improve patient survival. Clustering analysis is increasingly applied to multi-omics data. However, due to the diversity and complexity of multi-omics data, developing a complete clustering algorithm for tumor molecular typing is still challenging.
Methods: In this study, we present an adaptive density-aware spectral clustering method based on a variational autoencoder (ADSVAE). First, ADSVAE learns the underlying spatial information of each omics data type using a variational autoencoder (VAE) based on the Wasserstein distance metric. Second, a similarity matrix is built for each gene set using an adaptive density-aware kernel. Third, tensor product graphs (TPGs) are used to merge different data sources and reduce noise. Finally, ADSVAE employs a spectral clustering algorithm and uses the Gaussian mixture model (GMM) to cluster the final eigenvector matrix and identify cancer subtypes.
Results: We tested ADSVAE on 5 TCGA datasets, and it performed well on all of them in comparison with several advanced multi-omics clustering algorithms. Compared with existing multi-omics clustering algorithms, the variational autoencoder based on the Wasserstein distance measure in ADSVAE can learn the underlying spatial information of each omics data type, which is better suited to learning complex data distributions. The self-tuning density-aware kernel used by ADSVAE enhances the similarity between points that share near neighbors, and data integration and diffusion on the tensor product graph better reduce noise and reveal the underlying structure, improving performance.
Conclusion: Although this paper draws some conclusions on cancer subtype identification, computational approaches to the problem have inherent pitfalls. As research in related fields continues to deepen, clustering-based identification of cancer subtypes from genomic data needs further improvement and refinement.
