Current Bioinformatics - Volume 17, Issue 2, 2022
CRISPR/ Cas9 Off-targets: Computational Analysis of Causes, Prediction, Detection, and Overcoming Strategies
Authors: Roshan K. Roy, Ipsita Debashree, Sonal Srivastava, Narayan Rishi and Ashish Srivastava
CRISPR/Cas9 technology is a highly flexible RNA-guided endonuclease (RGEN) based gene-editing tool that has transformed the field of genomics, gene therapy, and genome/epigenome imaging. Its wide range of applications provides immense scope for understanding as well as manipulating genetic/epigenetic elements. However, the RGEN is prone to off-target mutagenesis that leads to deleterious effects. This review details the molecular and cellular mechanisms underlying the off-target activity, various available detection tools and prediction methodology ranging from sequencing to machine learning approaches, and the strategies to overcome/minimise off-targets. A coherent and concise method increasing target precision would prove indispensable to concrete manipulation and interpretation of genome editing results that can revolutionise therapeutics, including clarity in genome regulatory mechanisms during development.
-
-
-
A Novel Feature Selection Method Based on MRMR and Enhanced Flower Pollination Algorithm for High Dimensional Biomedical Data
Authors: Chaokun Yan, Mengyuan Li, Jingjing Ma, Yi Liao, Huimin Luo, Jianlin Wang and Junwei Luo
Background: The massive amount of biomedical data accumulated in the past decades can be utilized for diagnosing disease. Objective: However, the high dimensionality, small sample sizes, and irrelevant features of data often have a negative influence on the accuracy and speed of disease prediction. Some existing machine learning models cannot capture the patterns on these datasets accurately without utilizing feature selection. Methods: Filter and wrapper are two prevailing feature selection methods. The filter method is fast but has low prediction accuracy, while the wrapper method can obtain high accuracy but has a formidable computation cost. Given the drawbacks of using filter or wrapper individually, a novel feature selection method, called MRMR-EFPATS, is proposed, which hybridizes the filter method Minimum Redundancy Maximum Relevance (MRMR) and a wrapper method based on an improved Flower Pollination Algorithm (FPA). First, MRMR is employed to rank and screen out some important features quickly. These features are further chosen for individual populations following the wrapper method for faster convergence and less computational time. Then, due to its efficiency and flexibility, FPA is adopted to further discover an optimal feature subset. Results: FPA still has some drawbacks, such as a slow convergence rate, inadequacy in searching for new solutions, and a tendency to become trapped in local optima. In our work, an elite strategy is adopted to improve the convergence speed of the FPA. Tabu search and Adaptive Gaussian Mutation are employed to improve the search capability of FPA and escape from local optima. Here, the KNN classifier with 5-fold cross-validation is utilized to evaluate the classification accuracy. Conclusion: Extensive experimental results on six public high dimensional biomedical datasets show that the proposed MRMR-EFPATS has achieved superior performance compared to other state-of-the-art methods.
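The MRMR filter stage described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes discrete feature values, uses the common "relevance minus mean redundancy" form of the criterion (the abstract does not say which MRMR variant the paper uses), and the feature names g1, g1_dup and g2 are hypothetical toy data. The wrapper stage (FPA, tabu search, Gaussian mutation) is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy.

    features: dict mapping a feature name to its list of discrete values.
    This difference criterion is an assumption, not taken from the paper.
    """
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    remaining = list(features)  # a list keeps tie-breaking deterministic
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: g1_dup duplicates g1, so it is relevant but redundant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 0],
    "g1_dup": [0, 0, 0, 0, 1, 1, 1, 0],
    "g2": [0, 1, 0, 0, 1, 0, 1, 1],
}
print(mrmr_rank(features, labels, 2))
```

On this toy data the duplicate feature is skipped in the second round even though its relevance equals that of g1, which is the behaviour that motivates using MRMR as a fast pre-screening step before a costlier wrapper search.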
-
-
-
Heterogeneous Gene Expression Cross-Evaluation of Robust Biomarkers Using Machine Learning Techniques Applied to Lung Cancer
Background: Nowadays, gene expression analysis is one of the most promising pillars for understanding and uncovering the mechanisms underlying the development and spread of cancer. In this sense, Next Generation Sequencing technologies, such as RNA-Seq, are currently leading the market due to their precision and cost. Nevertheless, there is still an enormous amount of non-analyzed data obtained from older technologies, such as Microarray, which could still be useful to extract relevant knowledge. Methods: Throughout this research, a complete machine learning methodology to cross-evaluate the compatibility between the RNA-Seq and Microarray technologies is described and implemented. In order to show a real application of the designed pipeline, a lung cancer case study is addressed by considering two detected subtypes: adenocarcinoma and squamous cell carcinoma. Transcriptomic datasets considered for our study have been obtained from the public repositories NCBI/GEO, ArrayExpress and GDC-Portal. From them, several gene experiments have been carried out with the aim of finding gene signatures for these lung cancer subtypes, linked to both transcriptomic technologies. Using the selected differentially expressed genes (DEGs), intelligent predictive models capable of classifying new samples belonging to these cancer subtypes have been developed. Results: The predictive models built using one technology are capable of discerning samples from a different technology. The classification results are evaluated in terms of accuracy, F1-score and ROC curves along with AUC. Finally, the biological information of the gene sets obtained and their relationship with lung cancer are reviewed, finding strong biological evidence linking them to the disease. Conclusion: Our method has the capability of finding strong gene signatures which are also independent of the transcriptomic technology used to develop the analysis. In addition, our article highlights the potential of using heterogeneous transcriptomic data to increase the number of samples for the studies, increasing the statistical significance of the results.
-
-
-
Cervical Cancer Metastasis and Recurrence Risk Prediction Based on Deep Convolutional Neural Network
Authors: Zixuan Ye, Yunxiang Zhang, Yuebin Liang, Jidong Lang, Xiaoli Zhang, Guoliang Zang, Dawei Yuan, Geng Tian, Mansheng Xiao and Jialiang Yang
Background: Evaluating the risk of metastasis and recurrence of a cervical cancer patient is critical for appropriate adjuvant therapy. However, current risk assessment models usually involve the testing of tens to thousands of genes from patients’ tissue samples, which is expensive and time-consuming. Therefore, computer-aided diagnosis and prognosis prediction based on Hematoxylin and Eosin (H&E) pathological images have received much attention recently. Objective: The prognosis of whether patients will have metastasis and recurrence can support accurate treatment for patients in advance and help reduce patient loss. It is also important for guiding treatment after surgery to be able to quickly and accurately predict the risk of metastasis and recurrence of a cervical cancer patient. Methods: To address this problem, we propose a hybrid method. Transfer learning is used to extract features, and it is combined with traditional machine learning in order to analyze and determine whether patients have the risks of metastasis and recurrence. First, the proposed model retrieved relevant patches using a color-based method from H&E pathological images, which were then subjected to image preprocessing steps such as image normalization and color homogenization. Based on the labeled patched images, the Xception model with good classification performance was selected, and deep features of patched pathological images were automatically extracted with transfer learning. After that, the extracted features were combined to train a random forest model to predict the label of a new patched image. Finally, a majority voting method was developed to predict the metastasis and recurrence risk of a patient based on the predictions of patched images from the whole-slide H&E image. Results: In our experiment, the proposed model yielded an area under the receiver operating characteristic curve of 0.82 for the whole-slide image. The experimental results showed that the high-level features extracted by the deep convolutional neural network from the whole-slide image can be used to predict the risk of recurrence and metastasis after surgical resection and help identify patients who might receive additional benefit from adjuvant therapy. Conclusion: This paper explored the feasibility of predicting the risk of metastasis and recurrence from cervical cancer whole-slide H&E images through deep learning and random forest methods.
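The final aggregation step mentioned in this abstract, majority voting over patch-level predictions, can be sketched in a few lines. This is an illustrative sketch only: the function name, the 0/1 label encoding (1 meaning higher risk) and the rule that ties go to the higher-risk label are assumptions, since the abstract does not describe these details.

```python
from collections import Counter


def slide_level_prediction(patch_labels):
    """Aggregate per-patch labels into one slide-level label by majority vote.

    Ties are resolved toward the larger label (assumed here to mean higher risk);
    this tie rule is an assumption, not taken from the paper.
    """
    if not patch_labels:
        raise ValueError("at least one patch prediction is required")
    counts = Counter(patch_labels)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return max(winners)


# Hypothetical patch-level outputs of a patch classifier for one slide.
patch_predictions = [1, 0, 1, 1, 0]
print(slide_level_prediction(patch_predictions))
```

A vote over many patches makes the slide-level call less sensitive to a few misclassified patches than relying on any single patch would be.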
-
-
-
TP-MV: Therapeutic Peptides Prediction by Multi-view Learning
Authors: Ke Yan, Hongwu Lv, Jie Wen, Yichen Guo and Bin Liu
Background: Therapeutic peptide prediction is critical for drug development and therapy. Researchers have been studying this essential task, developing several computational methods to identify different therapeutic peptide types. Objective: Most predictors are specific to certain peptide types. Currently, developing methods to predict the presence of multiple peptides remains a challenging problem. Moreover, it is still challenging to combine different features to make the therapeutic prediction. Methods: In this paper, we proposed a new ensemble method TP-MV for general therapeutic peptide recognition. TP-MV is developed using the stacking framework in conjunction with the KNN, SVM, ET, RF, and XGB. Then TP-MV constructs a multi-view learning model as meta-classifiers to extract the discriminative feature for different peptides. Results: In the experiment, the proposed method outperforms the other existing methods on the benchmark datasets, indicating that the proposed method has the ability to predict multiple therapeutic peptides simultaneously. Conclusion: The TP-MV is a useful tool for predicting therapeutic peptides.
-
-
-
iAnt: Combination of Convolutional Neural Network and Random Forest Models Using PSSM and BERT Features to Identify Antioxidant Proteins
Authors: Hoang V. Tran and Quang H. Nguyen
Background: Reactive Oxygen Species (ROS) play many roles in the body, such as cell signaling, homeostasis, or protection from harmful bacteria. However, an excess of ROS in the body will damage lipids, proteins, and DNA. Many studies have shown that various environmental factors increase the amount of ROS produced in the body. Antioxidant proteins are responsible for neutralizing these ROS or free radicals. Although the amount of data on protein sequences has increased over the last two decades, we still lack bioinformatics tools to be able to accurately identify antioxidant protein sequences. Furthermore, biochemical methods to determine antioxidant proteins are very expensive and time-consuming. Therefore, a machine learning approach must be used to speed up the computation. Methods: In this study, we propose a new method that combines a convolutional neural network and Random Forest using two features, the normalized PSSM and the best-selected feature of the ProtBert output. Results: Our model gave very good results on the independent test dataset with 97.3% sensitivity and 95.9% specificity. Comparison with current state-of-the-art models shows that our model is superior. We have also made iAnt available as a website with a friendly interface at http://antixiodant.nguyenhongquang.edu.vn. Conclusion: iAnt has been developed to accurately identify antioxidant proteins. It outperforms the existing state-of-the-art methods and is also available online.
-
-
-
Immune-related Gene-based Prognostic Signature for the Risk Stratification Analysis of Breast Cancer
Authors: Dongqing Su, Qianzi Lu, Yi Pan, Yao Yu, Shiyuan Wang, Yongchun Zuo and Lei Yang
Background: Breast cancer has plagued women for many years and caused many deaths around the world. Methods: In this study, based on the weighted correlation network analysis, univariate Cox regression analysis, and least absolute shrinkage and selection operator, 12 immune-related genes were selected to construct the risk score for breast cancer patients. The multivariable Cox regression analysis, gene set enrichment analysis, and nomogram were also conducted in this study. Results: Good results were obtained in the survival analysis, enrichment analysis, multivariable Cox regression analysis and immune-related feature analysis. When the risk score model was applied in 22 breast cancer cohorts, the univariate Cox regression analysis demonstrated that the risk score model was significantly associated with overall survival in most of the breast cancer cohorts. Conclusion: Based on these results, we conclude that the proposed risk score model may be a promising method and may improve the treatment stratification of breast cancer patients in future work.
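A gene-based risk score of the kind described in this abstract is commonly computed as a weighted sum of expression values, with patients then split into risk groups. The sketch below assumes that form; the abstract does not give the formula, the coefficients, or the cutoff. The gene names, coefficients, patient values and the median cutoff are all hypothetical and do not come from the paper.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes (assumed form)."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-gene signature; a real signature would use fitted coefficients.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "p1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "p2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "p3": {"GENE_A": 1.0, "GENE_B": 8.0},
    "p4": {"GENE_A": 6.0, "GENE_B": 4.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

The resulting high and low groups are what a survival analysis would then compare, for example with Kaplan-Meier curves or a Cox model.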
-
-
-
Identification of Key Histone Modifications and Hub Genes for Colorectal Cancer Metastasis
Authors: Yuan-Yuan Zhai, Qian-Zhong Li, Ying-Li Chen and Lu-Qiang Zhang
Background: Epithelial-Mesenchymal Transition (EMT) and its reverse, Mesenchymal-Epithelial Transition (MET), are essential for tumor cell metastasis. However, the effect of epigenetic modifications on this transition is unclear. Objective: We aimed to explore the key histone modifications and hub genes of EMT/MET during Colorectal Cancer (CRC) metastasis. Methods: The differentially expressed genes and differentially histone modified genes were identified. Based on the histone modification features, the up- and down-regulated genes were predicted by the Random Forest algorithm. Through protein-protein interaction network and Cytoscape analysis, the hub genes with histone modification changes were selected. GO, KEGG and survival analyses were performed to confirm the importance of the hub genes. Results: It was found that H3K79me3 plays an important role in EMT/MET. The regions 200-300 bp and 400-500 bp downstream of the TSS may be the key regulatory regions of H3K79me3. Moreover, we found that the expression of the hub genes was down-regulated in EMT and then up-regulated in MET. The changes in hub gene expression were consistent with the changes of the H3K79me3 signal in the specific regions of the genome. Finally, the hub genes KRT8 and KRT18 were involved in the metastasis process and were significantly related to the survival time. Conclusion: H3K79me3 may be crucial for EMT/MET, and the hub genes KRT8 and KRT18 may be the key genes in this process.
