Skip to content
2000
Volume 16, Issue 3
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: With the rapid development of the sequencing methods in recent years, binding sites have been systematically identified in such projects as Nested-MICA and MEME. Prediction of DNA motifs with higher accuracy and precision has been a very important task for bioinformaticians. Nevertheless, experimental approaches are still time-consuming for big data set, making computational identification of binding sites indispensable. Objective: To facilitate the identification of the binding site, we proposed a deep learning architecture, named Deep-BSC (Deep-Learning Binary Search Classification), to predict binding sites in a raw DNA sequence with more precision and accuracy. Methods: Our proposed architecture purely relies on the raw DNA sequence to predict the binding sites for protein by using a convolutional neural network (CNN). We trained our deep learning model on binding sites at the nucleotide level. DNA sequence of A. thaliana is used in this study because it is a model plant. Results: The results demonstrate the effectiveness and efficiency of our method in the classification of binding sites against random sequences, using deep learning. We construct a CNN with different layers and filters to show the usefulness of max-pooling technique in the proposed method. To gain the interpretability of our approach, we further visualized binding sites in the saliency map and successfully identified similar motifs in the raw sequence. The proposed computational framework is time and resource efficient. Conclusion: Deep-BSC enables the identification of binding sites in the DNA sequences via a highly accurate CNN. The proposed computational framework can also be applied to problems such as operator, repeats in the genome, DNA markers, and recognition sites for enzymes, thereby promoting the use of Deep-BSC method in life sciences.

Loading

Article metrics loading...

/content/journals/cbio/10.2174/1574893615999200707142852
2021-03-01
2025-10-25
Loading full text...

Full text loading...

/content/journals/cbio/10.2174/1574893615999200707142852
Loading
This is a required field
Please enter a valid email address
Approval was a Success
Invalid data
An Error Occurred
Approval was partially successful, following selected items could not be processed due to error
Please enter a valid_number test