DecoyFinderNetAna: Application of Graph Convolution Neural Networks for Accurate Classification of True Small Molecule Binders from their Decoys

Abstract

Introduction

The drug discovery pipeline faces significant challenges, often requiring extensive screening to differentiate between true ligands and decoys. Computational biology techniques, particularly those involving deep learning, have increasingly been employed to address these issues. This study introduces and evaluates DecoyFinderNetAna, a Graph Convolutional Neural Network (GCNN)-based approach designed to distinguish true ligands from decoys in compound libraries, aiming to improve early-stage virtual screening accuracy.

Methods

Chemical compounds were represented as molecular graphs with associated mathematical and chemical features. These features served as input to a GCNN classifier trained to distinguish active ligands from decoys. The model was trained on 85 protein targets sourced from the DUD-E database. Additionally, a case study involving Mycobacterium tuberculosis thymidylate kinase was conducted using DecoyFinderNetAna, followed by validation with molecular docking and Molecular Dynamics (MD) simulations.
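As an illustration of the graph representation described above, the following is a minimal sketch of one graph convolution step on a toy molecule, written in plain Python. It is not the authors' implementation: the three-atom graph, the one-hot atom features, the weight matrix, and all function names are assumptions chosen for readability, and the layer uses the widely used symmetric normalization ReLU(D^-1/2 (A + I) D^-1/2 H W), which may differ from the architecture used in the study.

```python
import math

def normalized_adjacency(n_atoms, bonds):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected molecular graph."""
    # Start from the identity so every atom has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(n_atoms)] for i in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]

def matmul(x, y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    out = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(node_embeddings):
    """Average node embeddings into a single graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]

# Hypothetical toy molecule: heavy atoms of ethanol, C-C-O.
bonds = [(0, 1), (1, 2)]
# Hypothetical atom features: one-hot [is_carbon, is_oxygen].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Illustrative weights; a trained model would learn these.
weights = [[0.5, -0.2], [0.3, 0.8]]

a_hat = normalized_adjacency(3, bonds)
graph_vector = mean_pool(gcn_layer(a_hat, features, weights))
```

In a full classifier, a graph-level vector such as `graph_vector` would typically be passed through further layers to produce a ligand-versus-decoy score.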

Results

The model achieved an average sensitivity of 0.973, specificity of 0.993, and an average area under the curve (AUC) of 0.983 across 102 protein targets. Precision and recall metrics were also highly promising. In the case study, the predicted true binder exhibited a significantly stronger binding affinity (ΔG = –28.01 kcal/mol) than the decoy, with a difference of 9.5 kcal/mol, validating the model’s predictive accuracy.
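For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are commonly computed for a binary ligand-versus-decoy classifier. The labels, scores, and 0.5 threshold are made-up illustrative values, not data from the study, and the function names are hypothetical. AUC is computed here as the fraction of (ligand, decoy) pairs in which the ligand receives the higher score, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = ligand, 0 = decoy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def roc_auc(y_true, scores):
    """Pairwise AUC: chance that a ligand outscores a decoy."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels and classifier scores for eight compounds.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)  # fraction of true ligands recovered
specificity = tn / (tn + fp)  # fraction of decoys rejected
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.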

Discussion

The results emphasize the power of Machine Learning (ML), particularly GCNNs, to enhance early-stage virtual screening by rapidly and accurately filtering out decoys, reducing computational costs, and prioritizing compounds for experimental validation. DecoyFinderNetAna thus represents a scalable and time-efficient alternative to conventional physics-based methods, with potential to significantly accelerate drug discovery workflows.

Conclusion

DecoyFinderNetAna demonstrates strong potential as a reliable early-stage screening tool in the drug discovery pipeline. By accurately eliminating decoys prior to docking, it streamlines the workflow and enhances the precision of downstream validation processes.

/content/journals/cad/10.2174/0115734099429333260113070143
2026-02-23
2026-03-05


Supplements

Supplementary material is available on the publisher's website along with the published article.
