The drug discovery pipeline faces significant challenges, often requiring extensive screening to differentiate between true ligands and decoys. Computational biology techniques, particularly those involving deep learning, have increasingly been employed to address these issues. This study introduces and evaluates DecoyFinderNetAna, a Graph Convolutional Neural Network (GCNN)-based approach designed to distinguish true ligands from decoys in compound libraries, aiming to improve early-stage virtual screening accuracy.
Chemical compounds were represented as molecular graphs with associated mathematical and chemical features. These features served as input to a GCNN classifier trained to predict active ligands versus decoys. The model was trained on 85 protein targets sourced from the DUD-E database. Additionally, a case study involving Mycobacterium tuberculosis Thymidylate kinase was conducted using DecoyFinderNetAna, followed by validation via molecular docking and Molecular Dynamics (MD) simulations.
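To make the graph-based classification concrete, the following is a minimal sketch of a single graph convolution followed by mean pooling and a sigmoid output, written in plain Python. It is not the DecoyFinderNetAna architecture, which is not specified here. It assumes the widely used symmetric-normalized propagation rule (adjacency with self-loops, scaled by node degree). The toy molecule, the one-hot atom features, and the names `weights` and `readout` with their untrained values are hypothetical and chosen only for illustration.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features during aggregation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(norm_adj, features, weights):
    """One graph convolution: ReLU(norm_adj @ features @ weights)."""
    h = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in h]


def predict_active_probability(adj, features, weights, readout):
    """Score a molecule: graph convolution, mean pooling, then sigmoid."""
    h = gcn_layer(normalized_adjacency(adj), features, weights)
    pooled = [sum(col) / len(h) for col in zip(*h)]  # average over atoms
    logit = sum(p * w for p, w in zip(pooled, readout))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical three-atom chain (heavy atoms C-C-O).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# Hypothetical one-hot atom features: [is_carbon, is_oxygen].
features = [[1.0, 0.0],
            [1.0, 0.0],
            [0.0, 1.0]]
# Untrained, illustrative parameters (assumptions, not fitted values).
weights = [[0.5, -0.2],
           [0.1, 0.3]]
readout = [1.0, -1.0]

prob = predict_active_probability(adj, features, weights, readout)
print(f"Predicted probability of being an active ligand: {prob:.3f}")
```

In a trained classifier, `weights` and `readout` would be fitted on labeled actives and decoys, and several such layers would typically be stacked before pooling.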
The model achieved an average sensitivity of 0.973, an average specificity of 0.993, and an average area under the curve (AUC) of 0.983 across 102 protein targets. Precision and recall metrics were also highly promising. In the case study, the predicted true binder showed a substantially more favorable binding free energy (ΔG = –28.01 kcal/mol) than the decoy, a difference of 9.5 kcal/mol, which is consistent with the model’s prediction.
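For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and AUC can be computed for one target from binary labels (1 = active, 0 = decoy) and classifier scores. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data from this study. The AUC is computed with the pairwise-ranking definition: the fraction of active/decoy pairs in which the active receives the higher score, with ties counted as one half.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Count TP, FP, TN, FN when scores >= threshold are called active."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_active = s >= threshold
        if y == 1 and predicted_active:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_active:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of true actives that were recovered (recall)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of decoys that were correctly rejected."""
    return tn / (tn + fp)


def roc_auc(labels, scores):
    """Probability that a random active outscores a random decoy."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    decoys = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if a > d else 0.5 if a == d else 0.0
               for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))


# Hypothetical scores for one target: 3 actives followed by 5 decoys.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.30, 0.20, 0.10, 0.05]

tp, fp, tn, fn = confusion_counts(labels, scores)
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
print(f"AUC         = {roc_auc(labels, scores):.3f}")
```

Averages such as those reported above would be obtained by computing each metric per target and taking the mean over targets.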
The results emphasize the power of Machine Learning (ML), particularly GCNNs, to enhance early-stage virtual screening by rapidly and accurately filtering out decoys, reducing computational costs, and prioritizing compounds for experimental validation. DecoyFinderNetAna thus represents a scalable and time-efficient alternative to conventional physics-based methods, with potential to significantly accelerate drug discovery workflows.
DecoyFinderNetAna demonstrates strong potential as a reliable early-stage screening tool in the drug discovery pipeline. By accurately eliminating decoys prior to docking, it streamlines the workflow and enhances the precision of downstream in silico validation processes.