Volume 21, Issue 4
  • ISSN: 1573-4099
  • E-ISSN: 1875-6697

Abstract

Background

Virtual screening (VS) is essential for analyzing potential drug candidates in drug discovery. Often, this involves the conversion of large volumes of compound data into specific formats suitable for computational analysis. Managing and processing this wealth of information, especially when dealing with vast numbers of compounds in various forms, such as names, identifiers, or SMILES strings, can present significant logistical and technical challenges.

Methods

To streamline this process, we developed PyComp, a software tool with a graphical interface built on Python's PyQt5 library and compiled into a standalone executable with PyInstaller. PyComp gives users a systematic way to retrieve a list of compounds specified by name, ID (including ID ranges), or SMILES string and convert them into the desired 3D format.
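
The ID-range input mentioned above can be illustrated with a minimal sketch. It is not PyComp's actual code: the function names `expand_ids` and `sdf_3d_url` are hypothetical, and the use of the PubChem PUG REST service as the data source is an assumption, since the abstract does not name the database PyComp queries. The sketch only expands an ID specification and builds request URLs for 3D SDF records; it performs no network access.

```python
from urllib.parse import quote

# Assumption: PubChem PUG REST is used as the source of 3D records.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"


def expand_ids(spec: str) -> list[int]:
    """Expand a spec such as '2244, 3670-3672' into sorted, unique integer IDs."""
    ids = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if "-" in token:
            start, end = (int(part) for part in token.split("-", 1))
            if start > end:
                start, end = end, start  # accept reversed ranges
            ids.update(range(start, end + 1))
        else:
            ids.add(int(token))
    return sorted(ids)


def sdf_3d_url(identifier, namespace: str = "cid") -> str:
    """Build a URL requesting a 3D SDF record by compound ID or by name."""
    if namespace not in ("cid", "name"):
        raise ValueError("this sketch only handles 'cid' and 'name'")
    encoded = quote(str(identifier), safe="")
    return f"{PUG_REST}/{namespace}/{encoded}/SDF?record_type=3d"


if __name__ == "__main__":
    for cid in expand_ids("2244, 3670-3672"):
        print(sdf_3d_url(cid))
```

Each URL could then be fetched and saved as an `.sdf` file. A real tool would also need rate limiting, error handling for compounds without a 3D record, and SMILES input, which this sketch leaves out.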

Results

PyComp greatly enhances the efficiency of the data extraction, conversion, and storage steps involved in VS. Its ability to search for similar compounds, coupled with its handling of misidentified compounds, gives users an easy-to-use, customizable tool for managing large-scale compound data. By streamlining these operations, PyComp saves researchers significant time and effort, thus accelerating the pace of drug discovery research.

Conclusion

PyComp addresses one of the most pressing challenges in high-throughput VS: the efficient management and conversion of large volumes of compound data. As a user-friendly, customizable software tool, PyComp can improve the efficiency of large-scale drug screening efforts and help speed the discovery of potential therapeutic compounds.

/content/journals/cad/10.2174/0115734099274495231218150611
2024-01-08
2025-10-03
