Volume 18, Issue 7
  • ISSN: 2352-0965
  • E-ISSN: 2352-0973

Abstract

Background

The prevalence of diabetes has been rising in recent years, and prior research has shown Machine Learning Techniques (MLTs) to be useful tools for predicting the disease. Early prediction is necessary to improve medical outcomes and prevent onset. This research examines the accuracy of five MLTs for predicting diabetes from lifestyle data obtained from the UCI (University of California, Irvine) repository, and proposes a new framework for the early detection of diabetes based on lifestyle factors. The five MLTs, namely Logistic Regression (LR), Decision Tree Classification (DTC), Random Forest Classification (RFC), Support Vector Classification (SVC), and K-Nearest Classification (KNC), have been evaluated with tenfold cross-validation, and the results obtained from the different techniques have been verified. Among these classification techniques, LR achieved the highest scores: an accuracy of 93%, a precision of 92%, a recall of 94%, an F1 score of 93%, and a weighted average of 90%. The proposed framework can be used by the healthcare sector to predict diabetes early. It can also be applied to datasets from other sectors that share diabetes-related data.
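The tenfold cross-validation mentioned above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `stratified_k_folds` and `cross_validation_splits` are hypothetical, stratification by class is an assumption (the abstract only says "tenfold"), and the label counts are placeholders.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped,
        # so that fold sizes stay balanced across classes.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per held-out fold."""
    folds = stratified_k_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        )
        yield train, test


# Hypothetical labels for illustration only: 520 records, 320 positive, 200 negative.
labels = [1] * 320 + [0] * 200
splits = list(cross_validation_splits(labels, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each record appears in exactly one test fold, so every model is scored on data it did not see during training, and the ten scores can then be averaged.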

Methods

In this paper, the proposed framework has been used to predict Diabetes Mellitus (DM) in the healthcare system, to diagnose various ailments, and to assess how well the machine learning algorithms perform. The proposed system uses MLTs to classify DM, and the resulting intelligent framework covers the full workflow from data input to output. The five algorithms, Logistic Regression (LR), Decision Tree Classification (DTC), Random Forest Classification (RFC), Support Vector Classification (SVC), and K-Nearest Classification (KNC), have been compared in terms of accuracy, precision, recall, and F1 score.
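The four comparison metrics can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the function name `classification_metrics` and the example labels are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical true labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision penalises false alarms, recall penalises missed cases, and F1 is their harmonic mean. Reporting all three alongside accuracy matters in a screening setting, where a missed diabetic patient and a false alarm have different costs.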

Results

Results have been obtained from the experimental setting using MLTs for DM prediction based on lifestyle predictors. Descriptive statistics of the lifestyle characteristics have been reported with their corresponding metrics, such as the mean, standard deviation, minimum, and maximum. For instance, the count, mean, standard deviation, minimum, 25th, 50th, and 75th percentiles, and maximum of the age parameter were 520.0, 48.02, 12.151, 16.0, 39.0, 47.5, 57.0, and 90.0, respectively. Feature engineering is crucial to the process of constructing an MLT model. Insignificant or incorrect features may have a negative impact on model performance, whereas careful feature selection drastically reduces training time and increases accuracy. In machine learning frameworks, feature selection strategies include filter, wrapper, embedded, and hybrid techniques. An alarming number of people around the world suffer from the chronic and dangerous disease of diabetes. In this research work, MLTs have been used for early DM prediction based on biological variables. Data on patients' lifestyles have been thoroughly examined in order to create a framework. Canonical-correlation Analysis (CCA) has been used to select the ideal combination of lifestyle features. Finally, the five machine learning techniques have been applied with 10-fold cross-validation for the prediction of the disease.
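The eight summary values quoted for age (count, mean, standard deviation, minimum, three quartiles, and maximum) correspond to a standard descriptive summary. A minimal sketch using Python's `statistics` module follows. The `describe` helper and the sample ages are hypothetical, and linear interpolation of quartiles (`method="inclusive"`) is an assumption, since the abstract does not say how the percentiles were computed.

```python
import statistics


def describe(values):
    """Summarise a numeric column: count, mean, sample std, min, quartiles, max."""
    ordered = sorted(values)
    # "inclusive" interpolates linearly between the observed data points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "std": statistics.stdev(ordered),  # sample standard deviation (n - 1)
        "min": ordered[0],
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": ordered[-1],
    }


# Hypothetical ages for illustration only; these are not the study data.
ages = [16, 39, 47, 48, 57, 90]
summary = describe(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reading the summary this way makes it clear that the first value reported for age, 520.0, is the number of records rather than a statistic of the ages themselves.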

Conclusion

To our knowledge, this is the first time a framework has been proposed that yields prediction results substantially better than those of earlier research. The results obtained in this work have been found to be accurate and reliable according to the evaluation metrics.

/content/journals/raeeng/10.2174/0123520965291435240508111712
2024-07-04
2025-09-03