DOI: 10.3390/biomedicines13010124 ISSN: 2227-9059

Diabetes Prediction Through Linkage of Causal Discovery and Inference Model with Machine Learning Models

Mi Jin Noh, Yang Sok Kim

Background/Objectives: Diabetes is a dangerous disease accompanied by various complications, including cardiovascular disease. As the number of people with diabetes worldwide continues to increase, it is crucial to identify its causes. We therefore predicted diabetes using AI models and quantitatively examined causal relationships using a causal discovery and inference model. Methods: A Kaggle dataset from the National Institute of Diabetes and Digestive and Kidney Diseases was analyzed using logistic regression, deep learning, gradient boosting, and decision trees. Causal discovery techniques, such as LiNGAM, were employed to infer relationships between variables. Results: The models achieved high accuracy, with logistic regression at 84.84% and deep learning at 84.83%. The causal model highlighted physical activity, difficulty in walking, and heavy drinking as direct contributors to diabetes. Conclusions: By combining AI with causal inference, this study provides both predictive performance and insight into the factors affecting diabetes, paving the way for tailored interventions.
