The Journal of Bucharest College of Physicians and the Romanian Academy of Medical Sciences

Prediction of Type 2 Diabetes Mellitus Using Soft Computing


Background: Type 2 diabetes mellitus (DM) is another pandemic of the 21st century, and its control is of immense importance. Researchers have developed many prediction models using soft computing techniques. The present study developed a prediction model for type 2 DM using machine learning classifiers. The analysis excludes plasma glucose concentration and insulin concentration as predictors in order to explore the relationship between diabetic status and the remaining predictors.
Methods: This cross-sectional study enrolled 108 participants aged 25 to 67 years from SMS Medical College, Jaipur (Rajasthan, India), after approval from the ethics committee. Prediction models were built with machine learning classifiers, including decision trees, support vector machines, K-nearest neighbors, and ensemble learning classifiers. A total of 25 predictors were collected and underwent feature reduction. The response variable had three levels: diabetes mellitus, prediabetes, and no diabetes mellitus. The models were run using three predictors and the response variable. The prediction model with the best accuracy and area under the receiver operating characteristic (ROC) curve was selected.
Results: The features that differed among the three groups were age, waist-hip ratio (WHR), biceps skinfold thickness, total lipids, phospholipids, triglycerides, total cholesterol, LDL, VLDL, serum creatinine, and family history of DM. After feature reduction, age, biceps skinfold thickness, and serum creatinine were run in the Classification Learner application to predict the diabetic category. The best model was the subspace discriminant, with accuracy, sensitivity, specificity, and area under the ROC curve (AUC) of 62.4%, 74%, 94%, and 0.70, respectively.
Conclusion: The present study concludes that the combination of age, biceps skinfold thickness, and serum creatinine has high specificity in predicting type 2 DM. The study emphasizes the selection of appropriate predictors along with newer machine learning algorithms.
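As an illustration of how the reported metrics relate to a three-level response, the following minimal Python sketch derives accuracy and per-class sensitivity and specificity from a confusion matrix in a one-vs-rest fashion. The function names and the counts in `cm` are hypothetical and chosen only for illustration; they are not the study's data, and the study does not state how its multiclass sensitivity and specificity were aggregated.

```python
from typing import Dict, List

# Response levels named in the Methods; the order is an assumption.
LABELS = ["DM", "prediabetes", "no DM"]


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of participants on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest_metrics(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Per-class sensitivity and specificity.

    cm[i][j] counts participants whose true class is i and predicted
    class is j. Assumes every class has at least one true member and
    at least one non-member, so no denominator is zero.
    """
    total = sum(sum(row) for row in cm)
    metrics = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                         # class k, predicted k
        fn = sum(cm[k]) - tp                  # class k, predicted other
        fp = sum(row[k] for row in cm) - tp   # other class, predicted k
        tn = total - tp - fn - fp             # other class, predicted other
        metrics[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return metrics


# Hypothetical counts for illustration only (NOT the study's results).
cm = [
    [8, 1, 1],  # true DM
    [2, 5, 3],  # true prediabetes
    [0, 2, 8],  # true no DM
]

overall_accuracy = accuracy(cm)
per_class = one_vs_rest_metrics(cm, LABELS)

print(f"accuracy = {overall_accuracy:.2f}")
for label, m in per_class.items():
    print(f"{label}: sensitivity = {m['sensitivity']:.2f}, "
          f"specificity = {m['specificity']:.2f}")
```

With three classes, a single sensitivity or specificity figure has to come from either one class of interest or an average over classes, so the two conventions can give different values for the same confusion matrix.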