Articles Information
Public Health and Preventive Medicine, Vol.1, No.3, Aug. 2015, Pub. Date: Jun. 2, 2015
Developing Statistical Diagnosis Model by Discovering Principal Parameters for Type 2 Diabetes Mellitus: A Case for Korea
Pages: 86-93
Authors
[01]
Jae Hyun Nam, Friendoctor Clinic, Seoul, Republic of Korea.
[02]
Jongseong Kim, Department of Systems Management Engineering, Suwon, Republic of Korea.
[03]
Hoo-Gon Choi, Department of Systems Management Engineering, Suwon, Republic of Korea.
Abstract
Objective: To determine the principal parameters for type 2 diabetes mellitus and to develop a statistical diagnostic model that supports more reliable diagnosis based on laboratory test results.
Design: Fasting glucose level alone is insufficient for an accurate diagnosis of type 2 diabetes mellitus. Sample data were collected from a specialized diabetes mellitus clinic (Friendoctor Clinic®) located in Korea. Statistical analyses including the t-test were used to select the principal parameters, and a decision tree and clustering methods including expectation maximization were used to investigate the relationships among the principal parameters.
Setting: This study was conducted at the Department of Industrial Engineering at Sungkyunkwan University, Suwon, Republic of Korea, and Friendoctor Clinic®, Seoul, Republic of Korea, between March 2010 and February 2011.
Subjects: The total number of subjects was 953, including 692 patients and 261 non-patients (797 men, 156 women; age range, 19-81 years).
Results: Of 32 laboratory test parameters, 10 were identified as statistically principal parameters. All subjects were divided into four groups on the basis of these principal parameters: the patient group (PG), high-probability group (HG), low-probability group (LG), and normal group (NG). Although the fasting glucose level is important for the diagnosis of diabetes mellitus, a set of six parameters (age, GPT, A/G ratio, fasting glucose, MCHC, and globulin) was important for ensuring a more reliable diagnosis across the four groups. These results were confirmed by the classifier attribute selection method.
Conclusion: A large number of laboratory test results were investigated comprehensively and intensively. Patients belonging to each class (i.e., PG, HG, LG, or NG) can be diagnosed and treated differently on the basis of the principal parameters and the diagnostic model used. However, generalizing these results requires more in-depth discussion of important risk factors such as high body mass index, genetic predisposition, lack of exercise, eating habits, pregnancy, weight changes, poor socioeconomic conditions, smoking habits, kinds of drugs, and sex hormone levels. This study’s findings will be a useful resource for diabetes research in Korea.
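The abstract mentions using the t-test to select principal parameters. The following is a minimal sketch of how such a screening step might look; it is not the authors' procedure. It assumes Welch's unequal-variance t statistic, a normal approximation for the p-value (reasonable only for large groups), and a 0.05 significance level. The function names, the parameter names, and all data values are hypothetical and synthetic; only the group sizes (692 and 261) come from the abstract.

```python
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def screen_parameters(patients, controls, alpha=0.05):
    """Return the parameter names whose group means differ at level alpha.

    Assumption: the p-value uses a normal approximation to the t
    distribution, which is only reasonable for large groups.
    """
    selected = []
    for name in patients:
        t = welch_t(patients[name], controls[name])
        p = 2 * (1 - NormalDist().cdf(abs(t)))  # two-sided p-value
        if p < alpha:
            selected.append(name)
    return selected


# Synthetic data only: every value below is invented for illustration.
rng = random.Random(0)
patients = {
    "fasting_glucose": [rng.gauss(150, 30) for _ in range(692)],
    "noise_marker": [rng.gauss(10, 2) for _ in range(692)],
}
controls = {
    "fasting_glucose": [rng.gauss(95, 10) for _ in range(261)],
    "noise_marker": [rng.gauss(10, 2) for _ in range(261)],
}
selected = screen_parameters(patients, controls)
print(selected)
```

With real laboratory data, each dictionary would map a test parameter to the measured values in one group, and a correction for multiple comparisons would likely be needed when screening all 32 parameters.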
Keywords
Type 2 Diabetes Mellitus, Laboratory Test, Principal Parameters, Diagnosis Model, Critical Parameters