Comparative Evaluation of Classification Algorithms for the Diagnosis of Polycystic Ovary Syndrome
Abstract
Polycystic Ovary Syndrome (PCOS) is a complex hormonal disorder that affects women's reproductive and metabolic health, and early detection is essential to prevent long-term complications. This study analyzes and compares the performance of four machine learning classification algorithms, Naive Bayes, K-Nearest Neighbor (KNN), Decision Tree, and Support Vector Machine (SVM), in supporting the diagnosis of PCOS from clinical data. The dataset contains 1,000 patient records with five main features: age, body mass index (BMI), menstrual irregularity, testosterone level, and antral follicle count. The data were split 80:20 using stratified sampling, and the models were validated with 5-fold cross-validation. Models were evaluated on accuracy, precision, recall, F1-score, and AUC. Decision Tree performed best (100% accuracy, AUC 0.997), followed by SVM (97% accuracy) and KNN (96% accuracy). Naive Bayes had the lowest accuracy (72%) and produced many false positives. Although Decision Tree scored highest, it carries a risk of overfitting, whereas SVM and KNN showed more stable performance. The study concludes that classification algorithms, particularly SVM and KNN, are effective for PCOS diagnosis from clinical data. In practice, these findings can inform the development of accurate and efficient clinical decision support systems to improve women's healthcare.
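The evaluation protocol described in the abstract (stratified 80:20 split, 5-fold cross-validation, and the five reported metrics) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the function names, the random seed, and the 300/700 class balance in the demo are hypothetical choices made for illustration, since the abstract does not state the class distribution or the tooling used.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping each class's share roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def stratified_folds(labels, indices, k=5, seed=0):
    """Deal the given indices into k folds, class by class, round-robin."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    return folds


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = PCOS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels: 1,000 records with an assumed 300/700 class balance.
labels = [1] * 300 + [0] * 700
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
folds = stratified_folds(labels, train_idx, k=5)
print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

The metric definitions also show why a classifier that produces many false positives, as reported for Naive Bayes, loses precision even when its recall stays high: false positives enter only the precision denominator.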
DOI: https://doi.org/10.29040/ijcis.v6i2.233

This work is licensed under a Creative Commons Attribution 4.0 International License
















