APPLICATION OF THE C4.5 DECISION TREE AS FEATURE SELECTION AND SUPPORT VECTOR MACHINE (SVM) FOR BREAST CANCER DIAGNOSIS

Pakarti Riswanto, RZ. Abdul Aziz, Sriyanto

Abstract


In medicine, data mining plays an important and evolving role that can change how doctors, practitioners, and health researchers approach the detection of breast cancer in a patient. It has two classification applications in this setting: diagnosis, which distinguishes benign tumors from malignant cancer, and prognosis, which estimates the likelihood that cancer cells will reappear in patients who have undergone surgery. Data mining aims to uncover new findings in a dataset. It is a process that uses statistical, mathematical, artificial intelligence, and machine learning techniques to extract and identify useful information and related knowledge from a database.

Classification in data mining can be carried out with several methods, including Decision Tree, K-Nearest Neighbor, Naive Bayes, ID3, CART, and Linear Discriminant Analysis, each with its own advantages and disadvantages. This study focuses on classification using the Support Vector Machine and Decision Tree algorithms.

This study analyzes the Breast Cancer Wisconsin (Original) data set, obtained from the UCI Machine Learning Repository, to classify breast cancer malignancy. The author combines the Decision Tree classifier, which handles large databases well, as a feature selection step with the SVM method, which is suitable and relevant for analyzing and diagnosing breast cancer patients because it gives accurate results on existing problems and several bases.
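The feature selection step can be illustrated with a minimal sketch, under stated assumptions. C4.5 chooses split attributes by gain ratio, so one plausible way to use it for feature selection is to rank attributes by that score and keep the highest-ranked ones as SVM inputs. The function names, the toy records, and the choice to treat attribute values as discrete categories are all assumptions made for illustration; they are not the authors' implementation, and the SVM training step is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(rows, labels, names):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = [
        (name, gain_ratio([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy records for illustration only, NOT taken from the
# Breast Cancer Wisconsin data set.
names = ["clump_thickness", "mitoses"]
rows = [(1, 1), (1, 2), (2, 1), (8, 2), (9, 1), (10, 2)]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the first attribute separates the two classes completely while the second barely does, so the first ranks higher. A full C4.5 implementation would also handle numeric thresholds, missing values, and pruning, none of which this sketch attempts.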

 

Keywords: Data Mining; diagnosis; Decision Tree; SVM Method




