A Comparison of Two Decision Tree Generating Algorithms: C4.5 and CART Based on Numerical Data


Student Name: SATYA KUNDETI
Defense Date:
Location: 2001B Eaton Hall
Chair: Jerzy Grzymala-Busse

Luke Huan

Bo Luo

Abstract:

In data mining, classification of data is a challenging task. One of the most popular techniques for classifying data is decision tree induction. In this project, two decision tree generating algorithms, CART and C4.5, are compared using their original implementations on different numerical data sets taken from the University of California, Irvine (UCI). The comparative analysis of these two implementations is carried out in terms of accuracy and decision tree complexity. Experimental results show that the difference between C4.5 and CART in terms of accuracy is statistically insignificant (5% level of significance, two-tailed test). On the other hand, the decision trees generated by C4.5 and CART differ significantly in terms of their complexity.
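
To illustrate what a two-tailed comparison at the 5% level involves, the following is a minimal sketch of a paired test over per-data-set accuracies. The abstract does not name the test used in the project, so the exact sign test here is an assumption chosen because it needs only the standard library. The function name and the accuracy values are hypothetical placeholders, not results from the experiments.

```python
from math import comb


def two_tailed_sign_test(a, b):
    """Exact two-tailed sign test on paired observations; ties are dropped.

    Assumption: this test stands in for whichever two-tailed test the
    project actually used, which the abstract does not specify.
    """
    wins_a = sum(1 for x, y in zip(a, b) if x > y)
    wins_b = sum(1 for x, y in zip(a, b) if x < y)
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # every pair tied: no evidence of a difference
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins under the null hypothesis (p = 0.5),
    # doubled for the two-tailed alternative and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Placeholder accuracies (percent), one pair per data set.
# These are illustrative values only, NOT results from the project.
c45_accuracy = [94.0, 71.5, 85.2, 77.8, 96.1, 68.4]
cart_accuracy = [93.3, 72.0, 85.9, 76.1, 95.4, 69.0]

ALPHA = 0.05  # 5% level of significance
p_value = two_tailed_sign_test(c45_accuracy, cart_accuracy)
significant = p_value < ALPHA
print(f"p = {p_value:.3f}, significant at 5%: {significant}")
```

The same function could be applied to a complexity measure such as the number of tree nodes per data set; which measure and which test were actually used is not stated in the abstract.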