Original Articles

STATLOG: COMPARISON OF CLASSIFICATION ALGORITHMS ON LARGE REAL-WORLD PROBLEMS

R. D. King, C. Feng & A. Sutherland
Pages 289-333 | Published online: 15 May 2007
 

Abstract

This paper describes work in the StatLog project comparing classification algorithms on large real-world problems. The algorithms compared were from symbolic learning (CART, C4.5, NewID, AC2, ITrule, Cal5, CN2), statistics (Naive Bayes, k-nearest neighbor, kernel density, linear discriminant, quadratic discriminant, logistic regression, projection pursuit, Bayesian networks), and neural networks (backpropagation, radial basis functions). Twelve data sets were used: five from image analysis, three from medicine, and two each from engineering and finance. We found that which algorithm performed best depended critically on the data set investigated. We therefore developed a set of data set descriptors to help decide which algorithms are suited to particular data sets. For example, data sets with extreme distributions (skew > 1 and kurtosis > 7) and with many binary/categorical attributes (>38%) tend to favor symbolic learning algorithms. We suggest how classification algorithms can be extended in a number of directions.
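As a rough illustration of the descriptor rule quoted in the abstract, the sketch below computes skewness, kurtosis, and the share of binary/categorical attributes for a small data set, then applies the stated thresholds. The abstract does not say how the descriptors are aggregated across attributes, so the sketch assumes the mean absolute skewness, the mean (non-excess) kurtosis, and the fraction of attributes that are binary or categorical. The function names `skewness`, `kurtosis`, `describe`, and `favors_symbolic` are hypothetical and are not taken from the StatLog software.

```python
from statistics import fmean


def skewness(xs):
    """Moment-based skewness m3 / m2**1.5 (assumes xs is not constant)."""
    mu = fmean(xs)
    m2 = fmean([(x - mu) ** 2 for x in xs])
    m3 = fmean([(x - mu) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def kurtosis(xs):
    """Non-excess kurtosis m4 / m2**2 (3 for a normal distribution)."""
    mu = fmean(xs)
    m2 = fmean([(x - mu) ** 2 for x in xs])
    m4 = fmean([(x - mu) ** 4 for x in xs])
    return m4 / m2 ** 2


def describe(numeric_columns, n_categorical):
    """Summarise a data set with three descriptors.

    numeric_columns: list of lists, one per numeric attribute.
    n_categorical:   count of binary/categorical attributes.
    Averaging over attributes is an assumption of this sketch.
    """
    n_attributes = len(numeric_columns) + n_categorical
    return {
        "mean_abs_skew": fmean([abs(skewness(c)) for c in numeric_columns]),
        "mean_kurtosis": fmean([kurtosis(c) for c in numeric_columns]),
        "frac_categorical": n_categorical / n_attributes,
    }


def favors_symbolic(desc):
    """Apply the thresholds quoted in the abstract."""
    return (
        desc["mean_abs_skew"] > 1
        and desc["mean_kurtosis"] > 7
        and desc["frac_categorical"] > 0.38
    )


# One heavily skewed numeric attribute plus one categorical attribute.
skewed = describe([[0] * 9 + [10]], n_categorical=1)
# One symmetric numeric attribute plus one categorical attribute.
symmetric = describe([[1, 2, 3, 4, 5]], n_categorical=1)
```

Here `favors_symbolic(skewed)` holds while `favors_symbolic(symmetric)` does not. The rule is a tendency reported by the authors, so a result like this is a hint about which family of algorithms to try first rather than a guarantee.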

Additional information

Notes on contributors

R. D. KING

Address correspondence to Dr. Ross D. King, Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, P. O. Box 123, 44 Lincoln's Inn Fields, London WC2A 3PX, UK. E-mail: [email protected].

C. FENG

Present address of C. Feng is Computer Science Department, Ottawa University, Ottawa, Ontario, Canada. E-mail: [email protected].

A. SUTHERLAND

Present address of A. Sutherland is Hitachi Dublin Laboratory, Dublin, Ireland.
