Book Reviews

Data mining with Rattle and R

Page 464 | Published online: 29 Nov 2012

Data mining with Rattle and R, by Graham Williams, New York, Springer, 2011, xx + 374 pp., £49.99 or US$64.95 (paperback), ISBN 978-1-4419-9889-7

In this book, Graham Williams presents a comprehensive treatment of data mining, from data understanding and preparation through model development, evaluation and refinement to practical deployment. Structured in four main parts – exploration, model building, performance and appendices – the book provides a coherent link between data, tools, models and performance. The first seven chapters focus on the fundamentals of R and Rattle (a graphical interface for data mining using R), data formats, distributions and visualisation, and highlight the exploratory power of the R–Rattle combination. Unsupervised and supervised modelling techniques are detailed in the second part of the book, followed by performance assessment and deployment in the third part. From this structure derives one of the book's distinctive features: its focus on the hands-on, end-to-end process of data mining using the open source tools Rattle and R, which makes it particularly interesting to both students and practitioners of data mining.

The R novice may well find this book hard to access because of its substantial R graphical user interface and programming content. However, while this may appear to be a downside at first glance, reading the book reveals that its structure and writing style make it easily adaptable to other software applications. Further, although both R and Rattle change from version to version, the book is cushioned against obsolescence by well-balanced and integrated discussions of the data mining process and the adopted tools and methods. Thus, data mining students and practitioners, with or without a working knowledge of R, will find this book to be at least a good supplement to their existing tools and procedures. As a regular R user in a data mining environment, I found the book extremely useful and insightful, though with considerable room for improvement. In particular, there is scope for enhancing the discussions of the tuning parameters for each of the models in Part II, as they are fundamental to data mining results. For instance, expanding on the role of the cost and sigma parameters on pages 299 and 300 may provide useful intuition to the R novice.
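To illustrate the kind of intuition meant here, the following is a minimal sketch, not taken from the book. It assumes that the cost and sigma parameters are those of a support vector machine with a Gaussian (radial basis function) kernel, and that sigma enters the kernel as exp(-sigma * ||x - y||^2). The function names `rbf_kernel` and `soft_margin_objective` are hypothetical and chosen only for illustration.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian similarity exp(-sigma * ||x - y||^2) (assumed parameterisation)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * sq_dist)


def soft_margin_objective(margins, weight_norm_sq, cost):
    """Soft-margin objective: 0.5 * ||w||^2 + cost * (sum of hinge losses).

    margins holds y_i * f(x_i) for each training point; a margin below 1
    counts as a violation and is penalised in proportion to cost.
    """
    slack = sum(max(0.0, 1.0 - m) for m in margins)
    return 0.5 * weight_norm_sq + cost * slack


if __name__ == "__main__":
    a, b = (0.0, 0.0), (1.0, 1.0)
    # Larger sigma makes similarity decay faster with distance, so the
    # fitted decision boundary can become more local and more wiggly.
    for sigma in (0.01, 0.1, 1.0, 10.0):
        print(f"sigma={sigma:<5} similarity={rbf_kernel(a, b, sigma):.4f}")

    # Larger cost weights margin violations more heavily relative to the
    # smoothness term, pushing the model to fit the training data harder.
    margins = [2.0, 0.5, -1.0]
    for cost in (0.1, 1.0, 10.0):
        print(f"cost={cost:<5} objective={soft_margin_objective(margins, 4.0, cost):.2f}")
```

Under these assumptions, a small sigma with a small cost tends towards a smooth, possibly underfitted model, while a large sigma with a large cost tends towards a flexible model that may overfit, which is why the two are usually tuned together.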
