R and Data Mining introduces the use of R for data mining. Data mining techniques are widely used in government agencies, banks, insurance, retail, telecom, medicine and research. Recently, there has been a growing tendency to do data mining with R, a free software environment for statistical computing and graphics. According to a poll by KDnuggets.com in early 2011, R was the second most popular tool for data mining work.
Because it introduces the use of R for data mining, R and Data Mining will appeal to a broad audience from both academia and industry. It targets researchers in the field of data mining, postgraduate students who are interested in data mining, and data miners and analysts from industry. For example, many universities offer courses on data mining, and the book will be a useful reference for students taking those courses. There are also many industry training courses on data mining, such as those offered by SAS and IBM.
R and Data Mining will be of interest to learners in such courses as well. It presents an introduction to using R for data mining applications, covering the most popular data mining techniques. It provides code examples and data so that readers can easily learn the techniques. It features case studies of real-world applications to help readers apply the techniques in their own work.
Introduction
Data Mining
R
Datasets used in this book
Data Loading and Exploration
Data Import/Export
Save/Load R Data
Import from and Export to .CSV Files
Import Data from SAS
Import/Export via ODBC
Data Exploration
Have a Look at Data
Explore Individual Variables
Explore Multiple Variables
More Exploration
Save Charts as Files
Data Mining Examples
Decision Trees
Building Decision Trees with Package party
Building Decision Trees with Package rpart
Random Forest
Regression
Linear Regression
Logistic Regression
Generalized Linear Regression
Non-linear Regression
Clustering
K-means Clustering
Hierarchical Clustering
Density-based Clustering
Outlier Detection
Time Series Analysis
Time Series Decomposition
Time Series Forecast
Association Rules
Sequential Patterns
Text Mining
Social Network Analysis
Case Studies
Case Study I: Analysis and Forecasting of House Price Indices
Reading Data from a CSV File
Data Exploration
Time Series Decomposition
Time Series Forecasting
Discussion
Case Study II: Customer Response Prediction
Case Study III: Risk Rating using Decision Tree with Limited Resources
Customer Behaviour Prediction and Intervention
Appendix
Online Resources
R Reference Card for Data Mining
Bibliography
Yanchang Zhao has been a Senior Data Mining Analyst in the Australian Government since 2009. Before joining the public sector, he was an Australian Postdoctoral Fellow (Industry) in the Faculty of Engineering & Information Technology at the University of Technology, Sydney, Australia. His research interests include clustering, association rules, time series, outlier detection and data mining applications, and he has published over forty papers in journals and conference proceedings. He is a member of the IEEE and of the Institute of Analytics Professionals of Australia, and has served as a program committee member for more than thirty international conferences.