Data Mining for Customer Relationship Management 2013

News

Goals

Organizations and businesses are overwhelmed by the flood of data continuously collected into their data warehouses and arriving from external sources, the Web above all. Traditional exploratory techniques may fail to make sense of the data because of its inherent complexity and size. Data mining and knowledge discovery techniques emerged as an alternative approach, aimed at revealing patterns, rules and models hidden in the data, and at supporting analysts in developing descriptive and predictive models for a range of business problems. This short course focuses on the main application scenarios of data mining for challenging problems in the broad domain of Customer Relationship Management (CRM).

Syllabus

Textbooks

Reading about the "data analyst" job

Calendar

No. | Date | Topic | Learning material
1 | 13.05.2013, 09:00-18:00 | Pattern and association rule mining & market basket analysis + Exercises | slides
2 | 14.05.2013, 09:00-18:00 | Clustering analysis & customer segmentation + Exercises | slides, slides
3 | 15.05.2013, 09:00-18:00 | Prediction models for promotion performance and churn analysis + Exercises | slides, slides, slides, slides
4 | 16.05.2013, 09:00-18:00 | Mobility data mining & analysis of human movement behavior. Fraud detection | paper, paper
5 | 20.05.2013, 09:00-18:00 | Social network analysis: fundamentals. Network metrics. Small world. Strength of weak ties. Centrality. Preferential attachment. + Exercises | slides, link
6 | 21.05.2013, 09:00-18:00 | Models of social contagion, spreading, diffusion and epidemics, with applications to viral marketing. Analysis of innovators/early adopters. + Privacy & Data Mining | slides, slides, slides, slides
7 |  | Tutorial on KNIME | slides

Exercises

1. Market basket analysis. Problem: given a database of transactions of the customers of a supermarket, find the sets of items that are frequently purchased together and analyse the association rules that can be derived from these frequent patterns. Provide a short document (at most three pages in PDF, excluding figures/plots) that describes the input dataset, the adopted frequent pattern algorithm and the association rule analysis.

2. Customer segmentation with k-means. Problem: given a dataset of RFM (Recency, Frequency and Monetary value) measurements for a set of supermarket customers, find a high-quality clustering using k-means and discuss the profile of each cluster found (in terms of the purchasing behavior of its customers). Provide a short document (at most three pages in PDF, excluding figures/plots) that describes the input dataset, the adopted clustering methodology and the cluster interpretation. Dataset filename: mkt_rfm_alltgt_allrec_norm.csv. Dataset legend: for each customer, the dataset contains the recency, frequency and monetary value variables, computed over all purchases and separately over purchases of fresh food, canned food and non-food articles; each variable is present with both its original and its normalized value.

3. Churn analysis with decision trees. Problem: given a dataset of measurements for a set of customers of an e-commerce site, find a high-quality decision tree classifier that predicts whether each customer will place only one order or more than one order with the shop. The explanation of the available variables is here. Provide a short document (at most three pages in PDF, excluding figures/plots) that describes the input dataset, the adopted classification methodology and the decision tree validation and interpretation. Dataset filename: ChurnAnalysis.arff.
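Each of the three exercises rests on one small computation: support and confidence of a rule (exercise 1), the assign-and-update loop of k-means (exercise 2), and the choice of a split by impurity in a decision tree (exercise 3). The following is a minimal, dependency-free Python sketch of those computations. It is an illustration only, not the course's reference workflow, which is built around KNIME. All function names and toy data are hypothetical. The itemset counting is brute force rather than Apriori or FP-growth, the k-means starts from caller-supplied centroids, and the split search uses Gini impurity on a single numeric feature, which is only one of several criteria a decision tree learner may use.

```python
from collections import Counter
from itertools import combinations
import math


# Exercise 1: support and confidence (brute-force counting, not Apriori).
def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of 1..max_size items."""
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore repeated items within a basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def confidence(supports, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    return supports[a | frozenset(consequent)] / supports[a]


# Exercise 2: k-means as repeated assignment and centroid update.
def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for each point."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of iterations from the given starting centroids."""
    centroids = list(centroids)  # do not modify the caller's list
    for _ in range(iterations):
        labels = assign(points, centroids)
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign(points, centroids)


# Exercise 3: one decision-tree split chosen by weighted Gini impurity.
def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, impurity) for the best rule 'value <= threshold'."""
    best = (None, float("inf"))
    n = len(values)
    for t in sorted(set(values))[:-1]:  # the largest value would leave one side empty
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


if __name__ == "__main__":
    # Hypothetical toy data, unrelated to the course datasets.
    baskets = [["bread", "milk"], ["bread", "butter"],
               ["bread", "milk", "butter"], ["milk"]]
    supports = frequent_itemsets(baskets, min_support=0.5)
    print(confidence(supports, ["bread"], ["milk"]))
    print(kmeans([(1, 1), (1, 2), (8, 8), (9, 8)], [(0, 0), (10, 10)]))
    print(best_split([1, 2, 3, 10, 11, 12], ["one"] * 3 + ["many"] * 3))
```

On the real datasets, the same quantities are what a report would discuss: which rules clear the chosen support and confidence thresholds, how the centroids describe each customer segment in terms of recency, frequency and monetary value, and which variables the tree splits on first.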

Deadline: the three documents must be sent by email to all instructors by 30 June 2013. Specify [MAINS] in the subject of the email.

Exams

The exam for the CRM module consists of the evaluation of the reports on the proposed exercises.

Previous editions