
Data Mining, Academic Year 2015/16

DM 1: Foundations of Data Mining

Instructors:

Teaching assistant:

DM 2: Advanced topics on Data Mining and case studies

Instructors:

News

1 Rizzi-Romano-Scigliuzzo 0.8134
2 Criscolo-Quintini-Trafficante 0.80383
3 Bazzali-Borghi-Giannella 0.79904
3 Deidda-Policardo-Salamida 0.79904
3 DelleMacchie-Iavarone-Rambelli 0.79904
3 Kocan-Erdem 0.79904
3 Stili-Strazzulla-Gaggioli 0.79904
4 Calamia-Ortolani-Tardelli 0.79426
5 Abedini-Baltakiene 0.78947
5 Loconte-Spontella-Di Modugno 0.7894

Abedini_Baltakiene, Alzetta_Miaschi_Semplici, Bambini_Catania_Incorvaia, Bazzali_Borghi_Giannella, Boncoraglio_Delicto_Veshi, Calamia_Ortolani_Tardelli, Criscolo_Quintini_Trafficante, Deidda_Policardo_Salamida, DelleMacchie_Iavarone_Rambelli, Donati, Dossena_Grossi_LaPerna, Fuccio_Furlan_LaPusata, Gentile_Miliani_Rossi, Giacalone_Montisci_Salerno, Kocan_Erdem, LaCroce, Loconte_Spontella_DiModugno, Rizzi_Romano_Scigliuzzo, Russo, Stili_Strazzulla_Gaggioli, Xu

Learning goals

… a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.

Data, data everywhere. The Economist, Special Report on Big Data, Feb. 2010.

The wide availability of data from relational databases, the web, and other sources motivates the study of data analysis techniques that allow a better understanding and easier use of the results in decision-making processes. The course aims to provide an introduction to the basic concepts of the knowledge extraction process, to the main data mining techniques, and to the related algorithms. Particular emphasis is given to methodological aspects, presented through several classes of paradigmatic applications such as market basket analysis, market segmentation, and fraud detection. Finally, the course introduces the privacy and ethical issues inherent in the use of inference techniques on data, of which the analyst must be aware. The course consists of the following parts:

  1. the basic concepts of the knowledge extraction process: data understanding and preparation, forms of data, measures and similarity of data;
  2. the main data mining techniques (association rules, classification, and clustering), studied in both their formal and implementation aspects;
  3. several case studies in marketing and customer management support, fraud detection, and epidemiological studies;
  4. a final part that introduces the privacy and ethical issues inherent in the use of inference techniques on data, of which the analyst must be aware.

Reading about the "data scientist" job

Hours and rooms

DM 1

Classes

Day of week Hour Room
Monday 16:00 - 18:00 Room C
Friday 14:00 - 16:00 Room A1

Office hours:

DM 2

Classes

Day of week Hour Room
Monday 9:00 - 11:00 Room N1
Thursday 9:00 - 11:00 Room A1

Office hours:

Learning material

Textbook

Slides of the classes

Past exam texts

Data mining software

Class calendar (2015-2016)

First part of course, first semester (DMF - Data mining: foundations)

Day Room Topic Learning material Instructor
1. 21.09.2015 16:00-18:00 C Canceled -
2. 25.09.2015 14:00-16:00 A1 Overview 1.dm-overview.pdf Pedreschi/Monreale
3. 28.09.2015 16:00-18:00 C Introduction 2.dm_ml_introduction.pdf Pedreschi
4. 02.10.2015 14:00-16:00 A1 Introduction 2.dm_ml_introduction.pdf Monreale
5. 05.10.2015 16:00-18:00 C Data Understanding 3.dataunderstanding.pdf 3.data-understanting-appendix.pdf Monreale
6. 09.10.2015 14:00-16:00 A1 Data Preparation 4.data_preparation.pdf Monreale
7. 12.10.2015 16:00-18:00 C Clustering analysis. Centroid-based methods. dm2014_clustering_intro.pdf dm2014_clustering_kmeans.pdf Monreale
8. 16.10.2015 14:00-16:00 A1 Clustering analysis. Hierarchical methods. Tutorial Knime dm2014_clustering_hierarchical.pdf knime_slides_mains.pdf Monreale
9. 19.10.2015 16:00-18:00 C Clustering Analysis. Density Based Clustering and Validation dm2014_clustering_dbscan.pdf dm2014_clustering_validation.pdf Monreale
10. 21.10.2015 16:00-18:00 C Exercises on Data Understanding. exercises-dm1.pdf Monreale
11. 23.10.2015 14:00-16:00 A1 Exercises on Clustering. HC with Group Average exercises-clustering.pdf Monreale/Guidotti
12. 26.10.2015 16:00-18:00 C Knime Exercises datamanipulation.zip knime_clustering_iris.zip Pedreschi/Guidotti
13. 30.10.2015 14:00-16:00 A1 R and Python Exercises manipulation-clystering-r.zip manipulation-clustering-py.zip Pedreschi/Guidotti
02.11.2015-06.11.2015 First Mid-term test: 6th November 14:00-16:00 Room A
14. 09.11.2015 16:00-18:00 C Classification chap4_basic_classification.pdf Monreale
15. 13.11.2015 14:00-16:00 A1 Classification Monreale
16. 16.11.2015 16:00-18:00 C Classification Monreale
17. 20.11.2015 14:00-16:00 A1 Classification Monreale
18. 23.11.2015 16:00-18:00 C Exercises on Classification. Knime Exercises knime_classification_iris.zip knime_classification_adult.zip knime_classification_over_adult.zip Guidotti/Monreale
19. 27.11.2015 14:00-16:00 A1 Frequent Patterns & Association Rules 4-5tdm-restructured_assoc.pdf Monreale
20. 30.11.2015 16:00-18:00 C Canceled
21. 04.12.2015 14:00-16:00 A1 Canceled
22. 07.12.2015 16:00-18:00 C Canceled Pedreschi
23. 11.12.2015 14:00-16:00 A1 Exercises on Patterns. Knime Exercises knime_pattern.zip Guidotti / Pedreschi
24. 14.12.2015 16:00-18:00 C python-classification-pattern.zip r-classification-patterns.zip Guidotti / Pedreschi
16.12.2015-18.12.2015 Second Mid-term test

Second part of course, second semester (DMA - Data mining: advanced topics and case studies)

Day Room Topic Learning material Instructor
1. 22.02.2016 09:00-11:00 N1 Introduction + Sequential Patterns / 1 sequential_patterns.pdf, textbook Ch. 7.4 Nanni & Pedreschi
2. 25.02.2016 09:00-11:00 A1 Sequential Patterns / 2
3. 29.02.2016 09:00-11:00 A1 Sequential Patterns / Exercises Link to SPMF, a tool for seq. patterns and sample dataset. Exercises: Text 1 and Text 2
4. 03.03.2016 09:00-11:00 A1 Advanced Classification Methods / 1 alternative_classification_1_dino_03.03.2016.pdf Pedreschi
5. 07.03.2016 09:00-11:00 A1 Advanced Classification Methods / 2 alternative_classification_2_dino_07.03.2016.pdf Pedreschi
6. 10.03.2016 09:00-11:00 A1 Advanced Classification Methods / Tools and Exercises exercises_classification.pdf sample_knime_workflows.zip
7. 14.03.2016 09:00-11:00 A1 Advanced Classification Methods / Exercises Exercises (also) on classification from 2014-15
8. 17.03.2016 09:00-11:00 A1 Time Series / 1 time_series_from_keogh_tutorial.pdf
9. 21.03.2016 09:00-11:00 A1 Time Series / 2
10. 24.03.2016 09:00-11:00 A1 Time Series / Exercises Some exercises from past exams: (Sequences and time series) (Classification)
25-29.03.2016 EASTER HOLIDAYS
04.04.2016 09:00-13:00 TBD Midterm tests
11. 07.04.2016 09:00-11:00 A1 Case study: CRM - Customer Segmentation + CRISP-DM Customer segmentation CRISP-DM
12. 11.04.2016 09:00-11:00 A1 Case study: CRM - Churn Analysis Intro_CRM Churn External_Churn
13. 14.04.2016 09:00-11:00 A1 Case study: CRM - Promotions and Sophistication Promotions Sophistication
14. 18.04.2016 09:00-11:00 A1 Mobility Data Analysis / 1 Preprocessing Patterns and models
15. 21.04.2016 09:00-11:00 A1 Mobility Data Analysis / 2 Individual/Collective models GSM_DM
16. 28.04.2016 09:00-11:00 A1 Case study: Mobility Data Analysis Case studies
17. 02.05.2016 09:00-11:00 A1 Complements: Ethical Issues / 1 slides Monreale
18. 05.05.2016 09:00-11:00 A1 Complements: Ethical Issues / 2 Monreale
19. 09.05.2016 09:00-11:00 A1 Projects presentation Projects
20. 12.05.2016 09:00-11:00 A1 Complements: Outlier Detection Slides from SDM2010 tutorial
21. 16.05.2016 09:00-11:00 A1 Projects discussion

Exams

Exam DM part I (DMF)

The exam is composed of three parts:

Guidelines for the project are here.

Exam DM part II (DMA)

The exam is composed of three parts:

Exam sessions

Mid-term exams

Date Hour Place Notes Marks
First Mid-term 2015 Friday 06.11.2015 14.00 Room A Results
Second Mid-term 2015 Wednesday 16.12.2015 11.00 Room A1 Results
Date Hour Place Notes Marks
Mid-term 2016 Monday 04.04.2016 9.00 Room A1 Results

Regular exam sessions

Session Date Time Room Notes Results
1. Monday, 18 January 2016 9.00 A1 On the same date we will set the dates for the oral exam.
2. Monday, 08 February 2016 9.00 A1 On the same date we will set the dates for the oral exam.
3. Monday, 30 May 2016 9.00 C On the same date we will set the dates for the oral exam. DM1: Written exam results DM2: Written exam results
4. Monday, 20 June 2016 9.00 C On the same date we will set the dates for the oral exam.
5. Friday, 08 July 2016 9.00 C On the same date we will set the dates for the oral exam.
6. Monday, 05 Sept 2016 9.00 C On the same date we will set the dates for the oral exam.

Extra sessions, Academic Year 2014/15

Date Time Room Notes Results
6 November 2015 14:00-16:00 Room A
04 April 2016 9:00-13:00 Room A1

Previous years' editions