magistraleinformatica:dmi:start

Differences

These are the differences between the selected revision and the current version of the page.

magistraleinformatica:dmi:start [27/11/2020 at 19:37 (3 years ago)]
Anna Monreale [Exams]
magistraleinformatica:dmi:start [22/03/2024 at 20:34 (7 days ago)] (current version)
Anna Monreale [First Semester]
Line 12: Line 12:
      
 ga('personalTracker.require', 'displayfeatures'); ga('personalTracker.require', 'displayfeatures');
-ga('personalTracker.send', 'pageview', 'ruggieri/teaching/dm/');+ga('personalTracker.send', 'pageview', 'courses/dminf/');
 setTimeout("ga('send','event','adjusted bounce rate','30 seconds')",30000);  setTimeout("ga('send','event','adjusted bounce rate','30 seconds')",30000); 
 </script> </script>
 <!-- End Google Analytics --> <!-- End Google Analytics -->
 +<!-- Global site tag (gtag.js) - Google Analytics -->
 +<script async src="https://www.googletagmanager.com/gtag/js?id=G-LPWY0VLB5W"></script>
 +<script>
 +  window.dataLayer = window.dataLayer || [];
 +  function gtag(){dataLayer.push(arguments);}
 +  gtag('js', new Date());
 +
 +  gtag('config', 'G-LPWY0VLB5W');
 +</script>
 <!-- Capture clicks --> <!-- Capture clicks -->
 <script> <script>
Line 21: Line 39:
   jQuery('a[href$=".pdf"]').click(function() {   jQuery('a[href$=".pdf"]').click(function() {
     var fname = this.href.split('/').pop();     var fname = this.href.split('/').pop();
-    ga('personalTracker.send', 'event',  'DM', 'PDFs', fname);+    ga('personalTracker.send', 'event',  'DMINF', 'PDFs', fname);
   });   });
   jQuery('a[href$=".r"]').click(function() {   jQuery('a[href$=".r"]').click(function() {
     var fname = this.href.split('/').pop();     var fname = this.href.split('/').pop();
-    ga('personalTracker.send', 'event',  'DM', 'Rs', fname);+    ga('personalTracker.send', 'event',  'DMINF', 'Rs', fname);
   });   });
   jQuery('a[href$=".zip"]').click(function() {   jQuery('a[href$=".zip"]').click(function() {
     var fname = this.href.split('/').pop();     var fname = this.href.split('/').pop();
-    ga('personalTracker.send', 'event',  'DM', 'ZIPs', fname);+    ga('personalTracker.send', 'event',  'DMINF', 'ZIPs', fname);
   });   });
   jQuery('a[href$=".mp4"]').click(function() {   jQuery('a[href$=".mp4"]').click(function() {
     var fname = this.href.split('/').pop();     var fname = this.href.split('/').pop();
-    ga('personalTracker.send', 'event',  'DM', 'Videos', fname);+    ga('personalTracker.send', 'event',  'DMINF', 'Videos', fname);
   });   });
   jQuery('a[href$=".flv"]').click(function() {   jQuery('a[href$=".flv"]').click(function() {
     var fname = this.href.split('/').pop();     var fname = this.href.split('/').pop();
-    ga('personalTracker.send', 'event',  'DM', 'Videos', fname);+    ga('personalTracker.send', 'event',  'DMINF', 'Videos', fname);
   });   });
 }); });
 </script> </script>
 </html> </html>
-====== Data Mining (309AA) - 9 CFU ======+====== Data Mining (309AA) - 9 CFU A.Y. 2023/2024 ======
  
 **Instructor:** **Instructor:**
Line 49: Line 67:
     * [[anna.monreale@unipi.it]]        * [[anna.monreale@unipi.it]]   
 **Teaching Assistant:** **Teaching Assistant:**
-  * **Francesca Naretto** +  * **Lorenzo Mannocci** 
-    * KDDLab, SNS, Pisa +    * University of Pisa 
-    * [[francesca.naretto@sns.it]]  +    * [[lorenzo.mannocci@phd.unipi.it]]  
  
 ====== News ====== ====== News ======
-    * [01.10.2020] ** The lecture on 9.10.2020 will be suppressed. ** +  * [05.09.2023] ** The lectures will start on 27th September 2023**  
-    * [09.09.2020] The course will be held online, please use this link to join the class: https://teams.microsoft.com/l/team/19%3a8f6779bab74f4368ba7ce1c2b092346d%40thread.tacv2/conversations?groupId=8da15095-b6e5-41c1-a894-d418aed3983e&tenantId=c7456b31-a220-47f5-be52-473828670aa1    * + 
 ====== Learning Goals ====== ====== Learning Goals ======
      * Fundamental concepts of data knowledge and discovery.      * Fundamental concepts of data knowledge and discovery.
Line 61: Line 79:
      * Data preparation      * Data preparation
      * Clustering      * Clustering
-     * Classification & Regression+     * Classification
      * Pattern Mining and Association Rules      * Pattern Mining and Association Rules
      * Outlier Detection      * Outlier Detection
Line 73: Line 91:
  
 ^  Day of Week  ^  Hour  ^  Room  ^  ^  Day of Week  ^  Hour  ^  Room  ^ 
-|  Wednesday |  09:00 - 10:45  |  Online  |  +|  Wednesday |  09:00 - 11:00  |  Room C1  |  
-|  Thursday  |  09:00 - 10:45  |  Online  |  +|  Thursday  |  09:00 - 11:00  |  Room C1  |  
-|  Friday    |  11:00 - 12:45  |  Online  |  +|  Friday    |  09:00 - 11:00  |  Room C  |  
  
  
  
 **Office hours - Ricevimento:** **Office hours - Ricevimento:**
-Anna Monreale: Wednesday: 11:00-13:00 online using Teams (Appointment by email) +Anna Monreale: Tuesday: 11:00-13:00 online via Teams or at the Department of Computer Science, room 374/E (please request an appointment by email). 
-Francesca Naretto: Monday: 15:00-18:00 online using Teams (Appointment by email) +Lorenzo Mannocci: TBD
  
- +A [[https://teams.microsoft.com/l/team/19%3ajujTZ5yI6IyKkRl1YEGY0Iisg7RhlW1YTam_NO3-OOE1%40thread.tacv2/conversations?groupId=2ce9fd1a-3f23-47b0-92cd-8652f8be9ed6&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams Channel]] will be used ONLY for news, Q&A, and other course-related communication. Lectures are held in person only and will **NOT** be live-streamed; recordings of the lectures, or of previous years' lectures, will be made available here for non-attending students.  
 ====== Learning Material -- Materiale didattico ====== ====== Learning Material -- Materiale didattico ======
  
Line 91: Line 109:
     * [[http://www-users.cs.umn.edu/~kumar/dmbook/index.php]]     * [[http://www-users.cs.umn.edu/~kumar/dmbook/index.php]]
     * Chapters 4,6 and 8 are also available at the publisher's Web site.     * Chapters 4,6 and 8 are also available at the publisher's Web site.
-  * Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. **GUIDE TO INTELLIGENT DATA ANALYSIS.** Springer Verlag, 1st Edition., 2010. ISBN 978-1-84882-259-7 
   * Laura Igual et al.** Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications**. 1st ed. 2017 Edition.   * Laura Igual et al.** Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications**. 1st ed. 2017 Edition.
   *  Jake VanderPlas. **[[http://shop.oreilly.com/product/0636920034919.do| Python Data Science Handbook: Essential Tools for Working with Data.]]** 1st Edition.    *  Jake VanderPlas. **[[http://shop.oreilly.com/product/0636920034919.do| Python Data Science Handbook: Essential Tools for Working with Data.]]** 1st Edition. 
 +  *  For Python Notions: {{ :magistraleinformatica:dmi:python_basics.ipynb.zip | Very basic notions on Python}} 
  
  
Line 104: Line 122:
 ===== Software===== ===== Software=====
  
-  * Python - Anaconda (3.7 version!!!): Anaconda is the leading open data science platform powered by Python. [[https://www.anaconda.com/distribution/| Download page]] (the following libraries are already included)+  * Python - Anaconda (version 3.7 or later!): Anaconda is the leading open data science platform powered by Python. [[https://www.anaconda.com/distribution/| Download page]] (the following libraries are already included)
   * Scikit-learn: python library with tools for data mining and data analysis [[http://scikit-learn.org/stable/ | Documentation page]]   * Scikit-learn: python library with tools for data mining and data analysis [[http://scikit-learn.org/stable/ | Documentation page]]
   * Pandas: pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. [[http://pandas.pydata.org/ | Documentation page]]   * Pandas: pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. [[http://pandas.pydata.org/ | Documentation page]]
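
As a quick sanity check of the setup listed above, the following sketch verifies the interpreter version and whether the listed libraries can be found. It is only an illustration under stated assumptions: the helper name `check_environment` is hypothetical and not part of the course material, and the minimum version (3.7) is taken from the Anaconda note above.

```python
import sys
from importlib.util import find_spec

# Minimum Python version, as stated in the Software list above.
MIN_PYTHON = (3, 7)
# Import names of the listed libraries ("sklearn" is scikit-learn's import name).
REQUIRED = ("sklearn", "pandas")


def check_environment(version=None, required=REQUIRED):
    """Return a list of problems found; an empty list means the setup looks fine.

    `version` defaults to the running interpreter's (major, minor) pair;
    passing it explicitly is only meant for testing this hypothetical helper.
    """
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} is older than "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
        )
    for name in required:
        # find_spec returns None when a top-level package cannot be located.
        if find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment OK" if not issues else "\n".join(issues))
```

Running the script in the Anaconda environment used for the labs should report either that the environment is fine or which of the assumed requirements is not met.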
  
    
-====== Class Calendar (2020/2021) ======+====== Class Calendar (2023/2024) ======
  
 ===== First Semester  ===== ===== First Semester  =====
  
-^ ^ Day ^ Topic ^ Learning material ^ References ^ +^ ^ Day ^ Topic ^ Learning material ^ References ^ Video Lectures 
-|1.|  16.09  09:00-10:45 | Overview. Introduction to KDD        | {{ :magistraleinformatica:dmi:1-overview.pdf |}} {{ :magistraleinformatica:dmi:1-intro-dm.pdf |}} | Chap. 1 Kumar Book|  +|1.  |  27.09  | Overview. Introduction to KDD   | {{ :magistraleinformatica:dmi:1-overview-2023.pdf |}} {{ :magistraleinformatica:dmi:1-intro-dm.pdf |}}|Chap. 1 Kumar Book |[[https://unipiit.sharepoint.com/:v:/s/a__td_59044/EYjxO1YANqtMnr8upJa3X4oBp3wEsdjef8iSXN7LL7jcxQ?e=Jd80j9|Introduction DM - Video1]] [[ https://unipiit.sharepoint.com/:v:/s/a__td_59044/Eesf2mgGU1hMjMH4qH_xJewBKtee3TWrullu269byR2bnA?e=JJ4AUx|Introduction DM - Video2]]
-|2.|  17.09  09:00-10:45 | Data Understanding | {{ :magistraleinformatica:dmi:2-data_understanding.pdf | Slides DU}} |Chap. 2 Kumar Book and additional resource of the Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Exploring Data]]. If you have the first edition of the Kumar Book, this is Chap. 3 | +|2.  |  28.09  | Data Understanding | {{ :magistraleinformatica:dmi:2-data_understanding.pdf |}}|Chap. 2 Kumar Book and additional resource of the Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Exploring Data]]. If you have the first edition of the Kumar Book, this is Chap. 3 |  
-|3.|  18.09  09:00-10:45 | Data Preparation        | {{ :magistraleinformatica:dmi:3-data_preparation.pdf |}} | Chap. 2 Kumar Book |  +|3.  |  29.09  | Data Understanding & Data Preparation |  {{ :magistraleinformatica:dmi:3-data_preparation.pdf |}} |Chap. 2 Kumar Book and additional resource of the Kumar Book: [[https://www-users.cs.umn.edu/~kumar001/dmbook/data_exploration_1st_edition.pdf|Exploring Data]]. If you have the first edition of the Kumar Book, this is Chap. 3 | 
-|4.|  23.09  09:00-10:45 | Data Preparation: Transformations PCA         | {{ :magistraleinformatica:dmi:3-data_preparation.pdf |}} | Chap. 2 Kumar Book, Appendix B Dimensionality Reduction (only PCA) +|4.  |  04.10  | Data Preparation & Data Similarities  {{ :magistraleinformatica:dmi:4-data_similarity.pdf |}} | Data Similarity is in Chap. 2  | [[https://unipiit.sharepoint.com/:v:/s/a__td_59044/EWaYURxnzPdIiLiqjkS4LM8B8sme_xmm0LwtK9EptuP0Jg?e=dsZojO|DP+Similarities]] The last minutes of the lecture were not recorded because of the connection
-|5.|  24.09  09:00-10:45 | Data Similarities. Introduction to Clustering.|{{ :magistraleinformatica:dmi:4-data_similarity.pdf |}} {{ :magistraleinformatica:dmi:5-basic_cluster_analysis-intro.pdf |}}       Data Similarity is in Chap2 while Clustering is in Chap. 7  | +|5.  |  05.10  | Python-LAB: Data Understanding | {{ :magistraleinformatica:dmi:dataunderstanding.zip DU notebooks and data}} |  | [[https://unipiit.sharepoint.com/:v:/s/a__td_59044/EYWSZBIG7X1MoFOev5Th_cIBprLLN-AwSBMamgGzNju0Sw?e=jzdPx8|Python Lab on DU]]| 
-|6.|  25.09  11:00-12:45 LAB: Data Understanding in Python |  {{ :magistraleinformatica:dmi:python_basics.ipynb.zip Very basic notions on Python}} {{ :magistraleinformatica:dmi:tips_data_understanding.ipynb.zip |Notebook on Data Understanding}}  {{ :magistraleinformatica:dmi:tipsdata.zip |}}|  +|  |  06.10  | Suppressed |  |  | 
-|7.|  30.09  09:00-10:45 Center-based clusteringkmeans| {{ :magistraleinformatica:dmi:6-basic_cluster_analysis-kmeans-variants.pdf |}}  | Chap. 7 Kumar Book| +|6.  |  11.10  | Introduction to Clustering. Centroid-based ClusteringK-means algorithm. | {{ :magistraleinformatica:dmi:5-basic_cluster_analysis-intro.pdf |}} {{ :magistraleinformatica:dmi:6.1-basic_cluster_analysis-kmeans.pdf |}} | Chap. 7 Kumar Book | [[https://unipiit.sharepoint.com/:v:/s/a__td_54794/EV-fDd75MIxGmazA79kFHCYBI78yYwqy7AFE5h9MN2rRqg?e=YVgdjS|Video 1: Introduction to Clustering + K-means - Part 1]] - Video of previous years
-|8.|  01.10  09:00-10:45 | Center-based clusteringBisecting K-means, Xmeans, EM| Same Slides of the previous lectures Chap. 7 Kumar Book, {{ :magistraleinformatica:dmi:clusteringmixturemodels.pdf Clustering & Mixture Models}} {{ :magistraleinformatica:dmi:xmeans.pdf |}}+|7.  |  12.10  Centroid-based ClusteringK-means variants. | {{ :magistraleinformatica:dmi:6.2-basic_cluster_analysis-kmeans-variants.pdf |}} | Chap. 7 Kumar Book {{ :magistraleinformatica:dmi:clusteringmixturemodels.pdf |}} {{ :magistraleinformatica:dmi:xmeans.pdf |}}| [[https://unipiit.sharepoint.com/:v:/s/a__td_54794/ETySd1UWIzxCoAKilzaXO_MBW8oXZZCjf5FEhyywGIdJBg?e=Xq2jdo|Video 2: Introduction to Clustering + K-means - Part 2]]]  [[https://unipiit.sharepoint.com/:v:/s/a__td_54794/EQTbbvqF2kJOgEsFQ1WF48cBjWf2wgTCbOjxcQzn9MyVzw?e=KQ7gEZ|Video 1: Center-based clustering Bisecting K-means, Xmeans, EM ]];Videos of previous years| 
-|9.|  02.10  11:00-12:45 | Hierarchical clustering| {{ :magistraleinformatica:dmi:7.basic_cluster_analysis-hierarchical.pdf |}} {{ :magistraleinformatica:dmi:ex._hierarchical-clustering.pdf |}}| Chap. 7 Kumar Book |  +|   13.10  Suspension of teaching |  |  | Recording in Teams Channel 
-|10.|  07.10  09:00-10:45 | Density based clustering|{{ :magistraleinformatica:dmi:8.basic_cluster_analysis-dbscan-validity.pdf |}}  | Chap. 7 Kumar Book |  +|8.|  18.10  | Hierarchical and density-based clustering | {{ :magistraleinformatica:dmi:7.basic_cluster_analysis-hierarchical.pdf |}} {{ :magistraleinformatica:dmi:8.basic_cluster_analysis-dbscan-validity.pdf |}} |  Chap. 7 Kumar Book | Recording in Teams Channel  
-|11.|  08.10  09:00-10:45 | Lab: clustering Project Assignment | {{ :magistraleinformatica:dmi:py-clustering.zip |}} |  |  +|9.|  19.10  | Clustering Validity & Python Lab: Clustering K-means | {{ :magistraleinformatica:dmi:8.basic_cluster_analysis-dbscan-validity.pdf |}} |  Chap. 7 Kumar Book| Recording in Teams Channel  
-|    09.10  11:00-12:45 | Lecture canceled |  |  |  +|10.|  20.10 | Python Lab: density-based and hierarchical clustering. Introduction to Classification |{{ :magistraleinformatica:dmi:clustering.zip | Notebook on Clustering}} {{ :magistraleinformatica:dmi:9.chap3_basic_classification-2023.pdf |}} | Chap.3 Kumar Book |Recording in Teams Channel 
-|12.|  14.10  09:00-10:45 | Classification Problem + Decision trees|  {{ :magistraleinformatica:dmi:9.chap3_basic_classification-2020.pdf |}}|  Chap. 3 Kumar Book |  +|11.|  25.10 | Decision Trees & Classifier Evaluation | Same slides as previous lecture | Chap.3 Kumar Book | Recording in Teams Channel   |   
-|13.|  15.10  09:00-10:45 | Only 30 minutes of Discussion on the project due to connection problems|  |  Chap. 3 Kumar Book |  +|12.|  26.10 | Classifier Evaluation | Same slides as previous lecture | Chap.3 Kumar Book |   |   
-|14.|  16.10  11:00-12:45 | Decision Tree + Classifier Evaluation|   Chap. 3 Kumar Book |  +|13.|  27.10 | Rule-based Classifiers |{{ :magistraleinformatica:dmi:10-rule-based-classifiers.pdf |}} | Chap.Kumar Book |  Recording in Teams Channel    
-|15.|  21.10  09:00-10:45 | Evaluation Methods for Classification Models {{ :magistraleinformatica:dmi:9.chap3_basic_classification-2020.pdf |}}|  Chap. 3 Kumar Book + Chap. 4 Kumar Book|  +|14.|  02.11 | Rule-based Classifiers + Instance based Classifiers| {{ :magistraleinformatica:dmi:10-knn.pdf |}}| Chap.4 Kumar Book | Recording in Teams Channel   |   
-|16.|  22.10  09:00-10:45 Statistical tool for model evaluation + Rule based classification| {{ :magistraleinformatica:dmi:10-rule-based-clussifiers.pdf |}} |  Chap. Kumar Book  Chap. 4 Kumar Book|  +|15.|  03.11 |Naive Bayesian Classifier. SVM. Ensemble Classifiers| {{ :magistraleinformatica:dmi:11_2023-naive_bayes.pdf |}} {{ :magistraleinformatica:dmi:14_svm_2023.pdf |}} {{ :magistraleinformatica:dmi:13_ensemble_2023.pdf |}}| Chap.4 Kumar Book | Recording in Teams Channel   |   
-|17.|  23.10  11:00-12:45 | Rule based classification + Instance-based Classification| {{ :magistraleinformatica:dmi:11-knn.pptx |}} |  Chap. 4 Kumar Book |  +|16.|  08.11 | Python LabClassification|  {{ :magistraleinformatica:dmi:classification.zip |}} | | Recording in Teams Channel     
-|18.|  28.10  09:00-10:45 |Naive Bayesian Classifier Ensemble Classifieres | {{ :magistraleinformatica:dmi:12-naive_bayes.pdf |}} {{ :magistraleinformatica:dmi:13_ensemble_2020.pdf |}} |  Chap. 4 Kumar Book |  +|17.|  09.11 | NN Classifiers| {{ :magistraleinformatica:dmi:15_neural_networks_2023.pdf |}} | Chap.4 Kumar Book | Recording in Teams Channel   |   
-|19.|  29.10  09:00-10:45 | SVM & NN |  {{ :magistraleinformatica:dmi:14_svm_2020.pdf |}} {{ :magistraleinformatica:dmi:15_neural_networks_2020.pdf |}}|  Chap. 4 Kumar Book |  +|18.|  10.11 | Python LabNN Imbalanced Classification | {{ :magistraleinformatica:dmi:imbalanced_classification.zip |}} |  | Recording in Teams Channel   |   
-|20.|  30.10  11:00-12:45 | MLNN Lab on Classification| {{ :magistraleinformatica:dmi:classification.zip |Nootebook Python for classification}} |  Chap. 4 Kumar Book |  +|19.|  15.11 | Association Rule Mining: Apriori | {{ :magistraleinformatica:dmi:17_association_analysis.pdf |}} | Chap.5 Kumar Book |  Recording in Teams Channel  |   
-|21.|  04.11  09:00-10:45 | Regression & Association Rule Mining| {{ :magistraleinformatica:dmi:16_linear_regression.pdf |}} {{ :magistraleinformatica:dmi:17_association_analysis.pdf |}}|  Regression: Appendix D in Kumar BOOK Chap.5 Association Rules: Kumar Book|  +|20.|  16.11 | Association Rule Mining: Evaluation and FP-Growth | {{ :magistraleinformatica:dmi:17_2023-fp-growth.pdf |}} | Chap.5 Kumar Book |  Recording in Teams Channel  
-|22.|  05.11  09:00-10:45 | Association Rule Mining| |  Chap.5 Association Rules: Kumar Book|  +|21.|  17.11 | Sequential Pattern Mining | {{ :magistraleinformatica:dmi:18_sequential_patterns_2023.pdf |}} | Chap.6 Kumar Book |  Recording in Teams Channel  
-|23.|  06.11  11:00-12:45 | Sequential Pattern Mining| {{ :magistraleinformatica:dmi:18_sequential_patterns_2020.pdf |}}|   Chap.6  Kumar Book|  +|22.|  22.11 | Sequential Pattern Mining: timing constraints. Time Series Analysis: Similarities, Distances and Transformations| {{ :magistraleinformatica:dmi:22_time_series_similarity_2023.pdf |}} | [[https://cs.gmu.edu/~jessica/BookChapterTSMining.pdf |Overview on Time Series]] | Recording in Teams Channel  
-|24.|  11.11  09:00-10:45 | Ethics in AI & Privacy | {{ :magistraleinformatica:dmi:19_ethics_privacy.pdf |}} | [[https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai|Report in Trustworthy AI]] |  +|23.|  23.11  | Time Series Analysis: Shapelet and Motif| {{ :magistraleinformatica:dmi:23_time_series_motif-shapelets2023.pdf |}} {{ :magistraleinformatica:dmi:shaplet.pdf |}} | Recording in Teams Channel   
-|25.|  12.11  09:00-10:45 | Ethics in AI Privacy  | {{ :dm:allegato1_chapter.pdf | Overview on Privacy}}  {{ :magistraleinformatica:dmi:allegato11-cpdp13.pdf |}}{{ :dm:capprivacy.pdf | Privacy by design}} |  +|24.|  24.11  | Time Series Analysis: Shapelet and Motif; introduction to ethics and privacy | Same slides as the previous lecture and {{ :magistraleinformatica:dmi:19_ethics_privacy_2023_intro.pdf |}}  | {{ :magistraleinformatica:dmi:matrixprofile.pdf |}} [[https://www.cs.ucr.edu/~eamonn/MatrixProfile.html|Papers and resources on motifs]] |  Recording in Teams Channel 
-|26.|  13.11  11:00-12:45 | Ethics in AI Privacy, Explainability | {{ :magistraleinformatica:dmi:20_explainability_2020.pdf |}} | |  +|25.|  29.11 | Python Lab: ARM, SPM, Time series transformations  | {{ :magistraleinformatica:dmi:ar_spm.zip |}} {{ :magistraleinformatica:dmi:timeseries.zip |}} |  | Recording in Teams Channel  
-|27.|  18.11  09:00-10:45 | Explainability | {{ :magistraleinformatica:dmi:20_explainability_2020.pdf |}} | Material: [[https://arxiv.org/pdf/1805.10820.pdf|LORE]] [[https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf|LIME]]   [[http://delivery.acm.org/10.1145/3240000/3236009/a93-guidotti.pdf?ip=94.38.73.6&id=3236009&acc=OA&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2ED544636226B69D47&__acm__=1576196869_06b3353aae4fe3bd8ea30d9c9c5356eb|Survey]] {{ :magistraleinformatica:dmi:pkdd_2019_abele_cr.pdf |ABELE}}|  +|26.|  30.11 | Python Lab: Time series analysis | Notebooks in the zip file of the previous lecture| | Recording in Teams Channel   | 
-|28.|  19.11  09:00-10:45 | Anomaly Detection | {{ :magistraleinformatica:dmi:21_anomaly_detection_2020.pdf |}} | Chap. 9 of Kumar Book|  +|27.|  01.12 | Privacy in AI and Big Data Analytics  | {{ :magistraleinformatica:dmi:19_ethics_privacy2023.pdf |}} This set of slides also includes the introduction from the lecture of 24.11.2023 |{{ :magistraleinformatica:dmi:chap-anonymity.pdf |}} {{ :magistraleinformatica:dmi:prudence.pdf |}} {{ :magistraleinformatica:dmi:chapter-ppdm.pdf |}}| Recording in Teams Channel   
-|29.|  20.11  11:00-12:45 | Anomaly Detection | {{ :magistraleinformatica:dmi:anomalydetection.ipynb.zip |}} | Chap. 9 of Kumar Book |  +|28.|  06.12 Explainable AI | {{ :magistraleinformatica:dmi:20_explainability_2023.pdf |}}|{{ :magistraleinformatica:dmi:lore-tabular.pdf |}} {{ :magistraleinformatica:dmi:xai-survey.pdf |}} {{ :magistraleinformatica:dmi:imagexai.pdf |}} {{ :magistraleinformatica:dmi:timeseriesxai.pdf |}}| Recording in Teams Channel   
-|30.|  25.11  09:00-10:45 |Time series Siminarity   {{ :magistraleinformatica:dmi:22_time_series_similarity.pdf |}}| [[https://cs.gmu.edu/~jessica/BookChapterTSMining.pdf|Overview on DM for time series]], [[https://pdfs.semanticscholar.org/18f3/55d7ef4aa9f82bf5c00f84e46714efa5fd77.pdf|DTW paper by Sakoe and Chiba, 1978]]|  +|29.|  07.12 | Explainable AI | {{ :magistraleinformatica:dmi:21_anomaly_detection_2023.pdf |}} {{ :magistraleinformatica:dmi:anomaly_detection.zip |}}| | Recording in Teams Channel   
-|31.|  26.11  09:00-10:45 |Time series Clustering  |  {{ :magistraleinformatica:dmi:22_time_series_similarity.pdf |}}  | |  +|30.|  13.12 | Anomaly Detection | {{ :magistraleinformatica:dmi:21_anomaly_detection_2023.pdf |}} | | Recording in Teams Channel   
-|32.|  27.11  11:00-12:45 |Lab on Association Rules and Sequential Pattern Mining  | {{ :magistraleinformatica:dmi:patterns.zip |}} | |  +|31-32.|  14.12 9-11| Lab Python in AD + Lab Python in XAI| {{ :magistraleinformatica:dmi:anomaly_detection.zip |}}| | Recording in Teams Channel   
-|33.|  02.12  09:00-10:45   | |  +|33.|  15.12 9-11Lab Python in XAI + Paper Presentation| | |    
-|34.|  03.12  09:00-10:45 | |  | |  +|34.|  18.12 09-11| Paper Presentation| | |    
-  |  04.12  11:00-12:45 Lecture Canceled  | |  +|35.|  20.12 09-11| Paper Presentation| | |    
-|35.|  09.12  09:00-10:45 | Paper Presentation |  | |  +|36.|  21.12 09-11| Paper Presentation| | |    |
-|36.|  10.12  09:00-10:45 | Paper Presentation |  | |  +
-|37.|  11.12  11:00-12:45 | Paper Presentation |  | | +
  
 +  
 +====== Exams ======
 +**Project **
  
 +The project consists of data analyses carried out with data mining tools. 
 +The project must be carried out by a team of 3 students using Python. The guidelines require specific tasks to be addressed. Results must be reported in a single paper of at most 25 pages of text, including figures. Students must deliver both the paper (single column) and well-commented Python notebooks.
  
 +  * The first part of the project consists of the **assignments** described here: {{ :magistraleinformatica:dmi:project_description_dm23-pub.pdf | Project Description}}
 +  - **Dataset: {{ :magistraleinformatica:dmi:gun-data.zip | Dataset Files}}** 
 +  - **Deadline**: the first part must be delivered by <del> November 19th, 2023</del> November 26th, 2023, by email to: anna.monreale@unipi.it, lorenzo.mannocci@phd.unipi.it
 + 
 +  * The second part of the project consists of the assignment described here: {{ :magistraleinformatica:dmi:project_description_dm23-pub-updated.pdf |Updated Project Description}}
 +     - **Deadline**: Jan 8, 2024 
  
 +  * The third part of the project consists of the assignment described here: {{ :magistraleinformatica:dmi:project_description_dm23-pub-complete.pdf |Updated Project Description}}
 +   - **Deadline**:   Jan 8, 2024 
  
  
 +**Students who do not deliver the above project by Jan 8, 2024 must ask the teachers by email for a new project. The newly assigned project will require about 20 days of work and, once delivered, will be discussed during the oral exam.**
  
 +** Paper Presentation (OPTIONAL)**
  
 +Students may present a research paper (made available by the teacher) during the last week of the course. This presentation is OPTIONAL: students who give it are exempt from the open questions on the entire program in the oral exam. They only need to present the project (see next point) and answer open questions on the topics not covered by the project. The paper presentation can be given by the group or by a single person.
  
 +**Oral Exam**
 +  * **Project presentation** (with slides) – 10-15 minutes: mandatory for all students, with questions to check understanding of the details of any part of the project.
 +  * ** Open questions on the entire program **: for students who do not opt for the paper presentation.
 +  * ** Open questions on the topics not covered by the project **: only for students opting for the paper presentation.
 +  * Group presentations of the project are preferred. If this is not possible, please contact me to find a solution.
  
 +**How to book the exam colloquium?**
 + 
 +At https://esami.unipi.it/ you can find the exam dates: one in January and one in February. Each student must register for one of the two dates. These are not the dates of the colloquium or of the project delivery; we will use the list of registered students to organize the exam dates. After the registration deadline, we will share a calendar for the oral exam.
  
- 
-====== Exams ====== 
-**Project** 
- 
-A project consists in data analyses based on the use of data mining tools.  
-The project has to be performed by a team of 2/3 students. It has to be performed by using Python. The guidelines require to address specific tasks. Results must be reported in a unique paper. The total length of this paper must be max 20 pages of text including figures. The students must deliver both: paper (single column) and  well commented Python Notebooks. 
- 
-  * First part of the project consists in the **assignments** described here: {{ :magistraleinformatica:dmi:dm-projectdescriptionpart1.pdf | Project Description}} 
-  * **Dataset:** {{ :magistraleinformatica:dmi:customer_supermarket.csv.zip |}}  
-  * **Deadline**: the first part has to be delivered within  <del>November, 5th 2020.</del> ** November, 12 2020. ** 
-  * Second part of the project consists in the **assignment Task 3** described here: {{ :magistraleinformatica:dmi:project_description.pdf |Updated Project Description}} 
-  * **Deadline**: the second part has to be delivered within  ** December, 30 2020 or after (to be decided with students!) ** 
- 
-**Paper Presentation (OPTIONAL)** 
- 
-Students need to present a research paper (made available by the teacher) during the last week of the course. This presentation is OPTIONAL: Students that decide to do the paper presentation can avoid the oral exam with open questions. They only need to present the project (see next point). 
- 
-**Oral Exam** 
-  * **Project presentation** (with slides) – 10 minutes: mandatory for all the students 
-  * ** Open questions ** on the entire program: optional only for students opting for paper presentation. 
-  
   
-====== Exam Dates ====== 
  
-TBD 
  
-===== Exam Sessions ===== 
-TBD 
  
 +====== Previous years =====
 +[[DM-INF 2022-2023]]
  
-===== Reading About the "Data Scientist" Job =====+[[DM-INF 2021-2022]]
  
-** ... a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the "sexiest" around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them. **+[[DM-INF 2020-2021]]
  
-//Data, data everywhere. The Economist, Special Report on Big Data, Feb. 2010.//+[[http://didawiki.cli.di.unipi.it/doku.php/dm/dm.2019-20|DM-2019/20]]
  
-  * Data, data everywhere. The Economist, Feb. 2010 {{:dm:economist--010.pdf|download}} 
-  * Data scientist: The hot new gig in tech, CNN & Fortune, Sept. 2011 [[http://tech.fortune.cnn.com/2011/09/06/data-scientist-the-hot-new-gig-in-tech/|link]] 
-  * Welcome to the yotta world. The Economist, Sept. 2011 {{:dm:economist-2012-dm.pdf|download}} 
-  * Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review, Sept 2012 [[http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1|link]] 
-  * Il futuro è già scritto in Big Data. Il Sole 24 Ore, Sept 2012 [[http://www.ilsole24ore.com/art/tecnologie/2012-09-21/futuro-scritto-data-155044.shtml?uuid=AbOQCOhG|link]] 
-  * Special issue of Crossroads - The ACM Magazine for Students - on Big Data Analytics {{:dm:crossroadsxrds2012fall-dl.pdf|download}} 
-  * Peter Sondergaard, Gartner, Says Big Data Creates Big Jobs. Oct 22, 2012: [[https://www.youtube.com/watch?v=mXLy3nkXQVM|YouTube video]] 
  
-  * Towards Effective Decision-Making Through Data Visualization: Six World-Class Enterprises Show The Way. White paper at FusionCharts.com. [[http://www.fusioncharts.com/whitepapers/downloads/Towards-Effective-Decision-Making-Through-Data-Visualization-Six-World-Class-Enterprises-Show-The-Way.pdf|download]] 
- 
-====== Previous years ===== 
-[[http://didawiki.cli.di.unipi.it/doku.php/dm/dm.2019-20|DM-2019/20]] 
  
magistraleinformatica/dmi/start.1606505876.txt.gz · Last modified: 27/11/2020 at 19:37 (3 years ago) by Anna Monreale