====== LABORATORY OF DATA SCIENCE (2020/2021) ======

Instructors:
  * Anna Monreale
    * KDD Laboratory, Università di Pisa
    * anna [dot] monreale [at] unipi [dot] it
    * Office hours: Wednesday 11:00-13:00, online using Teams (appointment by email).
    * Telephone +39-050-2213119
  * Roberto Pellungrini
    * KDD Laboratory, Università di Pisa
    * roberto [dot] pellungrini [at] di [dot] unipi [dot] it
    * Office hours: Thursday 14:00-16:00, online using Teams (appointment by email).
    * Telephone +39-050-2212728

====== News ======
  * [10/11/2020] Instructions for the SSAS project in today's lecture: to avoid conflicts in deployment/processing, follow these steps once the solution is opened: (1) rename the project as <your account>_foodmart; (2) from the project properties select 'Deployment', then rename the database as <your account>_foodmart; (3) click the "Show All Files" button just above "Solution Explorer", right-click the .database file that appears and select "View Code", then change the ID from ruggieri_foodmart to <your account>_foodmart, and finally save the file; (4) change the credentials of the connection to the database on SQL Server. Alternatively, you may import the project from the SSAS server and rename it as <your account>_foodmart (step 4 is still necessary).
  * [13/09/2020]: The lecture will be online. You can join the class by using this link:

====== Hours and Rooms ======

Classes

Lessons will be held online on the Teams platform.

^ Day of Week ^ Hour ^ Room ^
| Tuesday | 11:00 - 12:45 | Teams |
| Thursday | 11:00 - 12:45 | Teams |

Link to Teams module:

====== Learning Material ======

===== Slides & Recordings of the classes =====

  * The slides used in the course will be added to the calendar after each class.
  * A recording of each lecture will be available on Teams.

===== Past Exams =====

  * 2016/17 text, 2015/16 text and 2015/16 solution, 2014/15 text and 2014/15 solution, 2013/14 text, 2012/13 text and 2012/13 solution.

===== Software =====

  * Anaconda with Python 3.7 (please avoid Python 3.8)
  * SQL Server 2019 Developer Edition: SQL Server 2019 Management Studio.
  * For Data Tools we will publish instructions soon.
  * Microsoft Excel
  * Power BI Desktop

===== F.A.Q. =====

  * Connection to wi-fi
  * F.A.Q.s about the labs

====== Class calendar - (2020-2021) ======

^ ^ Day ^ Topic ^ Slides ^ Data/Software ^ References ^ Teacher ^
| 1. | 15.09 11:00-12:45 | Introduction. File data access. Representation formats: CSV, FLV, ARFF, XML | 2020-lds.01.introduction.pptx.pdf 2020-lds.02.bi_architectures.pptx.pdf 2020-lds.03.file_data_access.pptx.pdf | | - BI technology: An Overview of Business Intelligence Technology - File access: File System Interface - File Formats: Introduction to data technologies (Chps. 5, 6), Weka ARFF Format, XRFF Format | Monreale |
| 2. | 17.09 11:00-13:00 | Python Recap | 2020-lds.04.python.pptx.pdf | | Free Python book: | Pellungrini |
| 3. | 22.09 11:00-13:00 | File Access in Python | 2020-lds.05.fileaccess-python.pptx.pdf | Collection of files | | Pellungrini |
| 4. | 24.09 11:00-13:00 | Lab practice: XML2CSV/CSV2JSON file format conversion | | | | Pellungrini |
| 5. | 29.09 11:00-13:00 | Python Exercises | ex-customers.pdf | | | Pellungrini |
| 6. | 01.10 11:00-13:00 | RDBMS access protocols: ODBC, OLE DB, JDBC. ODBC Programming. | lbi.06.relational_data_access-complete.pdf | | | Monreale |
| 7. | 06.10 11:00-13:00 | RDBMS access protocols: ODBC, OLE DB, JDBC. ODBC Programming. | lbi.06.relational_data_access-complete.pdf | | | Monreale |
| 8. | 08.10 11:00-13:00 | Stratified Sampling Exercise, SQL Management Studio Demo, Project Explanation | lds.07.sqlserver.pdf | | | Monreale, Pellungrini |
| 9. | 13.10 11:00-13:00 | ETL, SQL Server Data Tools Demo | lds.08.etlandssis.pdf | | | Pellungrini |
| 10. | 15.10 11:00-13:00 | SSIS exercises | | ex-midterm.pdf | | Pellungrini |
| 11. | 20.10 11:00-13:00 | Stratified sampling with SSIS + SSIS practice | | | | Monreale/Pellungrini |
| 12. | 22.10 11:00-13:00 | SSIS practice + Project support | | 2015midterm1text.pdf | | Monreale/Pellungrini |
| 13. | 27.10 11:00-13:00 | SSIS: Surrogate Keys | | | | Monreale/Pellungrini |
| 14. | 29.10 11:00-13:00 | SSIS: Slowly changing dimensions | | | | Monreale/Pellungrini |
| 15. | 03.11 11:00-13:00 | Data warehousing and OLAP recap. Data cubes, analytic SQL, and materialized views in SQL Server. | lds.09.dwandolap.pdf | | | Monreale/Pellungrini |
| 16. | 05.11 11:00-13:00 | OLAP with SQL Server Analysis Services (SSAS): data source views, dimensions | lds.09.dwandolap.pdf lds.10.ssas.pdf | | 1) SSAS (OLAP): documentation; 2) S. Harinath et al. Professional Microsoft SQL Server Analysis Services 2012 with MDX and DAX, Wrox publisher, 2012. Chps. 4-6. | Monreale/Pellungrini |
| 17. | 10.11 11:00-13:00 | OLAP with SQL Server Analysis Services (SSAS): dimensions, hierarchies. Data cubes, parent-child hierarchies. OLAP explorative data analysis with Pivot Tables in Excel. | lds.10.ssas.pdf | Notice: please read the instructions in the News section! | Pivot Tables in Excel: G. Harvey. Excel 2013 All-in-One For Dummies, 2013. Chp. VII-2. | Monreale/Pellungrini |
| 18. | 12.11 11:00-13:00 | OLAP explorative data analysis with Pivot Tables in Excel. | | foodmartexplorative.xlsx | | Monreale/Pellungrini |
| 19. | 17.11 11:00-13:00 | Introduction to MDX | | | MDX: 1) documentation and a useful guide on ordering; 2) S. Harinath et al. Professional Microsoft SQL Server Analysis Services 2012 with MDX and DAX, Wrox publisher, 2012. Chp. 3. | Monreale/Pellungrini |
| 20. | 19.11 11:00-13:00 | Introduction to MDX | | mdx-ex.pdf | | Monreale |
| 21. | 24.11 11:00-13:00 | Practice on MDX | | | | Pellungrini |
| 22. | 25.11 11:00-13:00 | Project check | | Please practice the MDX exercises. In ex-mdx.pdf you can find other queries that we will solve during the next lectures. | | Monreale/Pellungrini |
| 23. | 01.12 11:00-13:00 | MicroStrategy presentation | | | | Monreale/Pellungrini |
| 24. | 03.12 11:00-13:00 | Power BI Desktop + Correction of MDX exercises | lds.12.powerbi.pdf | | | Monreale/Pellungrini |
| 25. | 10.12 11:00-13:00 | MicroStrategy Viz | | If you need the password, please check the common chat in Teams or write an email to the teacher. | | Monreale/Pellungrini |

====== Exams ======

PROJECT

The project consists of a set of assignments corresponding to a BI process: data integration, construction of an OLAP cube, querying of an OLAP cube, and reporting. The project must be carried out by a team of 2 students (at most 3, subject to prior authorization from the teachers).

  * The first part of the project consists of the assignments described here: Project Description Part 1
  * The second part of the project consists of the assignments described here: Project Description Part 2
  * The third part of the project consists of the assignments described here: Project Description Part 3
  * Remember to re-submit all three parts of the project with your third part, as specified in the document above.
  * Dataset:
  * Deadline: the first part has to be delivered by November 18th, 2020.
  * Deadline: the second part has to be delivered by December 4th, 2020.
  * Deadline: the third part has to be delivered by December 31st, 2020.

===== Mid-term exams =====

===== Exam sessions =====

^ Session ^ Date ^ Time ^ Room ^ Notes ^ Marks ^

===== Extra sessions A.A. 2019/20 =====

^ Date ^ Time ^ Room ^ Notes ^ Results ^
| | | | | |

===== Past Editions =====

  * LABORATORY OF DATA SCIENCE (2019/2020)
  * LABORATORY OF DATA SCIENCE (2018/2019)
  * BUSINESS INTELLIGENCE LAB (2017/2018)
  * LBI 2016/2017

