WARNING: All lectures of the First Semester of the academic year 2020/21, until 31/12/2020, will be provided exclusively remotely, through the Teams team named “599AA 20/21 - BIG DATA ANALYTICS [WDS-LM]” (https://bit.ly/35yJ65c).
Instructors:
Timetable (http://bit.ly/unipi_timetable_2020)
Team Registration: form teams of 3 or 4 students and register your team by September 27th: https://forms.gle/rbsV4dF6RuAnCBWz9
For students without a team: email Luca Pappalardo by September 30th to let him know that you do not have a team.
Registered teams only: express your dataset preferences by September 30th: https://forms.gle/HVheaScCgQJw4o616
Dataset assignment: at the following link, each team can find the dataset assigned for its project: https://bit.ly/33eTfC9
Instructions for mid term 1: The first mid term presentation (data understanding and project proposal) will be on October 19th (BigProblem, Global, MMG, I TeamIDI) and October 20th (Bei Dati Acrobatici, Malucs, AMS Group).
File name format: midterm1_teamname_type, where teamname is the name of the team (do not use spaces, use lowercase only) and type is the type of the file (i.e., presentation, report, or code). Examples: midterm1_iteamidi_presentation.pdf, midterm1_beidatiacrobatici_report.zip, midterm1_amsgroup_code.ipynb
Instructions for mid term 2: The second mid term presentation (model(s) implementation and evaluation) will be on November 16th (BigProblem, Global, MMG, I TeamIDI) and November 17th (Bei Dati Acrobatici, Malucs, AMS Group).
File name format: midterm2_teamname_type, where teamname is the name of the team (do not use spaces, use lowercase only) and type is the type of the file (i.e., presentation, report, or code). Examples: midterm2_iteamidi_presentation.pdf, midterm2_beidatiacrobatici_report.zip, midterm2_amsgroup_code.ipynb
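The naming convention for the mid-term files can be checked with a small helper. This is only a minimal sketch: the function name `submission_name` and its parameters are hypothetical, and it assumes that "do not use spaces" means whitespace is simply dropped from the team name. The file extension is not prescribed by the instructions, so it is passed in explicitly.

```python
import re

# File types named in the instructions.
ALLOWED_TYPES = ("presentation", "report", "code")


def submission_name(midterm: int, team: str, file_type: str, extension: str) -> str:
    """Build a name of the form midterm<N>_<teamname>_<type>.<extension>.

    Hypothetical helper: whitespace in the team name is removed and the
    result is lowercased, which is one reading of the naming rule.
    """
    if file_type not in ALLOWED_TYPES:
        raise ValueError(f"file_type must be one of {ALLOWED_TYPES}, got {file_type!r}")
    # Remove all whitespace, then lowercase the team name.
    teamname = re.sub(r"\s+", "", team).lower()
    # Accept the extension with or without a leading dot.
    return f"midterm{midterm}_{teamname}_{file_type}.{extension.lstrip('.')}"


if __name__ == "__main__":
    print(submission_name(1, "I TeamIDI", "presentation", "pdf"))
    print(submission_name(2, "Bei Dati Acrobatici", "report", ".zip"))
```

Called with the team names listed above, the helper reproduces the example file names given in the instructions.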
Paper presentation:
Examples of projects from past years:
In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer-logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” to understand the complexity of our societies. The analysis of big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:
In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:
This module will provide students with the technologies to collect, manipulate and process big data. In particular, the following tools will be presented:
During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Teams will discuss and present their work in class at different stages of the project.
14/09 (Mod. 1) Introduction to the course, The Big Data scenario lesson1_introduction_to_the_course_bda2021.pdf
15/09 (Mod. 2) Python for Data Science and the Jupyter Notebook: developing open-source and reproducible data science
21/09 No Lesson (Election Day in Italy)
22/09 (Mod. 3) Presentation of datasets for projects bda20_21_datasets_1_.pdf
28/09 (Mod. 2) Scikit-learn: programming tools for data mining (part 1): http://bit.ly/bda_notebooks_2
29/09
05/10 No Lesson (SocInfo2020 conference)
06/10 No Lesson (SocInfo2020 conference)
12/10 (Mod. 2) Geopandas and scikit-mobility: managing geographic data in Python (part 1) bda2021_geopandas.zip
13/10 (Mod. 2) Geopandas and scikit-mobility: managing geographic data in Python (part 2) https://github.com/scikit-mobility/tutorials/tree/master/mda_masterbd2020
19/10 (Mod. 3) 1st Mid Term - first group of teams
20/10 (Mod. 3) 1st Mid Term - second group of teams
26/10 (Mod. 3) Discussion and group working on projects
27/10 (Mod. 3) Discussion and group working on projects
02/11 (Mod. 1) Nowcasting well-being with big data bda_wellbeing.pdf
03/11 (Mod. 1) Injury prediction in sports with AI bda_2020_injury_forecasting.pdf
06/11 (Mod. 1) Trustworthy data mining
16/11 (Mod. 3) 2nd Mid Term - first group of teams
17/11 (Mod. 3) 2nd Mid Term - second group of teams
23/11 (Mod. 3) Paper presentations
24/11 (Mod. 3) Paper presentations
30/11 (Mod. 3) Discussion and group working on projects
01/12 (Mod. 3) 3rd Mid Term - first group of teams
07/12 (Mod. 3) 3rd Mid Term - second group of teams
TBC