bigdataanalytics:bda:start
Differences
These are the differences between the selected revision and the current version of the page.
bigdataanalytics:bda:start [04/11/2020 at 18:22 (4 years ago)] – [Big Data Analytics A.A. 2020/21] Luca Pappalardo | bigdataanalytics:bda:start [04/11/2022 at 12:21 (23 months ago)] (current version) – Salvatore Ruggieri | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | <!-- Google Analytics --> | + | ====== Big Data Analytics A.A. 2022/23 ====== |
- | <!-- End Google Analytics --> | + | This year, the course 599AA Big Data Analytics |
- | <!-- Capture clicks --> | + | ====== |
- | ====== Big Data Analytics | + | |
- | **WARNING**: | + | [[bigdataanalytics:bda: |
- | **ATTENZIONE**: | + | [[bigdataanalytics: |
- | + | ||
- | + | ||
- | Instructors: | + | |
- | * **Luca Pappalardo, Fosca Giannotti** | + | |
- | * KDD Laboratory, Università di Pisa and ISTI-CNR, Pisa | + | |
- | * [[http:// | + | |
- | * [[luca.pappalardo@isti.cnr.it]] | + | |
- | * [[fosca.giannotti@isti.cnr.it]] | + | |
- | + | ||
- | Timetable (http:// | + | |
- | * Monday 16:15 - 18:00 Aula WDS/1 | + | |
- | * Tuesday 16:15 - 18:00 Aula WDS/1 | + | |
- | + | ||
- | Team Registration: | + | |
- | + | ||
- | __**For students without a team**__: by September 30th, send an email to Luca Pappalardo to report that you do not have a team. | + | |
- | + | ||
- | **__Only for the registered teams__**, express your preference for the datasets by September 30th https:// | + | |
- | + | ||
- | __**Dataset assignment**__: | + | |
- | + | ||
- | **Instructions for mid term 1**: The first mid term presentation (data understanding and project proposal) will be on October 19th (BigProblem, | + | |
- | + | ||
- | * // | + | |
- | * //report//: the report must be written in LaTeX, using this template: {{ :bigdataanalytics: | + | |
- | * //code//: the Python code used to generate the computations and the plots, in .ipynb (Jupyter Notebook) or .py format, must be submitted through the Google form. Please document your notebooks adequately using Markdown. | + | |
- | * //Google form//: upload the material __**by October 18th**__ using this form: https:// | + | |
- | * name the files using the format '' | + | |
- | + | ||
- | + | ||
- | **Instructions for mid term 2**: The second mid term presentation (model(s) implementation and evaluation) will be on November 16th (BigProblem, | + | |
- | + | ||
- | * // | + | |
- | * //report//: the report must be written in LaTeX, using the same template as the first mid term: {{ : | + | |
- | * //code//: the Python code used to generate the computations and the plots, in .ipynb (Jupyter Notebook) or .py format, must be submitted through the Google form. Please document your notebooks adequately using Markdown. | + | |
- | * //Google form//: upload the material __**by November 15th**__ using this form: https:// | + | |
- | * name the files using the format '' | + | |
- | + | ||
- | **Paper presentation: | + | |
- | + | ||
- | * each student will present a paper on Big Data Analytics in a talk of at most **7 minutes** (plus 3 minutes for questions). The paper presentations are scheduled for __November 23rd and 24th__. | + | |
- | * Express your preference for 5 papers here: https:// | + | |
- | * During the presentation (with slides) you should highlight the following aspects: the data set used, the feature engineering and/or selection (if any), the problem addressed, the models/ | + | |
- | * __**The paper assigned to each student, and the date of presentation, | + | |
- | + | ||
- | + | ||
- | Examples of projects from past years: | + | |
- | * Credit Risk Prediction, final report: {{ : | + | |
- | * Ted Talks, final report: {{ : | + | |
- | + | ||
- | ====== Learning goals ====== | + | |
- | + | ||
- | In our digital society, every human activity is mediated by information technologies, | + | |
- | This course has three objectives: | + | |
- | + | ||
- | * introducing the emerging field of big data analytics and social mining; | + | |
- | * introducing the technological scenario of big data, such as programming tools to analyze big data, query NoSQL databases, and perform predictive modeling; | + | |
- | * guiding students in the development of an open-source and reproducible big data analytics project, based on the analysis of real-world datasets. | + | |
- | + | ||
- | ====== Module 1: Big Data Analytics and Social Mining ====== | + | |
- | In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics: | + | |
- | + | ||
- | * The Big Data Scenario and the new questions to be answered | + | |
- | * Sport Analytics: | + | |
- | - Soccer data landscape and injury prediction | + | |
- | - Analysis and evolution of sports performance | + | |
- | * Mobility Analytics | + | |
- | - Mobility data landscape and mobility data mining methods | + | |
- | - Understanding Human Mobility with vehicular sensors (GPS) | + | |
- | - Mobility Analytics: Novel Demography with mobile-phone data | + | |
- | * Social Media Mining | + | |
- | - The social media data landscape: Facebook, LinkedIn, Twitter, Last.fm | + | |
- | - Sentiment analysis: an example from human migration studies | + | |
- | - Discussion on ethical issues of Big Data Analytics | + | |
- | * Well-being& | + | |
- | - Nowcasting influenza with retail market data | + | |
- | - Predicting well-being from human mobility patterns | + | |
- | * Paper presentations by students | + | |
- | + | ||
- | + | ||
- | ====== Module 2: Big Data Analytics Technologies ====== | + | |
- | This module will provide students with the technologies to collect, manipulate, and process big data. In particular, the following tools will be presented: | + | |
- | + | ||
- | * Python for Data Science | + | |
- | * The Jupyter Notebook: developing open-source and reproducible data science | + | |
- | * MongoDB: fast querying and aggregation in NoSQL databases | + | |
- | * GeoPandas: analyze geo-spatial data with Python | + | |
- | * Scikit-learn: | + | |
- | * Keras: deep learning in Python | + | |
- | + | ||
- | + | ||
- | ====== Module 3: Laboratory for Interactive Project Development | + | |
- | During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Discussions and presentations will take place in class at different stages of the project. | + | |
- | + | ||
- | * 1st Mid Term: Data Understanding and Project Formulation | + | |
- | * 2nd Mid Term: Model(s) construction and evaluation | + | |
- | * 3rd Mid Term: Model interpretation/ | + | |
- | * Exam: Final Project results | + | |
- | + | ||
- | ====== Calendar ====== | + | |
- | + | ||
- | 14/09 (Mod. 1) Introduction to the course, The Big Data scenario {{ : | + | |
- | + | ||
- | 15/09 (Mod. 2) Python for Data Science and the Jupyter Notebook: developing open-source and reproducible data science | + | |
- | * How to install Jupyter notebook: https:// | + | |
- | * Python notebooks: http:// | + | |
- | + | ||
- | 21/09 No Lesson (Election Day in Italy) | + | |
- | + | ||
- | 22/09 (Mod. 3) Presentation of datasets for projects {{ : | + | |
- | + | ||
- | 28/09 (Mod. 2) Scikit-learn: | + | |
- | + | ||
- | 29/09 | + | |
- | * (Mod. 2) Scikit-learn: | + | |
- | * (Mod. 1) Reproducing and Explaining Human Evaluations of Soccer Performance with Artificial Intelligence {{ : | + | |
- | + | ||
- | 05/10 No Lesson (SocInfo2020 conference) | + | |
- | + | ||
- | 06/10 No Lesson (SocInfo2020 conference) | + | |
- | + | ||
- | 12/10 (Mod. 2) GeoPandas and scikit-mobility: | + | |
- | + | ||
- | 13/10 (Mod. 2) GeoPandas and scikit-mobility: | + | |
- | + | ||
- | 19/10 (Mod. 3) **1st Mid Term** - first group of teams | + | |
- | + | ||
- | 20/10 (Mod. 3) **1st Mid Term** - second group of teams | + | |
- | + | ||
- | 26/10 (Mod. 3) // | + | |
- | + | ||
- | 27/10 (Mod. 3) // | + | |
- | + | ||
- | 02/11 (Mod. 1) Nowcasting well-being with big data {{ : | + | |
- | + | ||
- | 03/11 (Mod. 1) Injury prediction in sports with AI {{ : | + | |
- | + | ||
- | 06/11 (Mod. 1) Trustworthy data mining | + | |
- | + | ||
- | 16/11 (Mod. 3) **2nd Mid Term** - first group of teams | + | |
- | + | ||
- | 17/11 (Mod. 3) **2nd Mid Term** - second group of teams | + | |
- | + | ||
- | 23/11 (Mod. 3) Paper presentations | + | |
- | + | ||
- | 24/11 (Mod. 3) Paper presentations | + | |
- | + | ||
- | 30/11 (Mod. 3) // | + | |
- | + | ||
- | 01/12 (Mod. 3) **3rd Mid Term** - first group of teams | + | |
- | + | ||
- | 07/12 (Mod. 3) **3rd Mid Term** - second group of teams | + | |
- | + | ||
- | + | ||
- | + | ||
- | + | ||
- | ===== Exam ===== | + | |
- | TBC | + | |
- | + | ||
- | ====== Previous Big Data Analytics websites ====== | + | |
[[bigdataanalytics: | [[bigdataanalytics: |
bigdataanalytics/bda/start.1604514123.txt.gz · Last modified: 04/11/2020 at 18:22 (4 years ago) by Luca Pappalardo