====== 783AA Geospatial Analytics A.A. 2022/23 ======

**[WARNING]**: This course replaces "Big Data Analytics" from the academic year 2022/23 onwards.

===Instructors:===

  * **Luca Pappalardo**
    * [[luca.pappalardo@isti.cnr.it]]
    * KDD Laboratory, ISTI-CNR, Pisa
    * [[http://www-kdd.isti.cnr.it]]
  * **Mirco Nanni**
    * [[mirco.nanni@isti.cnr.it]]
    * KDD Laboratory, ISTI-CNR, Pisa
    * [[http://www-kdd.isti.cnr.it]]

===Tutors:===

  * **Giuliano Cornacchia**, PhD student, University of Pisa
  * **Giovanni Mauro**, PhD student, University of Pisa
  * **Daniele Gambetta**, PhD student, University of Pisa

===== Hours and Rooms =====

^ Day of Week ^ Hour ^ Room ^
| Thursday | 11:00 - 13:00 | Room Fib E |
| Friday | 09:00 - 11:00 | Room Fib C1 |

  * Beginning of lectures: 15 September 2022
  * End of lectures: 2 December 2022
  * Possible make-up lessons: 5–16 December 2022

__**The lectures will be held in person only and will NOT be live-streamed**__

A Telegram channel will be used to post news and other information related to the course: {{:geospatialanalytics:gsa:qr_tmp.png?100|}}

===== NEWS =====

  * Master theses available on Geospatial Analytics. Contact the professors for further information.
  * Exam dates: January 27th, 2023 and February 24th, 2023, in Aula Faedo at [[https://goo.gl/maps/sLegq2cHNpowefSi8 | ISTI-CNR ]], Pisa.
  * **Exam instructions**:
    * each student will present the project, i.e., the submitted notebook(s), in at most 15 minutes
    * during/after the presentation, we'll ask some questions about the project
    * and some questions about the course in general (mainly about the topic related to the project)
    * (for the computationally expensive code blocks, please pre-run them and show the outputs only).
  * {{ :geospatialanalytics:gsa:projects.pdf |List of projects}}
  * {{ :geospatialanalytics:gsa:projects_assignment.pdf | Projects assignment}}
  * {{ :geospatialanalytics:gsa:valutazioni_esercizi_-_release_1.x_2.x.pdf |Evaluation of homeworks 1.x and 2.x}}
  * {{ :geospatialanalytics:gsa:valutazioni_esercizi_-_release_3.x.pdf |Evaluation of homeworks 3.x}}
  * {{ :geospatialanalytics:gsa:valutazioni_esercizi_-_release_4.x_6.x_8.x.pdf | Evaluation of homeworks 4.x, 6.x, and 8.x}}
  * {{ :geospatialanalytics:gsa:valutazioni_esercizi_-_release_5.x_7.x.pdf | Evaluation of homeworks 5.x and 7.x}}
  * {{ :geospatialanalytics:gsa:valutazioni_esercizi_-_release_9.x_10.x.pdf | Evaluation of homeworks 9.x and 10.x}}

====== Learning goals ======

The analysis of geographic information, such as data describing human movements, is crucial due to its impact on several aspects of our society, such as disease spreading (e.g., the COVID-19 pandemic), urban planning, well-being, pollution, and more. This course will teach the fundamental concepts and techniques underlying the analysis of geographic and mobility data, presenting data sources (e.g., mobile phone records, GPS traces, geotagged social media posts), data preprocessing techniques, statistical patterns, predictive and generative algorithms, and real-world applications (e.g., diffusion of epidemics, socio-demographics, link prediction in social networks). The course will also provide a practical perspective through the use of advanced geoanalytics Python libraries.

The assessment of the course consists of: (1) an oral exam, aimed at testing the knowledge acquired by the student during the course; (2) exercises to be done during the course; (3) the development of a project to test the practical ability acquired during the course.
Topics:

  * Spatial Reference Systems
  * Data formats
  * Trajectories and Flows
  * Spatial Tessellations
  * Open-source tools for geospatial analysis
  * Digital spatial and mobility data
  * Preprocessing mobility data
  * Privacy issues in mobility data
  * Individual and collective mobility laws
  * Next-location and flow prediction
  * Trajectory and flow generation
  * Applications

===== Module 1: Spatial and Mobility Data =====

  * Fundamentals of Geographical Information Systems
    * Geographic coordinate systems
    * Vector data model
    * Trajectories
    * Spatial Tessellations
    * Flows
    * **Practice**: Python packages for geospatial analysis (Shapely, GeoPandas, folium, scikit-mobility)
  * Digital spatial and mobility data
    * Mobile Phone Data
    * GPS data
    * Social media data
    * Other data (POIs, Road Networks, etc.)
    * **Practice**: reading and exploring spatial and mobility datasets in Python
  * Preprocessing mobility data
    * filtering and compression
    * stop detection
    * trajectory segmentation
    * trajectory similarity and clustering
    * **Practice**: data preprocessing with scikit-mobility

===== Module 2: Mobility Patterns and Laws =====

  * Individual mobility laws/patterns
  * Collective mobility laws/patterns
  * **Practice**: analyze mobility data with Python

===== Module 3: Predictive and Generative Models =====

  * Prediction
    * Next-location prediction
    * Crowd flow prediction
    * Spatial interpolation
  * Generation
    * Trajectory generation
    * Flow generation
  * **Practice**: mobility prediction and generation in Python

===== Module 4: Applications =====

  * Epidemic spreading (COVID-19)
  * Urban segregation models
  * Routing and navigation apps
  * Traffic simulation with SUMO

====== Calendar ======

^ ^ Day ^ Topic ^ Slides/Code ^ Material ^ Teacher ^
|1. 
|15/09 09:00-11:00| Introduction to the Course | **[slides]** {{ :geospatialanalytics:gsa:lesson_00_-_about_the_course.pdf | Course modalities}}; **[slides]** {{ :geospatialanalytics:gsa:lesson_01_-_introduction.pdf | Intro to Geospatial Analytics}} | **[book chapter]** [[ https://www.amazon.it/Introduction-geographic-information-systems-Kang-Tsung/dp/0078095131 | Introduction to geographic information systems]], Chapter 1; **[paper]** [[https://arxiv.org/pdf/1710.00004.pdf | Human Mobility: Models and Applications]], Section 1 | Pappalardo, Nanni |
|2. |16/09 14:00-16:00| Fundamental Concepts - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_02_-_fundamental_concepts.pdf | Fundamental concepts}} | **[book chapter]** [[ https://www.amazon.it/Introduction-geographic-information-systems-Kang-Tsung/dp/0078095131 | Introduction to geographic information systems]], Chapter 2 (Coordinate Systems); **[paper]** [[https://arxiv.org/abs/2012.02825 | A survey of deep learning for human mobility]], Section 2.1, Appendix A; [[https://saylordotorg.github.io/text_essentials-of-geographic-information-systems/s08-02-vector-data-models.html | Essentials of Geographic Information Systems, Chapter 4, Section 4.2 (Vector Data Models)]]; **[video]** [[https://www.youtube.com/watch?v=HnWNhyxyUHg | Intro to coordinate systems and UTM projection]] | Pappalardo |
|3. 
|22/09 11:00-13:00| Fundamental Concepts - part 2 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson2_fundamental_concepts | Introduction to shapely, geopandas, folium, and scikit-mobility]] | **[book chapter]** [[ https://autogis-site.readthedocs.io/en/latest/notebooks/L1/geometric-objects.html | Automating GIS-processes, Lesson 1 (Shapely and geometric objects)]]; **[article]** [[ https://www.learndatasci.com/tutorials/geospatial-data-python-geopandas-shapely/ | Analyze Geospatial Data in Python: GeoPandas and Shapely]]; **[paper]** [[https://www.jstatsoft.org/article/view/v103i04 | scikit-mobility: a Python library for the Analysis, Generation, and Risk Assessment of Mobility Data]], Sections 1, 2 | Pappalardo |
|4. |23/09 09:00-11:00| Geographic and Mobility data - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_03_-_spatial_and_mobility_data.pdf | Geospatial and Mobility data}} | **[paper]** [[ https://arxiv.org/abs/2012.02825 | A survey of deep learning for human mobility ]], Appendix C.1, C.2, C.3; **[paper]** [[ https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-021-00284-9 | Evaluation of home detection algorithms on mobile phone data using individual-level ground truth ]], Sections "Introduction" and "Mobile phone datasets"; **[paper]** [[ https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-015-0046-0 | A survey of results on mobile phone datasets analysis ]], Sections "Introduction", "Adding space - geographical networks"; **[paper]** [[ https://www.kdd.org/exploration_files/June_2019_-_1._Urban_Human_Mobility,_Data_Drive_Modeling_and_Prediction_.pdf | Urban Human Mobility: Data-Driven Modeling and Prediction]], Section 2.2 | Pappalardo |
|5. 
|29/09 11:00-13:00| Geographic and Mobility data - part 2 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson3_spatial_and_mobility_data | Geospatial and Mobility data in Python]] | **[paper]** [[https://www.jstatsoft.org/article/view/v103i04 | scikit-mobility: a Python library for the Analysis, Generation, and Risk Assessment of Mobility Data]], Section 4; **[video]** [[ https://www.youtube.com/watch?v=FjJZsaHHuvw | scikit-mobility data module]]; **[tutorial]** [[https://geoffboeing.com/2016/11/osmnx-python-street-networks/| OSMnx: Python for Street Networks]]; **[paper]** [[ https://www.sciencedirect.com/science/article/pii/S0198971516303970?via%3Dihub | OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks]]; **[book chapter]** [[ https://automating-gis-processes.github.io/CSC/notebooks/L3/retrieve_osm_data.html | Intro to Python GIS, Retrieving OpenStreetMap data ]] | Pappalardo |
|6. |06/10 11:00-13:00| Data preprocessing - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_04_-_preprocessing.pdf |Trajectory preprocessing}} | **[paper]** [[https://journals.sagepub.com/doi/pdf/10.1177/15501477211050729?download=true|Review and classification of trajectory summarisation algorithms: From compression to segmentation]]; **[paper]** [[http://www2.ipcku.kansai-u.ac.jp/~yasumuro/M_InfoMedia/paper/Douglas73.pdf|Algorithms for the reduction of the number of points required to represent a digitized line or its caricature (Douglas-Peucker)]]; **[paper]** [[https://dl.acm.org/doi/pdf/10.1145/1810959.1811019|Approximating the Fréchet Distance for Realistic Curves in Near Linear Time (Driemel–HarPeled–Wenk)]]; **[paper]** [[https://www.researchgate.net/publication/314207447_A_Trajectory_Segmentation_Map-Matching_Approach_for_Large-Scale_High-Resolution_GPS_Data|A Trajectory Segmentation Map-Matching Approach for Large-Scale, High-Resolution GPS Data]]; **[paper]** 
[[https://www.ismll.uni-hildesheim.de/lehre/semSpatial-10s/script/6.pdf|Hidden Markov Map Matching Through Noise and Sparseness]] | Nanni |
|7. |07/10 09:00-11:00| Data preprocessing - part 2 | **[slides]** {{ :geospatialanalytics:gsa:lesson_04-part2_-_preprocessing.pdf |Semantic Enrichment}} | **[paper]** [[https://eprints.gla.ac.uk/128784/1/128784.pdf|Analysis of human mobility patterns from GPS trajectories and contextual information]]; **[paper]** [[https://www.researchgate.net/publication/233197970_Using_Mobile_Positioning_Data_to_Model_Locations_Meaningful_to_Users_of_Mobile_Phones|Using Mobile Positioning Data to Model Locations Meaningful to Users of Mobile Phones]]; **[paper]** [[https://www.pnas.org/doi/10.1073/pnas.1408439111|Dynamic population mapping using mobile phone data]]; **[paper]** [[https://dl.acm.org/doi/abs/10.1145/2505821.2505830|Inferring human activities from GPS tracks]] | Nanni |
|8. |13/10 11:00-13:00| Data preprocessing - part 3 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson4_preprocessing | Preprocessing Mobility data]] | **[paper]** [[https://www.jstatsoft.org/article/view/v103i04 | scikit-mobility: a Python library for the Analysis, Generation, and Risk Assessment of Mobility Data]], Section 3 | Cornacchia |
|9. 
|14/10 09:00-11:00| Individual Human Mobility Laws and Models - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_05_-_individual_models.pdf | Individual mobility laws and models}} | **[paper]** [[ https://www.nature.com/articles/nature04292 | The scaling laws of human travel]]; **[paper]** [[ https://www.nature.com/articles/nature06958 | Understanding individual human mobility patterns]]; **[paper]** [[https://arxiv.org/pdf/1710.00004.pdf | Human Mobility: Models and Applications]], Sections 3.1 and 4; **[paper]** [[ https://www.nature.com/articles/ncomms9166 | Returners and Explorers dichotomy in Human Mobility]]; **[paper]** [[ https://barabasi.com/f/310.pdf | Limits of predictability in human mobility]]; **[paper]** [[ https://www.nature.com/articles/nphys1760 | Modelling the scaling properties of human mobility]] | Pappalardo |
|10. |20/10 11:00-13:00| Individual Human Mobility Laws and Models - part 2 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson5_mobilitylaws_and_models | Mobility laws and models]] | [[https://scikit-mobility.github.io/scikit-mobility/reference/measures.html | scikit-mobility documentation: measures]]; [[https://scikit-mobility.github.io/scikit-mobility/reference/models.html | scikit-mobility documentation: models]] | Cornacchia |
|11. 
|21/10 09:00-11:00| Mobility Patterns | **[slides]** {{ :geospatialanalytics:gsa:lesson_06_-_mobility_patterns.pdf |Mobility Patterns}} | **[paper]** [[https://dl.acm.org/doi/10.1145/3440207|A Survey on Trajectory Data Management, Analytics, and Learning]], Section 3; **[paper]** [[https://faculty.ist.psu.edu/jessieli/Publications/VLDB10-ZLi-Swarm.pdf|Swarm: Mining Relaxed Temporal Moving Object Clusters]]; **[paper]** [[https://dl.acm.org/doi/10.1145/1183471.1183479|Computing longest duration flocks in trajectory data]]; **[paper]** [[https://dl.acm.org/doi/10.1145/1281192.1281230|Trajectory pattern mining]]; **[paper]** [[https://www.researchgate.net/publication/225140109_On_Discovering_Moving_Clusters_in_Spatio-temporal_Data|On Discovering Moving Clusters in Spatio-temporal Data]] | Nanni |
|12. |27/10 11:00-13:00| Collective Mobility Laws and Models - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_07_-_collective_models.pdf | Collective mobility laws and models}} | **[paper]** [[ https://arxiv.org/abs/1710.00004 |Human Mobility: Models and Applications, Section 4.2]]; **[paper]** [[ https://www.jstor.org/stable/2087063|The P1 P2/D Hypothesis: On the Intercity Movement of Persons]]; **[paper]** [[https://www.jstor.org/stable/2084520|Intervening Opportunities: A Theory Relating Mobility and Distance]]; **[paper]** [[https://www.nature.com/articles/nature10856|A universal model for mobility and migration patterns]]; **[paper]** [[https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0037027|A tale of many cities: universal patterns in human urban mobility]]; **[paper]** [[https://arxiv.org/abs/1506.04889|Systematic comparison of trip distribution laws and models]]; **[paper]** [[https://www.nature.com/articles/s41467-021-26752-4|A Deep Gravity model for mobility flows generation]] | Pappalardo |
|13. |28/10 09:00-11:00| Guest lecture | Joys and sorrows of mobile phone records | | [[http://leoferres.info/ | Leo Ferres]] |
|14. 
|03/11 11:00-13:00| Next-Location Prediction | **[slides]** {{ :geospatialanalytics:gsa:lesson_08_-_location_prediction.pdf |Location prediction}}; **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson8_location_prediction | Markov-chain-based location prediction]] | [[https://hmmlearn.readthedocs.io/en/latest/|HMMlearn library]]; **[paper]** [[https://ieeexplore.ieee.org/document/8570749|Mobility Prediction: A Survey on State-of-the-Art Schemes and Future Applications]]; **[paper]** [[https://ieeexplore.ieee.org/document/9756903|A Survey on Trajectory-Prediction Methods for Autonomous Driving]], Sections IV and V; **[book chapter]** [[https://web.stanford.edu/~jurafsky/slp3/A.pdf|Speech and Language Processing]], Chapter A - Hidden Markov Models; **[paper]** {{ :geospatialanalytics:gsa:mcleod_1996_do_fielders_know_where_to_go_to_catch_the_ball_or_only_how_to_get_there.pdf |Do Fielders Know Where to Go to Catch the Ball...?}} | Nanni |
|15. |04/11 09:00-11:00| Collective Mobility Laws and Models - part 2 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson7_collectivelaws_and_models|Flow generation]] | [[https://scikit-mobility.github.io/scikit-mobility/reference/models.html#module-skmob.models.gravity|scikit-mobility documentation: Gravity]]; [[https://scikit-mobility.github.io/scikit-mobility/reference/models.html#module-skmob.models.radiation|scikit-mobility documentation: Radiation ]] | Pappalardo |
|16. 
|10/11 11:00-13:00| Spatial segregation models - part 1 | **[slides]** {{ :geospatialanalytics:gsa:lesson_09_-_segregation.pdf | Segregation models}} | **[paper]** [[https://www.tandfonline.com/doi/abs/10.1080/0022250X.1971.9989794 |Dynamic models of segregation, Schelling]]; **[paper]** [[https://www.sciencedirect.com/science/article/pii/S016726810700131X?casa_token=ty8IeJ02MJoAAAAA:fHKNYiVIob_xXq9aWtLU-djJPPyzb6Jwm--nzReSGIy5k-ekAtkkakjgnPVagnDuYyXJLT66pQ|Segregation in networks]]; **[paper]** [[https://www.nature.com/articles/s41467-021-24899-8|Mobility patterns are associated with experienced income segregation in large US cities]] | Mauro |
|17. |11/11 09:00-11:00| Spatial segregation models - part 2 (practice) | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson9_segregation|Implementing the Schelling model with MESA]] | **[tutorial]** [[https://mesa.readthedocs.io/en/latest/tutorials/intro_tutorial.html|Introduction to MESA]] | Mauro, Gambetta |
|18. |17/11 11:00-13:00| Traffic Simulation with SUMO | **[slides]** {{ :geospatialanalytics:gsa:lesson_10_-_traffic_simulation_with_sumo.pdf | Traffic simulation with SUMO}}; **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson10_sumo|Traffic simulation with SUMO]] | | Cornacchia |
|19. |18/11 09:00-11:00| Routing on road networks | **[code]** [[https://github.com/jonpappalord/geospatial_analytics/tree/main/lesson10_sumo|Routing on road networks]] | | Cornacchia |
|20. |24/11 11:00-13:00| Presentation of projects/1 | {{ :geospatialanalytics:gsa:projects.pdf | List of projects}} | | Pappalardo, Nanni, Cornacchia, Mauro |
|21. |25/11 09:00-11:00| Presentation of projects/2 | {{ :geospatialanalytics:gsa:projects.pdf | List of projects}} | | Pappalardo, Nanni, Cornacchia, Mauro |
|22. 
|01/12 11:00-13:00| Guest Lecture | | | Fabrizio Martini from [[https://www.electravehicles.com/eve-ai-adaptive-controls?gclid=Cj0KCQjwy5maBhDdARIsAMxrkw0_ZxFvptzanIC5V7_LVjCawj_NzJ_FrTKWqx-5XQ_ulMJgNbTP4oQaAgpmEALw_wcB | Electra Vehicles]] |

===== Exam dates =====

The exam can be taken on one of the following dates:

  * January 27th, 2023
  * February 24th, 2023

The exam will start at 9:30 am and will be held in Aula Faedo (C-29) at ISTI-CNR, Pisa. Remember to bring an identity document (mandatory) and your "libretto" (if any).

Choose one of the two dates and remember that the project material must be submitted __**10 days before the chosen date**__ (i.e., by January 17th or February 14th, respectively) through [[https://forms.gle/E4VqrfXtUTw55mZj7|this google form]].

The exam will consist of a discussion of the project and some questions about the course topics related to the project. In the discussion, the student presents the submitted notebook(s).