mds:txa:start

Differences

This shows the differences between the selected revision and the current version of the page.

mds:txa:start [10/10/2022 at 10:04 (18 months ago)]
Lucia Passaro [Lecture Notes]
mds:txa:start [15/01/2024 at 10:31 (3 months ago)] (current version)
Laura Pollacci
Line 1: Line 1:
-====== Text Analytics (635AA) A.Y. 2022/23 ======+<html> 
 +<!-- Google Analytics --> 
 +<script type="text/javascript" charset="utf-8"> 
 +(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ 
 +(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), 
 +m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) 
 +})(window,document,'script','//www.google-analytics.com/analytics.js','ga'); 
 + 
 +ga('create', 'UA-34685760-1', 'auto', 'personalTracker', {'allowLinker': true}); 
 +ga('personalTracker.require', 'linker'); 
 +ga('personalTracker.linker:autoLink', ['pages.di.unipi.it', 'enforce.di.unipi.it', 'didawiki.di.unipi.it', 'luciacpassaro.github.io'] );     
 +ga('personalTracker.require', 'displayfeatures'); 
 +ga('personalTracker.send', 'pageview', 'courses/txa/'); 
 +setTimeout(function(){ ga('send','event','adjusted bounce rate','30 seconds'); }, 30000); 
 +</script> 
 +<!-- End Google Analytics --> 
 +<!-- Global site tag (gtag.js) - Google Analytics --> 
 +<script async src="https://www.googletagmanager.com/gtag/js?id=G-LPWY0VLB5W"></script> 
 +<script> 
 +  window.dataLayer = window.dataLayer || []; 
 +  function gtag(){dataLayer.push(arguments);} 
 +  gtag('js', new Date()); 
 + 
 +  gtag('config', 'G-LPWY0VLB5W'); 
 +</script> 
 +<!-- Capture clicks --> 
 +<script> 
 +jQuery(document).ready(function(){ 
 +  jQuery('a[href$=".pdf"]').click(function() { 
 +    var fname = this.href.split('/').pop(); 
 +    ga('personalTracker.send', 'event',  'TXA', 'PDFs', fname); 
 +  }); 
 +  jQuery('a[href$=".r"]').click(function() { 
 +    var fname = this.href.split('/').pop(); 
 +    ga('personalTracker.send', 'event',  'TXA', 'Rs', fname); 
 +  }); 
 +  jQuery('a[href$=".zip"]').click(function() { 
 +    var fname = this.href.split('/').pop(); 
 +    ga('personalTracker.send', 'event',  'TXA', 'ZIPs', fname); 
 +  }); 
 +}); 
 +</script> 
 +</html> 
 +====== Text Analytics (635AA) A.Y. 2023/24 ======
  
  
 ==== Teacher ====
  
-[[http://luciacpassaro.github.io/|Lucia Passaro]] (lucia.passaro [at] unipi [dot] it)+[[https://laurapollacci.github.io/txa.html|Laura Pollacci]] (laura.pollacci [at] di [dot] unipi [dot] it)
  
-Office hours: Monday 16-18 via [[https://teams.microsoft.com/l/chat/0/0?users=lucia.passaro@unipi.it|Teams]]+Office hours: 
  
  
Line 12: Line 55:
  
 ^ Day ^ Hour ^ Room ^
-| Monday | 9-11 | Fib M1 |+| Thursday | 16-18 | Fib C1 |
 | Friday | 11-13 | Fib M1 |
  
  
-[[https://teams.microsoft.com/l/team/19%3au_2NWnfXHAGPknxec1GtEY5y8UrjGRSAQjuJ1tySJ7w1%40thread.tacv2/conversations?groupId=414d90af-3f1a-4188-9dd7-21ac607e5c1f&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Team of the class]]+[[https://teams.microsoft.com/l/channel/19%3aiBnp7L1JmbHPmkQ3NcO3NrxPDZB-RhMvlQzMRdCrWFM1%40thread.tacv2/Generale?groupId=9e5370ba-93b4-41d0-b0b1-b7464ab92f11&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Team of the class]]
  
 ==== Objectives ====
Line 41: Line 84:
   * Transfer learning
   * Quantification
 +
  
 ==== Lecture Notes ====
  
 ^ Date ^ Lecture ^ Slides ^ Material / Reference ^
-|2022/09/16 | Introduction to the course, NLP & Text Analytics. | [[https://drive.google.com/file/d/1wc6yvn6Y5QrFXyFw53xeB4M6MsMmWssS/view?usp=sharing| 1 - Introduction to the Text Analytics course]]|J. Eisenstein. Introduction to Natural Language Processing. MIT Press. [[https://drive.google.com/file/d/17T4zo2uGssKBa_MrHsLW-uSmyP_ZJvpj/view?usp=sharing| Chp. 1]].| +|2023/09/21 | Introduction to the course, NLP & Text Analytics. | [[https://drive.google.com/file/d/11BPheGG5YiZcNeFFQirMMSIrObEsbayf/view?usp=drive_link| 1 - Introduction to the Text Analytics course]]|J. Eisenstein. Introduction to Natural Language Processing. MIT Press. [[https://drive.google.com/file/d/1v455MySmNo5qVSRle676L0pjc4wktVNh/view?usp=drive_link| Chp. 1]].| 
-|2022/09/19 | Reminds on Probability. Language and Probability. | [[https://drive.google.com/file/d/1-exk-JS0_Oa3Eg1ApTGlxonQlL3KbTQG/view?usp=sharing| 2 - Reminds on Probability.pdf]]|+|2023/09/22 | Reminds on probability. | [[https://drive.google.com/file/d/1fH8sjhnh9dlPcPMwpAYSbsP0tbCaamSV/view?usp=sharing| 2 - Reminds on probability]]| 
-|2022/09/23 | Introduction to Python. | [[https://drive.google.com/file/d/1lpyA0N4K0d0ZTrJgokot1NwC_w4HG6gG/view?usp=sharing| 3 - Introduction to Python.pdf]]|[[https://drive.google.com/file/d/1BubwKtByCankjnbClWErvSsw9EjCLnte/view?usp=sharing|Introduction to Python Notebook.]]| +|2023/09/28 | Introduction to Python. | [[https://drive.google.com/file/d/1fOn73KfDqlaU-0dgXs4-qkIbm8ZCg8Px/view?usp=sharing| 3 - Introduction to Python]]| [[https://drive.google.com/file/d/16BIcJuP4vB5b5oUmV03R7fX_-wRaFI8Y/view?usp=sharing | L3 Introduction_to_Python.ipynb]] | 
-|2022/09/30 | Introduction to Python (continued). Project Presentation and Important Dates | [[https://drive.google.com/file/d/1FjCYvOkZDWomEsJuXD32Vl_155kxnKik/view?usp=sharing|Project and Dates]]|+|2023/09/29 | Introduction to Python - part 2. Project and Dates | [[https://drive.google.com/file/d/11E-3DWARykKVZDuB1vuDoXySAPPWYFoq/view?usp=sharing| 4 - Project and Dates]]|  
-|2022/10/03 | Probabilistic Language Models. | [[https://drive.google.com/file/d/1B5HfPtPgK41Ig_NWrPim6YxK3mCF-XSj/view?usp=sharing| 5 - Probabilistic Language models]]|D. Jurafsky, J.H. Martin. [[https://drive.google.com/file/d/1OXSjwE0-ZN6DZ4MELOMp8JVy-tP2_4Iw/view?usp=sharing| Chp. 3]] [[https://drive.google.com/file/d/1osuyJi5ZbBMghOrQz_IVqMsfxi2-1Vzj/view?usp=sharing| Probabilistic Language Models - Notebook]].|+|2023/10/05 | Probabilistic language models | [[https://drive.google.com/file/d/1Nj6FgcBSK9otmJwjDj2bxWWulCzPlHZb/view?usp=drive_link|5 - Probabilistic language models]]| D. Jurafsky, J.H. Martin. [[ https://drive.google.com/file/d/1K3B0s0-T3NnpfgmR6NGsZdwWqGoa0S5Q/view?usp=drive_link|Ch3]] [[https://drive.google.com/file/d/13r6wn4jlrOncZ0zUc5efmu2RgqDGUz2g/view?usp=drive_link|L5 Probabilistic Language Model.ipynb]] | 
-|2022/10/07 | Text Indexing: Strings, Regular Expressions and BS4. | [[https://drive.google.com/file/d/1hkkjm5saUiKqL-9KgGgozTBupgIOus74/view?usp=sharing| 6 - Text Indexing-1]]|D. Jurafsky, J.H. Martin. [[https://drive.google.com/file/d/1RO_PGJj0a8v_N0dnw5iK4nGiabaAKZc5/view?usp=sharing| Chp. 2]] [[https://drive.google.com/file/d/1IX8qSNdSbTFz5n1yMsMqtX9QU6HOofdv/view?usp=sharing| Strings, Regular Expressions and BS4 - Notebook]].|+|2023/10/06| Text Indexing: Strings, Regular Expressions and BS4. | [[https://drive.google.com/file/d/1Zp6vqh5Wj9YzwtpcgMSxm7NUZ_oN8SW7/view?usp=sharing| 6 - Text indexing 1]] | D. Jurafsky, J.H. Martin. [[https://drive.google.com/file/d/1SH4Em84AEHNzc6OzrhjvW_ggo_0nJiOx/view?usp=sharing|Ch2]]  [[https://drive.google.com/file/d/13miwALDtad7ERoObFnlPjeUYBaAfwZGF/view?usp=sharing|L6.1 - Strings Regular expressions and BS4.ipynb]]| 
-|2022/10/10 | Text Indexing: Linguistic annotation. NLTK. | [[https://drive.google.com/file/d/11AjdH0K1W5OytgdofaxlCP_nQlI-rRTB/view?usp=sharing| 6 - Text Indexing-2]]|[[https://drive.google.com/file/d/1uigGIb0_9bX2Gb5g6SX51JHyN3kxN4y_/view?usp=sharing| Linguistic annotation with NLTK - Notebook]].|+|2023/10/12| Linguistic annotation. NLTK. | [[https://drive.google.com/file/d/1t2WNuMZ1PAE4i_GgPbd-DCJWx8gWnhQf/view?usp=sharing| 6 - Text Indexing 2]]|[[https://drive.google.com/file/d/14ahCe4h45MHn_yMhUbOwO7o8Ms9jl-sD/view?usp=sharing|L6.2 - Linguistic annotation with NLTK.ipynb]] | 
 +|2023/10/13| //Lesson canceled due to UNIPI orientation days.//| 
 +|2023/10/19| Feature Selection| [[https://drive.google.com/file/d/1iWDaF7BXykUrRwOrIfc8ERlOewXaOQm7/view?usp=sharing|6 - Text Indexing 3]] | [[https://drive.google.com/file/d/1mD4v_ts0A1CHcTrU9nIYz-Nugvok1jks/view?usp=sharing |L6.3 - Gensim collocations - Stanza - Spacy (Notebooks)]] | 
 +|2023/10/20| Vector space models | [[https://drive.google.com/file/d/1JIKfDSAZh3raAfRB_tTGFNqjxgfukKGy/view?usp=sharing|6 - Text Indexing 4]] | D. Jurafsky, J.H. Martin. [[https://drive.google.com/file/d/1Hj3n4qCuZpTIrS_M352QAyH70xC6Fxrg/view?usp=share_link|Chp. 6.]] [[https://drive.google.com/file/d/1RUJYFizlp1ldl2DbmZDDXvw8WhDS6E4k/view?usp=sharing|L6.4 - Vector space model - toy example]]| 
 +|2023/10/26| //Lesson canceled//
 +|2023/10/27| //Lesson canceled//
 +|2023/11/02| Machine Learning for Text Analytics. | [[ https://drive.google.com/file/d/1zc925Q0yzdmh2nvB0McdQBOeVgJ1aD3R/view?usp=sharing| 10 - Machine Learning for Text Analytics]] - corrected| 
 +|2023/11/03| Machine Learning for Text Analytics: Design Experimental Protocols. Student presentations: How to. | [[https://drive.google.com/file/d/1gaaWVORZnp7gJ6ZGloKlyQTSw07SZ8in/view?usp=sharing| 11 - Design Experimental Protocols]]. [[https://drive.google.com/file/d/1b5I7NhRXuzjk93Pea6pyxzzCw31OhD8Z/view?usp=sharing| 11.1 - Student presentations: How to]] | [[https://drive.google.com/file/d/1X0BYS66px-aTYoDVZzx2sTixmaX4agrP/view?usp=sharing | L.11 - Classification with SkLearn]] | 
 +|2023/11/09| Student project presentations: proposal, brainstorming, discussion. | 
 +|2023/11/10| Student project presentations: proposal, brainstorming, discussion. | 
 +|2023/11/16| Topic Modeling | [[https://drive.google.com/file/d/1M7EMWkYfqDWZjf6W22yIVJLK0QbJTh_v/view?usp=sharing|12 - Topic Modeling]] | Zhai and Massung (2016) Text Data Management and Analysis. [[https://drive.google.com/file/d/1Cwzon44c0-7b_4bbHyUO6ArolacQFY_5/view?usp=sharing|Chp 17]]. [[https://drive.google.com/file/d/1-Iyz860uAII3pplAk_VMqi5gK5N_S4pD/view?usp=sharing |L.12 -Topic Modeling - Notebook.]]. [[https://drive.google.com/file/d/1H60PV4Wt5gRs_B6MB4J2YJ-gsiySf6lv/view?usp=sharing|L.12.1 - Topic Modeling pyLDAvis - Notebook]]| 
 +|2023/11/17| A primer on Neural Networks |[[https://drive.google.com/file/d/1MS7upbsydqkPMIRfYv9pKHXz2mfGb1ST/view?usp=sharing |13 - A primer on Neural Networks]] | 
 +|2023/11/23|Neural Networks | [[https://drive.google.com/file/d/13tQ1m-ogPR3R_PSAWLDomvPmsBal8E55/view?usp=sharing | 14 - Neural Networks]] | [[https://drive.google.com/file/d/1ZP9WN4OTSw2VoO7jWIpJlWBh_oGFwxjN/view?usp=sharing| From SVM to NN, Classification with Keras - Notebooks.]] | 
 +|2023/11/24| Neural Language Models | [[https://drive.google.com/file/d/1vezeT7l6Wd9D0otEYXSAjg0ih1XoggmW/view?usp=sharing| 15 - Neural Language Models]]| D. Jurafsky, J.H. Martin. Chps. [[https://drive.google.com/file/d/10SjSlr4bk6jBWTEkA4vsTUomB8y4iJ-C/view?usp=sharing|7]] [[https://drive.google.com/file/d/1MkfAsC-rY6HuWM6ZTS1TB8LoLxN-sPPq/view?usp=sharing|9]] [[https://drive.google.com/file/d/1P3j4qTH6IH_R42huYLL83cvPd1Ci2Ar1/view?usp=sharing|11]] | 
 +|2023/11/30| Student project presentations: ongoing experiments. Neural Language Models Practice | [[https://drive.google.com/file/d/1Dc0l2zQfX9poOymZKrhYiHMUiv9TT7m_/view?usp=sharing|16 - Neural Language Models Word2Vec]]| [[https://drive.google.com/file/d/14BIROGvYzNjbmmVzZqeiY-tLkhRAR8tW/view?usp=sharing |Word2vec - Notebook.]]| 
 +|2023/12/01| Student project presentations: ongoing experiments. Neural Language Models Practice | [[https://drive.google.com/file/d/1R4Yfr5v8ygsK61dV-h-mZhU_iY0OuZmK/view?usp=sharing|17 - Neural Language Models Doc2Vec]]|[[https://drive.google.com/file/d/1JaGXJE-rF3Yvmtd1Je8NCdDapLiL17Pg/view?usp=sharing|Doc2Vec - Notebook]]| 
 +|2023/12/07| Neural Language Models - part 2 |[[https://drive.google.com/file/d/1QxmavpSIjX1x46UkNR1RflY64Sbc3vLs/view?usp=sharing|Neural Language Models - part 2]]| 
 +|2023/12/11| BERT. Project Submission |[[https://drive.google.com/file/d/1JX6HCObZYtLUApYJDl1ftDTl5nKn-aHi/view?usp=sharing| 19 - Bert]]. [[https://drive.google.com/file/d/1GOwUTqWnkONM-SI8D0JANGKuqX0pBp35/view?usp=sharing|Project Submission]]| [[ https://drive.google.com/file/d/1JX6HCObZYtLUApYJDl1ftDTl5nKn-aHi/view?usp=sharing|Bert - Notebooks]] | 
 +|2023/12/14| Advanced Topics | [[ https://drive.google.com/file/d/14zg2w7-s_cpIJQBwGfXoj_yfjZNZLYQh/view?usp=sharing |20 - Advanced Topics]]| Recommended chapters: D. Jurafsky, J.H. Martin. [[https://drive.google.com/file/d/1ik_BGxKUNAi5GwQZQv4vI9Gqvv4wkWK9/view?usp=sharing|20]];[[https://drive.google.com/file/d/1VJbNelq63EagAxdgleJu2isJVBBb_vkl/view?usp=sharing|24]].|  
 ==== Exam ====
  
Line 63: Line 126:
  
 The exam for non-attending students will consist of a written exam with open questions and exercises, and an oral discussion of the course topics.
 +
 +Written test [[https://drive.google.com/file/d/1Q-NVz_x-UjllTG-CPAKGV4aKmK4Hz5af/view?usp=share_link|example]].
  
  
Line 76: Line 141:
 ==== Previous editions ====
  
 +  * [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1671529070|2022-2023]]
   * [[http://didawiki.cli.di.unipi.it/doku.php/mds/txa/start?rev=1649067582|2021-2022]]
   * [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1612257498|2020-2021]]
mds/txa/start.1665396258.txt.gz · Last modified: 10/10/2022 at 10:04 (18 months ago) by Lucia Passaro