mds:txa:start [02/02/2021 at 09:17 (3 years ago)] Andrea Esuli [Exam] |
mds:txa:start [05/11/2021 at 09:43 (2 years ago)] Andrea Esuli [Lecture Notes] |
====== Text Analytics A.Y. 2020/21 ====== | ====== Text Analytics (635AA) A.Y. 2021/22 ====== |
| |
| |
==== Schedule ==== | ==== Schedule ==== |
| |
Lectures will be given using Microsoft Teams. | ^ Day ^ Hour ^ Room ^ |
[[https://teams.microsoft.com/l/team/19%3ad515c158b0c64bc8b4efa3b21aab6fa7%40thread.tacv2/conversations?groupId=205782a2-623f-4f50-a391-9d4f22d4d604&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Join the Text Analytics Team here.]] | | Monday | 9-11 | Fib C - [[https://teams.microsoft.com/l/channel/19%3aPNQvMI4MdxtWb0_5d1r1UoIPA9QHxRe6kOuZ9VHTG-I1%40thread.tacv2/General?groupId=109c5615-d2be-49ef-848f-85c7aa07de5b&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams]]| |
| | Friday| 11-13 | Fib M1 - [[https://teams.microsoft.com/l/channel/19%3aPNQvMI4MdxtWb0_5d1r1UoIPA9QHxRe6kOuZ9VHTG-I1%40thread.tacv2/General?groupId=109c5615-d2be-49ef-848f-85c7aa07de5b&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Teams]]| |
Lecture recordings are available on Microsoft Teams for delayed viewing. | |
| |
^ Day ^ Hour ^ Room ^ | |
| Wednesday | 9-11 | [[https://teams.microsoft.com/l/team/19%3ad515c158b0c64bc8b4efa3b21aab6fa7%40thread.tacv2/conversations?groupId=205782a2-623f-4f50-a391-9d4f22d4d604&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Text Analytics Team]] | | |
| Thursday | 9-11 | [[https://teams.microsoft.com/l/team/19%3ad515c158b0c64bc8b4efa3b21aab6fa7%40thread.tacv2/conversations?groupId=205782a2-623f-4f50-a391-9d4f22d4d604&tenantId=c7456b31-a220-47f5-be52-473828670aa1|Text Analytics Team]] | | |
| |
| |
| |
==== Exam ==== | ==== Exam ==== |
| |
| **__Students MUST contact the teacher at least one month before the date set for the exam session__, so as to agree on the contents of the project and get a go-ahead.** |
| |
| **The date set for the exam session ([[https://esami.unipi.it/esami/findcourse.php?id=52426|Check here]]) is the deadline for submitting the completed project (report and code).** |
| |
The exam consists of a project, to be agreed with the teacher, and an oral exam. | The exam consists of a project, to be agreed with the teacher, and an oral exam. |
The purpose of the project is to give you hands-on experience in applying the concepts and methods seen during the course to practical text analytics problems. | The purpose of the project is to give you hands-on experience in applying the concepts and methods seen during the course to practical text analytics problems. |
| |
Projects may be based on challenges proposed in either research forums ([[https://alt.qcri.org/semeval2020/|Semeval]], [[http://www.evalita.it/|Evalita]]) or other platforms ([[https://kaggle.com|Kaggle]]). Students are also invited to propose their own problem based on other sources (e.g., recent papers on ArXiv [[https://arxiv.org/list/cs.CL/new|CL]] or [[https://arxiv.org/list/cs.AI/new|AI]]), or their own interests. | Projects may be based on challenges proposed in either research forums ([[https://alt.qcri.org/semeval2020/|Semeval]], [[http://www.evalita.it/|Evalita]]) or other platforms ([[https://kaggle.com|Kaggle]]). Students are also invited to propose a project on a problem based on other sources (e.g., recent papers on ArXiv [[https://arxiv.org/list/cs.CL/new|CL]] or [[https://arxiv.org/list/cs.AI/new|AI]]), or their own interests. |
| |
Students may work solo or in groups of up to three people. | Students may work solo or in groups of up to three people. |
| |
Before starting work on the project, students must contact the teacher so as to agree on the contents of the project and get a go-ahead. | |
| |
==== Lecture Notes ==== | ==== Lecture Notes ==== |
| |
^ Date ^ Lecture ^ Notes ^ | ^ Date ^ Lecture ^ Notes ^ |
| 2020/09/16 | Introduction to the course | {{ :mds:txa:00_-_introduction_to_the_text_analytics_course.pdf |}} {{ :mds:txa:01_-_natural_language_and_text_analytics.pdf |}} | | | 2021/09/13 | Introduction to the course, NLP & Text Analytics | {{ :mds:txa:00_-_introduction_to_the_text_analytics_course.pdf |}}, {{ :mds:txa:01_-_natural_language_and_text_analytics.pdf |}} | |
| 2020/09/17 | Introduction to probability | {{ :mds:txa:02_-_introduction_to_probability.pdf |}} | | | 2021/09/17 | Introduction to probability | {{ :mds:txa:02_-_introduction_to_probability.pdf |}} | |
| 2020/09/23 | Setup of Python environment | {{ :mds:txa:03_-_introduction_to_python.pdf |}} | | | 2021/09/20 | //canceled//| | |
| 2020/09/24 | Introduction to Python | {{ :mds:txa:03_1_introduction_to_python.zip |}} | | | 2021/09/24 | Introduction to Python 1/2 | {{ :mds:txa:03_-_introduction_to_python.pdf |}} {{ :mds:txa:03_1_introduction_to_python.zip |}}| |
| 2020/09/30 | Probabilistic Language Models | {{ :mds:txa:04_-_probabilistic_language_models.pdf |}} | | | 2021/09/27 | Introduction to Python 2/2 | | |
| 2020/10/01 | Probabilistic Language Models | {{ :mds:txa:04_1_probabilisticlanguagemodel.zip |}} | | | 2021/10/01 | Probabilistic Language Models 1/2 | {{ :mds:txa:04_-_probabilistic_language_models.pdf |}} {{ :mds:txa:04_1_probabilisticlanguagemodel.zip |}}| |
| 2020/10/07 | Text Indexing, Regular expressions | {{ :mds:txa:05_-_text_indexing.pdf |}} {{ :mds:txa:05.1_-_strings_regular_expressions_and_bs4.zip |}} | | | 2021/10/04 | Probabilistic Language Models 2/2 | | |
| 2020/10/08 | NLTK, Collocations | {{ :mds:txa:05.2_-_nltk.zip |}} {{ :mds:txa:05.3_-_collocations.zip |}} | | | 2021/10/08 | Text Indexing: Regular expressions | {{ :mds:txa:05_-_text_indexing.pdf |}} {{ :mds:txa:05_1_strings_regular_expressions_and_bs4.zip |}} | |
| 2020/10/14 | NLP tools, Spacy, Text indexing, preprocessing | {{ :mds:txa:05.4_-_spacy_text_processing.ipynb.zip |}} | | | 2021/10/11 | Text Indexing: NLTK, Collocations | {{ :mds:txa:05_2_nltk.zip |}} {{ :mds:txa:05_3_collocations.zip |}} | |
| 2020/10/15 | Vector space model, ML for text analytics | {{ :mds:txa:06_-_machine_learning_for_text_analytics.pdf |}} | | | 2021/10/15 | Text Indexing: Spacy, Feature selection, Pipeline | {{ :mds:txa:05_4_spacy_text_processing.zip |}} | |
| 2020/10/21 | Scikit learn, pipeline | {{ :mds:txa:06_1_classification_sklearn.zip |}} | | | 2021/10/18 | Text Indexing: Notebook. Introduction to Machine Learning | {{ :mds:txa:05_5_text_indexing_sklearn.zip |}} {{ :mds:txa:06_-_machine_learning_for_text_analytics.pdf |}}| |
| 2020/10/22 | Feature engineering | {{ :mds:txa:06_2_classification_feature_engineering.zip |}} | | | 2021/10/22 | Machine Learning for TA: Paradigms and models | {{ :mds:txa:06_-_machine_learning_for_text_analytics.pdf |}} | |
| 2020/10/28 | Experimental protocols, optimization | {{ :mds:txa:07_-_experiments.pdf |}} {{ :mds:txa:07_1_optimization_sklearn.zip |}}| | | 2021/10/25 | Machine Learning for TA: the complete pipeline | {{ :mds:txa:06_1_classification_sklearn.zip |}} | |
| 2020/10/29 | Sequence labeling, information extraction | {{ :mds:txa:08_-_information_extraction.pdf |}} | | | 2021/10/29 | Machine Learning for TA: Feature engineering, Topic Modeling | {{ :mds:txa:06_2_classification_feature_engineering.zip |}} {{ :mds:txa:06.3_-_topic_modeling.pdf |}} {{ :mds:txa:06.4_-_topic_modeling.ipynb.zip |}} | |
| 2020/11/04 | Inception, spacy | {{ :mds:txa:08_1_spacy_ner_train.zip |}} | | | 2021/11/05 | Experimental protocols and optimization | {{ :mds:txa:07_-_experiments.pdf |}} {{ :mds:txa:07_1_optimization_sklearn.zip |}} | |
| 2020/11/05 | Data collection | {{ :mds:txa:09_-_data_collection.pdf |}} {{ :mds:txa:09_1_scraping.zip |}} {{ :mds:txa:09_2_data_from_twitter.zip |}} | | | 2021/11/08 | | | |
| 2020/11/11 | Introduction to neural networks | {{ :mds:txa:10_-_a_primer_on_neural_networks.pdf |}} {{ :mds:txa:10.1_-_example_of_backpropagation.pdf |}}| | | 2021/11/12 | | | |
| 2020/11/12 | From SVM to NN, deep learning | {{ :mds:txa:10_2_svm_to_nn.zip |}} | | | 2021/11/15 | | | |
| 2020/11/18 | Convolutional and Recurrent networks, text generation | {{ :mds:txa:10_3_classification_cnnnet.zip |}} {{ :mds:txa:10_4_classification_lstmnet.zip |}} {{ :mds:txa:10_5_textgeneration.zip |}}| | | 2021/11/19 | | | |
| 2020/11/19 | Word embeddings, neural language models | {{ :mds:txa:11_-_neural_language_models.pdf |}} {{ :mds:txa:11_1_wordembeddings.zip |}}| | | 2021/11/22 | | | |
| 2020/11/25 | Document embeddings, the Transformer | {{ :mds:txa:11_2_documentembeddings.zip |}} | | | 2021/11/26 | | | |
| 2020/11/26 | BERT fine-tuning | {{ :mds:txa:11_3_bert_finetune_binary.zip |}} {{ :mds:txa:11_4_bert_finetune_multiclass.zip |}} {{ :mds:txa:11_5_simpletransformers_finetune_binary.zip |}} {{ :mds:txa:11_6_simpletransformer_generation_and_representation.zip |}}| | | 2021/11/29 | | | |
|2020/12/2 | Parsing | {{ :mds:txa:12_-_parsing-1.pptx |}} | | | 2021/12/03 | | | |
|2020/12/3 | Parsing | {{ :mds:txa:12_-_parsing-2.pptx |}} | | |
|2020/12/9 | Introduction to Sentiment Analysis, Sentiment Lexicons | {{ :mds:txa:13_-_sentiment_analysis.pdf |}} {{ :mds:txa:14_-_lexical_resources_for_sentiment_analysis.pdf |}} | | |
|2020/12/10| Sentiment Classification | {{ :mds:txa:15_-_sentiment_classification.pdf |}} {{ :mds:txa:15_1_vader.zip |}} | | |
| |
==== Textbooks ==== | ==== Textbooks ==== |
==== Previous editions ==== | ==== Previous editions ==== |
| |
| * [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1612257498|2020-2021]] |
* [[https://elearning.di.unipi.it/course/view.php?id=162|2019-2020]] | * [[https://elearning.di.unipi.it/course/view.php?id=162|2019-2020]] |
* [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1551450538|2018-2019]] | * [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1551450538|2018-2019]] |
* [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1515682954|2017-2018]] | * [[http://didawiki.di.unipi.it/doku.php/mds/txa/start?rev=1515682954|2017-2018]] |
| |