Andrea Esuli (email@example.com)
Office hours: by appointment, send email.
Lectures will be given using Microsoft Teams. Join the Text Analytics Team here.
Lecture recordings are available on Microsoft Teams for delayed viewing.
The course covers text analytics systems and applications that respond to business problems by discovering and presenting knowledge that is otherwise locked in textual form. The objective is to learn to recognize situations in which text analytics techniques can solve information processing needs, to identify the analytic task or process that best models the business problem, to select the most appropriate resources, methods, and tools, and to collect text data and apply those methods to it. Several application contexts will be presented: information extraction, sentiment analysis (what is the nature of commentary on an issue), spam and fake post detection, quantification problems, summarization, etc.
The exam consists of a project, to be agreed with the teacher, and an oral exam. The outcome of the project is some code and a report on the activity (typically 4-10 pages). The oral exam consists of the presentation and discussion of the project.
The purpose of the project is to give you hands-on experience in applying the concepts and methods seen during the course to practical text analytics problems.
Projects may be based on challenges proposed in either research forums (SemEval, Evalita) or other platforms (Kaggle). Students are also invited to propose a project on a problem drawn from other sources (e.g., recent papers on ArXiv CL or AI) or from their own interests.
Students may work solo or in groups of up to three people.
Before starting work on the project, students must contact the teacher to agree on its contents and get the go-ahead.
|2020/09/16||Introduction to the course||00_-_introduction_to_the_text_analytics_course.pdf 01_-_natural_language_and_text_analytics.pdf|
|2020/09/17||Introduction to probability||02_-_introduction_to_probability.pdf|
|2020/09/23||Setup of Python environment||03_-_introduction_to_python.pdf|
|2020/09/24||Introduction to Python||03_1_introduction_to_python.zip|
|2020/09/30||Probabilistic Language Models||04_-_probabilistic_language_models.pdf|
|2020/10/01||Probabilistic Language Models||04_1_probabilisticlanguagemodel.zip|
|2020/10/07||Text Indexing, Regular expressions||05_-_text_indexing.pdf 05.1_-_strings_regular_expressions_and_bs4.zip|
|2020/10/08||NLTK, Collocations||05.2_-_nltk.zip 05.3_-_collocations.zip|
|2020/10/14||NLP tools, Spacy, Text indexing, preprocessing||05.4_-_spacy_text_processing.ipynb.zip|
|2020/10/15||Vector space model, ML for text analytics||06_-_machine_learning_for_text_analytics.pdf|
|2020/10/21||Scikit learn, pipeline||06_1_classification_sklearn.zip|
|2020/10/28||Experimental protocols, optimization||07_-_experiments.pdf 07_1_optimization_sklearn.zip|
|2020/10/29||Sequence labeling, information extraction||08_-_information_extraction.pdf|
|2020/11/05||Data collection||09_-_data_collection.pdf 09_1_scraping.zip 09_2_data_from_twitter.zip|
|2020/11/11||Introduction to neural networks||10_-_a_primer_on_neural_networks.pdf 10.1_-_example_of_backpropagation.pdf|
|2020/11/12||From SVM to NN, deep learning||10_2_svm_to_nn.zip|
|2020/11/18||Convolutional and Recurrent networks, text generation||10_3_classification_cnnnet.zip 10_4_classification_lstmnet.zip 10_5_textgeneration.zip|
|2020/11/19||Word embeddings, neural language models||11_-_neural_language_models.pdf 11_1_wordembeddings.zip|
|2020/11/25||Document embeddings, the Transformer||11_2_documentembeddings.zip|
|2020/11/26||BERT fine-tuning||11_3_bert_finetune_binary.zip 11_4_bert_finetune_multiclass.zip 11_5_simpletransformers_finetune_binary.zip 11_6_simpletransformer_generation_and_representation.zip|
|2020/12/09||Introduction to Sentiment Analysis, Sentiment Lexicons||13_-_sentiment_analysis.pdf 14_-_lexical_resources_for_sentiment_analysis.pdf|
|2020/12/10||Sentiment Classification||15_-_sentiment_classification.pdf 15_1_vader.zip|