Andrea Esuli (firstname.lastname@example.org)
Office hours: by appointment, send email.
The course covers text analytics systems and applications that address business problems by discovering and presenting knowledge that is otherwise locked in textual form. The objective is to learn to recognize situations in which text analytics techniques can meet information processing needs, to identify the analytic task or process that best models the business problem, to select the most appropriate resources, methods, and tools, and to collect text data and apply those methods to it. Several application contexts will be presented: information extraction, sentiment analysis (what is the nature of commentary on an issue), spam and fake post detection, quantification problems, summarization, etc.
Students MUST contact the teacher at least one month before the date set for the exam session, in order to agree on the contents of the project and get the go-ahead.
The date set for the exam session (Check here) is the deadline for submitting the completed project (report and code).
The exam consists of a project, to be agreed with the teacher, and an oral exam. The outcome of the project is some code and a report on the activity (typically 4-10 pages). The oral exam consists of the presentation and discussion of the project.
The purpose of the project is to give you hands-on experience in applying the concepts and methods seen during the course to practical text analytics problems.
Projects may be based on challenges proposed in either research forums (SemEval, EVALITA) or other platforms (Kaggle). Students are also invited to propose a project on a problem drawn from other sources (e.g., recent papers on arXiv CL or AI) or from their own interests.
Students may work solo or in groups of up to three people.
|2021/09/13||Introduction to the course, NLP & Text Analytics||00_-_introduction_to_the_text_analytics_course.pdf, 01_-_natural_language_and_text_analytics.pdf|
|2021/09/17||Introduction to probability||02_-_introduction_to_probability.pdf|
|2021/09/24||Introduction to Python 1/2||03_-_introduction_to_python.pdf 03_1_introduction_to_python.zip|
|2021/09/27||Introduction to Python 2/2|
|2021/10/01||Probabilistic Language Models 1/2||04_-_probabilistic_language_models.pdf 04_1_probabilisticlanguagemodel.zip|
|2021/10/04||Probabilistic Language Models 2/2|
|2021/10/08||Text Indexing: Regular expressions||05_-_text_indexing.pdf 05_1_strings_regular_expressions_and_bs4.zip|
|2021/10/11||Text Indexing: NLTK, Collocations||05_2_nltk.zip 05_3_collocations.zip|
|2021/10/15||Text Indexing: spaCy, Feature selection, Pipeline||05_4_spacy_text_processing.zip|
|2021/10/18||Text Indexing: Notebook. Introduction to Machine Learning||05_5_text_indexing_sklearn.zip 06_-_machine_learning_for_text_analytics.pdf|
|2021/10/22||Machine Learning for TA: Paradigms and models||06_-_machine_learning_for_text_analytics.pdf|
|2021/10/25||Machine Learning for TA: the complete pipeline||06_1_classification_sklearn.zip|
|2021/10/29||Machine Learning for TA: Feature engineering, Topic Modeling||06_2_classification_feature_engineering.zip 06.3_-_topic_modeling.pdf 06.4_-_topic_modeling.ipynb.zip|
|2021/11/05||Experimental protocols and optimization||07_-_experiments.pdf 07_1_optimization_sklearn.zip|
|2021/11/08||Information Extraction, Entity Annotation||08_-_information_extraction.pdf 08_1_spacy_ner_train.zip|
|2021/11/12||Data collection||09_-_data_collection.pdf 09_1_scraping.zip 09_2_data_from_twitter.zip|