This is an old revision of the document!


Text Analytics A.Y. 2020/21

Teacher

Andrea Esuli (andrea.esuli@isti.cnr.it)

Office hours: by appointment; send an email to arrange one.

Schedule

Lectures will be held on Microsoft Teams. Join the Text Analytics Team here.

Day Hour Room
TBA TBA Text Analytics Team
TBA TBA Text Analytics Team

Objectives

The course covers text analytics systems and applications that address business problems by discovering and presenting knowledge otherwise locked in textual form. The objective is to learn to recognize situations in which text analytics techniques can meet information processing needs, to identify the analytic task or process that best models the business problem, to select the most appropriate resources, methods, and tools, and to collect text data and apply those methods to it. Several application contexts will be presented: information extraction, sentiment analysis (what is the nature of commentary on an issue), spam and fake post detection, quantification problems, summarization, etc.

  1. Disciplinary background: Natural Language Processing, Information Retrieval and Machine Learning
  2. Mathematical background: Probability, Statistics and Algebra
  3. Linguistic essentials: words, lemmas, morphology, PoS, syntax
  4. Basic text processing: regular expressions, tokenisation
  5. Data collection: Twitter API, scraping
  6. Basic modelling: collocations, language models
  7. Introduction to Machine Learning: theory and practical tips
  8. Libraries and tools: NLTK, spaCy, Keras, PyTorch
  9. Classification/Clustering
  10. Sentiment Analysis/Opinion Mining
  11. Information Extraction/Relation Extraction/Entity Linking
  12. Transfer learning
  13. Quantification
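
As a rough illustration of the kind of material behind topics 4 and 6, the sketch below tokenises a sentence with a regular expression and counts adjacent word pairs, which is a crude first step towards collocation analysis. It is not taken from the course materials: the token pattern, the function names `tokenize` and `bigram_counts`, and the sample text are all assumptions made for this example, and it uses only the Python standard library rather than NLTK or spaCy.

```python
import re
from collections import Counter

# Assumed token pattern (not from the course): a run of word characters,
# optionally followed by apostrophe-joined parts such as "don't".
TOKEN_RE = re.compile(r"\w+(?:'\w+)*")


def tokenize(text):
    """Lowercase the text and split it into word tokens using a regular expression."""
    return TOKEN_RE.findall(text.lower())


def bigram_counts(tokens):
    """Count adjacent token pairs; frequent pairs are collocation candidates."""
    return Counter(zip(tokens, tokens[1:]))


# Made-up sample text, used only to exercise the two functions.
sample = "Text analytics turns text into knowledge. Text analytics needs text data."
tokens = tokenize(sample)
counts = bigram_counts(tokens)
print(counts.most_common(1))  # prints [(('text', 'analytics'), 2)]
```

Raw frequency alone tends to favour pairs of very common words; in practice an association measure such as pointwise mutual information is usually applied on top of counts like these.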

Exam

The exam consists of a project, to be agreed upon with the teacher, and an oral exam. The project deliverables are code and a report on the activity (typically 4-10 pages). The oral exam consists of the presentation and discussion of the project.

Lecture Notes

Date Lecture Notes
yy/mm/dd some topic link to slides

Textbooks

  1. D. Jurafsky, J.H. Martin, Speech and Language Processing. 3rd edition, Prentice-Hall, 2018.
  2. B. Liu, Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers, 2012.
  3. S. Bird, E. Klein, E. Loper, Natural Language Processing with Python. O'Reilly Media, 2009.

Previous editions

mds/txa/start.1599473738.txt.gz · Last modified: 07/09/2020 at 10:15 (4 years ago) by Andrea Esuli