Strumenti Utente

Strumenti Sito


biss2010:start

Differenze

Queste sono le differenze tra la revisione selezionata e la versione attuale della pagina.

Link a questa pagina di confronto

Entrambe le parti precedenti la revisione Revisione precedente
Prossima revisione
Revisione precedente
biss2010:start [20/02/2010 alle 11:46 (14 anni fa)]
Paolo Ferragina
biss2010:start [10/03/2010 alle 18:22 (14 anni fa)] (versione attuale)
Paolo Ferragina
Linea 1: Linea 1:
 ====== Advanced Algorithms for Massive Datasets @ BISS2010 ====== ====== Advanced Algorithms for Massive Datasets @ BISS2010 ======
  
-**Teacher:** Paolo Ferragina+**Teacher:** Paolo Ferragina ([[http://www.di.unipi.it/~ferragin|home]], [[mailto:ferragina@di.unipi.it|email]])
  
 **Period:** Second week of March 2010  **Period:** Second week of March 2010 
 +
 ===== Summary ===== ===== Summary =====
  
Linea 9: Linea 10:
  
 Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces basic and sophisticated algorithmic solutions aimed at minimizing the use of computational resources like time, space, communication, I/O, energy consumption, etc.. The theoretical investigation will go hand-in-hand with some algorithm-engineering considerations.  Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces basic and sophisticated algorithmic solutions aimed at minimizing the use of computational resources like time, space, communication, I/O, energy consumption, etc.. The theoretical investigation will go hand-in-hand with some algorithm-engineering considerations. 
- 
 ===== Lectures: topics and material ===== ===== Lectures: topics and material =====
  
-**Lect 1.** Introduction to (Modern) Computational Models. Sorting vs Permuting. [slides, {{:biss2010:lect1-sorting.zip|material}}]+**Lecture 1.** Introduction to (Modern) Computational Models. Sorting vs Permuting. [{{:biss2010:ferragina-biss-day1.zip|slides}}, {{:biss2010:lect1-sorting.zip|material}}]
  
-**Lect 2.** Hashing: Uniform, Universal, Perfect, Cuckoo, Bloom Filters. [slides, {{:biss2010:lect2-hash.zip|material}}]+**Lecture 2.** Hashing: Uniform, Universal, Perfect, Cuckoo. [{{:biss2010:ferragina-biss-day2a.zip|slides}}, {{:biss2010:lect2-hash.zip|material}}]
  
-**Lect 3.** Dictionaries: From arrays to (Patricia) tries, from plain storage to (Locality Preserving) Front Coding. [slides, {{:biss2010:lect3-trie.zip|material}}]+**Lecture 3.** Dictionaries: From arrays to (Patricia) tries and Bloom Filters, from plain storage to (Locality Preserving) Front Coding. [{{:biss2010:ferragina-biss-day3.zip|slides}}, {{:biss2010:lect3-trie.zip|material}}]
  
-**Lect 4.** Text indexing and mining: suffix trees and arrays. Some mining queries, and the issue "LCA=RMQ". [slides, {{:biss2010:lect4-suffixtreesuffixarray.zip|material}}+**Lecture 4.** Text indexing and mining: suffix trees and arrays. Some mining queries, and the issue "LCA=RMQ". [{{:biss2010:ferragina-biss-day4.zip|slides}}, {{:biss2010:lect4-suffixtreesuffixarray.zip|material}}]
- +
-**Lect 5.** Data compression: basics, Huffman, Arithmetic, bzip2 and beyond. [slides, {{:biss2010:lect5-compression.zip|material}}]+
  
 +**Lecture 5.** Data compression: basics, Huffman, Arithmetic, bzip2 and beyond. [{{:biss2010:ferragina-biss-day5.zip|slides}}, {{:biss2010:lect5-compression.zip|material}}]
  
 ===== Exam ===== ===== Exam =====
  
-The exam for this course consists either in the solution of a bunch of exercises, or in the execution of the analytical experiment over a challenging dataset using one of the algorithms teached in the lectures, or finally in the design and implementation of a new algorithmThe software project can be conducted in team (no more than 3 students), and should be preferably close to the research interest of the candidate. +The exam consists either in the solution of a bunch of exercises, or in the attack of one of the software project listed belowEach project can be conducted in team (no more than 3 students), and should be preferably close to the research interest of the candidate. 
  
-The students willing to give the exam should send me (//ferragina@di.unipi.it//) an email with subject [BISS09], specifying the chosen subject for the work, and the list of participants in the team. Once negotiated, the assigned teamwork will be inserted in this wiki. The exam must be completed within 2010.  +The students willing to give the exam should send me an [[mailto:ferragina@di.unipi.it|email]] with subject [BISS09], specifying the chosen subject for the work, and the list of participants in the team. Once negotiated, the assigned teamwork will be inserted in this wiki. The exam must be completed within 2010. 
- +
- +
-===== List of Students ===== +
- +
-These are the students who attended the course: +
- +
-  - aa +
-  - aa +
-  - +
 ===== List of Projects ===== ===== List of Projects =====
  
   * Suggester for typing [inspiring papers? {{:biss2010:paperchato.pdf|one}}, {{:biss2010:article.zip|two}}]   * Suggester for typing [inspiring papers? {{:biss2010:paperchato.pdf|one}}, {{:biss2010:article.zip|two}}]
   * Problem posed by Chakrabharti [inspiring papers? {{:biss2010:chakrabhartisearch1.pdf|one}}, {{:biss2010:chakrabhartisearch2.pdf|two}}]    * Problem posed by Chakrabharti [inspiring papers? {{:biss2010:chakrabhartisearch1.pdf|one}}, {{:biss2010:chakrabhartisearch2.pdf|two}}] 
-  * Permuting Web pages to improve compression ratio [inspiring papers? {{:biss2010:wsdm022-ferragina.pdf|one}}, {{:biss2010:clusteringtext.pdf|two}}, {{:biss2010:ferraginawebcompression.tgz|software}}, [[http://brie.di.unipi.it/_UKOriginal1Gb.warc.bz2|UK-crawl]], [[http://brie.di.unipi.it/_UKURLsorted1Gb.warc.bz2|UK-URLsort]]] +  * Permuting Web pages to improve compression ratio [inspiring papers? {{:biss2010:wsdm022-ferragina.pdf|one}}, {{:biss2010:clusteringtext.pdf|two}}, {{:biss2010:suelwww2010reorder.pdf|three}}, {{:biss2010:ferraginawebcompression.tgz|software}}, [[http://brie.di.unipi.it/_UKOriginal1Gb.warc.bz2|UK-crawl]], [[http://brie.di.unipi.it/_UKURLsorted1Gb.warc.bz2|UK-URLsort]]] 
-  * Generalised BWT [inspiring papers? {{:biss2010:generalisedbwt.pdf|one}}]+  * Generalised BWT [inspiring papers? {{:biss2010:generalisedbwt.pdf|one}}, [[http://people.unipmn.it/manzini/bwtdisk/|sparring partner]], [[http://brie.di.unipi.it/_UKOriginal1Gb.warc.bz2|UK-crawl]],]
   * Temporal data mining on a DB of cars [ [[http://brie.di.unipi.it/autoscout24.tgz|dataset]] ]   * Temporal data mining on a DB of cars [ [[http://brie.di.unipi.it/autoscout24.tgz|dataset]] ]
-  * Smart compression: + 
-      * Variable-block size depending on the number of phrases ({{:biss2010:lect3-lpfc.pdf|paper}}, [[http://www.zlib.net/|gzip]]) +
-      * Fix a bound on Compression-ratio, minimize the number of phrases (hence decompression time). +
-      * Fix a bound on Decompression-time, minimize the compression ratio. +
-  * Energy-aware algorithms: examples of inefficient algorithms which use less battery!+
biss2010/start.1266666411.txt.gz · Ultima modifica: 20/02/2010 alle 11:46 (14 anni fa) da Paolo Ferragina