Teacher: Paolo Ferragina
Question time: Monday, 16-19, Room 295, Dept of Computer Science.
Time table: Monday, 9-11 (Room C1); Wednesday, 14-16 (Room B).
In this course we will study, design and analyze (theoretically and experimentally) advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. These algorithmic tools will be designed and analyzed in several models of computation— such as RAM, 2-level memory, cache-oblivious, streaming— in order to take into account the architectural features and the memory hierarchy of modern PCs.
Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces algorithmic solutions aimed at minimizing the use of some computational resources like time, space, communication, I/O, etc. Some of these solutions will be discussed at an experimental level, in order to introduce proper engineering and tuning tools for algorithmic development.
If you wish to refresh your mind on Algorithms and Data Structures, I suggest you to follow the Video Lectures by Erik Demaine and Charles Leiserson, specifically Lectures 1-5, 7 and 10. There it is missing the part on basic graph problems (representation, DFS, BFS, topological sort) which you may browse in any book, such as Introduction to Algorithms by Cormen-Leiserson-Rivest-Stein, third edition.
Since the course covers many different algorithmic themes and will deal with advanced results and issues which are not yet part of books, I'll distribute notes and copies of papers/books which will constitute the reading material of the course.
I'll project no slides for teaching, and use just the old-fashioned blackboard.
Here follows a list of books from which I've taken some parts (details in the lectures):
|Background: If you wish to refresh your mind on basic Algorithms and Data Structures, please look at the above section “Background”.||Here it is the lecture's scheme to take notes.|
|21/09/2009||Introduction to the course: algorithm, asymptotics, computational models, experiments. Algorithm Engineering as a methodology: What's new? An example: max-subArray sum.||✔ [B]: column 7 “Algorithm design techniques”.|
|23/09/2009||Back of the Envelope calculations on the impact of Disks. Uniform sampling of M integers in [1,N].|| ✔ [B]: column 11 “Searching”.|
✔ Look also at this post and this other post for the reservoir sampling.
|Material first lecture||my notes|
|28/09/2009||No lecture today !|
|30/09/2009||Sorting on 2-level memories: multi-way mergesort, snow plow, and the complexity of sorting vs permuting.||Lower bound notes and Sect 5.7.1 of [MS09]|
|05/10/2009||Sorting on 2-level memories: sample-sort with multi-pivots. Sorting with multi-disks (sketch of Greed sort).||Chap 9 of [PS]|
|Material second lecture||my notes|
|07/10/2009||String sort: qsort-based, multi-key quicksort, radix sort||Chap 9 of [PS]|
|12/10/2009||String search: array of pointers, Compacted trie, Blind Trie, LPFC||Chap 28 of [MS05]|
|Material third lecture||my notes|
|14/10/2009||Scan-based text search: Karp-Rabin fingerprint, search with one pattern, search with many patterns.||Sect 4.4 in [G]|
|19/10/2009||Scan-based text search: Knuth-Morris-Pratt automata. The tool AGREP.||pp 149-154 in [M] and Sect 4-4.2 in [G]|
|Material fourth lecture||my notes|
|26/10/2009||Competitive analysis and on-line algorithms: search in a list with biased accesses. Move-To-Front: definition and analysis. Paging: LRU, FIFO and LFU.|
|28/10/2009||Competitive analysis and optimality of LRU. On-line randomized algorithms and oblivious adversary. Marking Algorithm and its analysis.|
|Material fifth lecture|
|02/11/2009||no lecture today!|
|04/11/2009||no lecture today!|
|09/11/2009||Search Engines: definition, structure. Inverted Lists: compressed storage (gamma, delta, Rice, interpolative, byte-aligned, <s,c>, PForDelta). Query resolution: two terms, more terms.|
|11/11/2009||Search Engines: text-based and link-based ranking.|
|Material sixth lecture||my notes|
|16/11/2009||Data compression: notation and terminology. Entropy. Huffman: classic and canonical.|
|18/11/2009||Data compression: Arithmetic.|
|20/11/2009||Data compression: Lempel-Ziv (LZ77, gzip, LZ78), Burrows-Wheeler Transform (bzip), MTF, RLE.|
|Material seventh lecture. Study Chap 4 of [WMB] except sections “computing huffman length”, “maintaining cumulative counts”, “prediction by partial matching”, “dynamic markov compression”, “word-based compression”, “lzw”, “synchronization”. In the paper on “MTF analysis”, study only the proof of theorem 1.||my notes|
|23/11/09||no lecture today!|
|25/11/09||no lecture today!|
|27/11/09||Suffix Trees and Suffix Arrays: definition, properties and example of applications.||Material eigth lecture.|
|30/11/09||Hashing: recalling the basics in chaining, Universal.|
|02/12/09||Perfect and Cuckoo hashing.||Material nineth lecture. No proof of theo/cor 11.6-11.8|
|07/12/09||Bloom filter: definition, properties, applications.||Material tenth lecture|
|09/12/09||Randomized dictionary: Skip lists.||Material eleventh lecture|