General Info

Teachers

When and where Friday 9-12, Bldg. 421, room 002. The course starts on September 8, 2017

Content Advanced state-of-the-art compact data structures. Entropy, coding, arrays, bitvectors, permutations, sequences, trees, graphs, grids, compressed text indexes, dynamic data structures.

Course format The course is a reading course, meaning that no ordinary lectures will take place. Instead, everyone is expected to read the material and do the exercises on their own at home. At our meetings we will go through the material together, discussing the difficult parts, repeating proofs on the whiteboard, etc. We will also explore open problems and identify possibly fruitful research directions. Everyone is expected to be able and willing to contribute to the discussion.

Material During the course we will go through the following book: Compact Data Structures: A Practical Approach, Gonzalo Navarro. We will also use research articles containing new material not present in the book.

Prerequisites Undergraduate level courses in algorithms and data structures (comparable to 02105 + 02110) and mathematical maturity. You should have a working knowledge of algorithm analysis (e.g. asymptotic notation, worst case analysis, amortized analysis, basic analysis of randomized algorithms) and data structures (e.g. stacks, queues, linked lists, trees, heaps, priority queues, hash tables, balanced binary search trees, tries).

Weekplan

The weekplan is preliminary. It will be updated during the course. Under each week there is the material that will be discussed during the lecture and the exercises that should be done by the lecture.

Week Topics Exercises Material
Entropy and coding Entropy-Huffman
  • Chapter 2
Arrays Bit tricks and partial sums
  • Chapter 3
Bitvectors (1) Entropy-compressed bitvectors
  • Sections 4.1 - 4.3 (included)
- - -
Bitvectors (2) and hashing Run-length strings
  • Sections 4.4 - 4.5 (included)
- - -
- - -
Wavelet trees Dynamic strings
  • Sections 6.2, 6.4, 10.1
Parentheses (1) -
  • From beginning of Chapter 7 until the end of Section 7.1 (included)
Parentheses (2), trees, RMQ 3D range minimum
  • Section 7.2, section 7.4.1, introduction of chapter 8 (pag 211,212), section 8.2 excluding sub-section 8.2.1 (so just pages 222-228)
Compressed indexes (1): CSA, FM-index Run-length compressed suffix arrays
  • Intro Chapter 11, sections 11.1, 11.2
Compressed indexes (2): The Burrows-Wheeler Transform Space-efficient CSA construction
  • Sections 11.3
Compressed indexes (3): XBWT, LZ-indexing, run-length indexes -
  • 11.6.2, intro 13.2, 13.2.1, 13.2.2, 13.2.4 until section "Sampling" (incl)
Optimal-time locate on FM indexes RLFM indexes
  • paper (Lemma 2, Lemma 3, Lemma 6, and Theorem 2)