**Teachers**

- Inge Li Gørtz, inge@dtu.dk, office hours Friday 12.00-12.45, building 322, office 018.
- Eva Rotenberg, erot@dtu.dk, office hours Tuesday 12.30-13.00.
- Philip Bille, phbi@dtu.dk

**When and where** Monday 8.15-12, TBA.
The course runs in the DTU fall semester. There is no teaching during the autumn break.

**Prerequisites** Undergraduate level courses in algorithms and data structures (comparable to 02105 + 02110) and mathematical maturity. You should have a working knowledge of algorithm analysis (e.g. asymptotic notation, worst case analysis, amortized analysis, basic analysis of randomized algorithms), data structures (e.g. stacks, queues, linked lists, trees, heaps, priority queues, hash tables, balanced binary search trees, tries), graph algorithms (e.g. BFS, DFS, single source shortest paths, minimum spanning trees, topological sorting), dynamic programming, divide-and-conquer, and NP-completeness (e.g. basic reductions).

**The weekplan is preliminary** It will be updated during the course. Under each week there are a number of suggestions for reading material for that week's lecture. You are not expected to read all of the papers; the list shows papers and notes where you can read about the subject discussed at the lecture.

| Week | Topics | Slides | Weekplan | Mandatory | Material |
|---|---|---|---|---|---|
| | Streaming I: distinct element sketch. | 1x1 · 4x1 | (tba) | | R. Morris: Counting Large Numbers of Events in Small Registers; P. Flajolet: Approximate Counting: A Detailed Analysis; J. S. Vitter: Random Sampling with a Reservoir |
| | Streaming II: count min sketch. | 1x1 · 4x1 | (tba) | | TBA. |
| | Streaming III: graph sketching. | 1x1 · 4x1 | (tba) | TBA. | TBA. |
| | I/O I | 1x1 · 4x1 | (tba) | | TBA. |
| | I/O II: B-trees. | 1x1 · 4x1 | (tba) | | TBA. |
| | I/O III: Bε-trees. | 1x1 · 4x1 | (tba) | TBA. | TBA. |
| | Approximate Data Structures I: Bloom filters. | 1x1 · 4x1 | (tba) | | TBA. |
| | Approximate Data Structures II: Approximate Near Neighbour. | 1x1 · 4x1 | (tba) | | TBA. |
| | Approximate Data Structures III. | 1x1 · 4x1 | (tba) | TBA. | TBA. |
| | Parallel I: map-reduce. | 1x1 · 4x1 | (tba) | | TBA. |
| | Parallel II: distributed computing. | 1x1 · 4x1 | (tba) | | TBA. |
| | Parallel III: distributed computing. | 1x1 · 4x1 | (tba) | TBA. | TBA. |

Use the template.tex file to prepare your hand-in exercises. Do not repeat the problem statement in your hand-in. Compile using LaTeX. Upload the resulting PDF file (and only this file) via DTU Learn. The finished PDF must be at most 2 pages. An exercise from week x must be handed in no later than Sunday in week x before 20.00.

**Collaboration policy for mandatory exercises**

- You may collaborate in groups of up to 3 students on the mandatory exercises. The collaborators must be listed in your solution (see template).
- Collaboration is limited to discussion of ideas only, and you should write up the solutions entirely on your own.
- Being a collaborator is a symmetric relation. Only list people as collaborators if they also list you as a collaborator.
- Do not use or seek out solutions from previous years of the course, solutions from similar courses, or solutions found on the internet.

**How should I write my mandatory exercises?** The ideal writing format for mandatory exercises is classical scientific writing, such as the writing found in the peer-reviewed articles listed as reading material for this course (not textbooks and other pedagogical material). One of the objectives of this course is to practice and learn this kind of writing. A few tips:

- Write things directly: Cut to the chase and avoid anything that is not essential. Test your own writing by answering the following question: “Is this the shortest, clearest, and most direct exposition of my ideas/analysis/etc.?”
- Add structure: Don’t mix up description and analysis unless you know exactly what you are doing. For a data structure, explain the following things separately: the contents of the data structure, how to build it, how to query/update it, correctness, analysis of space, analysis of query/update time, and analysis of preprocessing time. For an algorithm, explain separately what it does, correctness, analysis of time complexity, and analysis of space complexity.
- Be concise: Convoluted explanations, excessively long sentences, fancy wording, etc. have no place in scientific writing. Do not repeat the problem statement.
- Try to avoid pseudocode: Generally, aim for a human-readable description of algorithms that can easily and unambiguously be translated into code.
- Examples for support: Use figures and examples to illustrate key points of your algorithms and data structures.

**How much do the mandatory exercises count in the final grade?**
The final grade is an overall evaluation of your mandatory exercises and the oral exam combined. Thus, there is no precise division of these parts in the final grade. However, expect that (in most cases, and under normal circumstances) the mandatory exercises account for a large fraction of the final grade.

**Can I write my assignments in Danish?**
Yes. You are very welcome to hand in your assignments in Danish.

**What do I do if I want to do an MSc/BSc thesis or project in Algorithms?** Great! Algorithms is an excellent topic to work on :-) and Algorithms for Massive Data Sets is designed to prepare you to write a strong thesis. Some basic tips and points:

- Let us know well in advance: Identifying an interesting problem in algorithms that matches your interest can take time. With enough time to go over the related literature and study up on relevant topics, your project will likely be more successful. It may also be a good idea to do an initial “warm up” project before a large thesis to test ideas or survey an area.
- Join the community: It is a very good idea to enter the local algorithms community at DTU and the Copenhagen area to get a feel for what kind of stuff you could work on for your thesis and what thesis work in algorithms is about. Talk to other students doing thesis work in algorithms. Go to algorithms talks and thesis defenses in algorithms.
- Collaborate: We strongly encourage you to do your thesis in pairs. We think that having a collaborator to discuss with greatly helps in many aspects of thesis work in algorithms. Our experience confirms this.
- No strings attached: Choosing a topic for your thesis is important. You are welcome to discuss master's thesis topics with us without pressure to actually write your thesis in algorithms. We encourage you to carefully select your topic.