General Info

Teacher

Teaching Assistants

Lectures/exercise sessions

  • Tuesdays 18.00-22.00.
  • Location

    Live streaming and recordings of the lectures

    Physical and virtual attendance to lectures and exercise sessions

    Classroom capacity has been reduced this year due to the COVID-19 pandemic. As a result, students will need to take turns attending lectures and exercise sessions physically and virtually. To find out which weeks you should attend physically and which weeks you should attend virtually, see the information on the welcome page of the course on CampusNet.

    Materials

    The lectures are backed by reading material from various sources. These should be seen as suggestions: there is a huge community behind the tools we are working with in this course, so many other good resources exist. Suggested reading materials can be found in the Weekplan below.

    Lecture slides and exercises

    Lecture slides and exercises are made available as Colab notebooks. See the Weekplan below.

    Weekplan

    Further information and materials will be posted soon. In the first 4 weeks, we'll introduce the basic computational tools for data science with Python. In weeks 5-12, we will cover more advanced topics such as streaming, parallel computation and relational databases.

    Week 1: Sept 1
  • Python brush-up. (No lecture)
  • Slides: Self-study
  • Exercises: Self-study
  • Materials: A Whirlwind Tour of Python, learnpython.org

    Week 2: Sept 8
  • Numerical Computing with NumPy.
  • Getting started with Jupyter and Google Colaboratory.
  • Slides: Colab notebook
  • Exercises: Colab notebook
  • Materials: Python Data Science Handbook, Ch. 2

    Week 3: Sept 15
  • Manipulating Tabular Data with Pandas.
  • Exploratory Data Analysis with Pandas.
  • Benchmarking and profiling.
  • Slides: Colab notebook
  • Exercises: Colab notebook
  • Materials: Python Data Science Handbook, Ch. 3, Kaggle Pandas tutorials, Python for Data Analysis Book, from Ch. 5

    Week 4: Sept 22
  • Data Visualisation with Matplotlib, Pandas profiler, plotly.
  • Statistical analysis and machine learning with scikit-learn.
  • Slides: Colab notebook
  • Exercises: Colab notebook
  • Materials: Python Data Science Handbook, Ch. 4-5.

    Week 5: Sept 29
  • Presentation of Project 1
  • Exercise session on Project 1

    Week 6: Oct 6
  • No lecture. Exercise session on Project 1.

    Week 7: Oct 13
  • Holiday week

    Week 8: Oct 20
  • Apache Spark 1

    Week 9: Oct 27
  • Apache Spark 2

    Week 10: Nov 3
  • Presentation of Project 2
  • Databases

    Mandatory assignments

    Project    Released               Due                          Problem file  Contribution to final grade
    Project 1  Tuesday, September 29  Monday, November 2, 20:00                  37.5 %
    Project 2  Tuesday, November 3    Monday, November 30, 20:00                 37.5 %
    Project 3  Tuesday, December 1    Sunday, December 23, 20:00                 25 %

    Frequently Asked Questions

    Can I skip lectures/classes due to conflicting courses, travelling, ...? There is no attendance requirement, but we recommend attending for support and coaching.