Clustering Algorithms

Topics

In this course we study an important tool for analyzing collected data: clustering. Clustering is the process of dividing data into useful or sensible groups. A sensible division should reflect the data's natural structure. Sometimes the goal is that each cluster should contain as many items of a similar kind as possible (for example, in data compression). Clustering is a very natural way to analyze and structure data. In the natural sciences in particular, we often work with data whose structure is unknown to us. One example is human DNA, which humankind is trying to decode. Clustering can be a very powerful tool in such cases.
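
As a rough illustration of what "dividing data into groups" can mean in practice, the following sketch implements a basic form of k-means (Lloyd's iteration), one of the methods listed in the lecture notes. It is not taken from the course material: the function names, the sample points, and the choice of initial centers are assumptions made for this example, and the result of this procedure generally depends on how the initial centers are chosen.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centers, max_iterations=100):
    """Alternate between assigning points to their nearest center and
    moving each center to the mean of its assigned points.

    Returns the final centers and the clusters from the last assignment step.
    """
    centers = list(initial_centers)
    k = len(centers)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous center.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # centers no longer move, so the assignment is stable
        centers = new_centers
    return centers, clusters


# Hypothetical sample data: two visually separate groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = k_means(points, initial_centers=[(0, 0), (10, 10)])
print(clusters)
```

With these well-separated points and one initial center in each group, the procedure recovers the two groups. A different choice of initial centers can lead to a different, possibly worse, division.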

Module Information

Module: MuA
Course number: L.079.05721
2 hours of lectures + 1 hour of tutorials per week (V2 + Ü1 SWS, contact time)
4 ECTS credits (workload)
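Useful previous knowledge: Einführung in Berechenbarkeit, Komplexität und formale Sprachen (Introduction to Computability, Complexity, and Formal Languages); Datenstrukturen und Algorithmen (Data Structures and Algorithms); Wahrscheinlichkeitsrechnung (Probability Theory)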

Exams

The oral exams for Clustering Algorithms are scheduled individually. Send your request for an examination date to Claudia Jahn (claudia.jahn(at)upb.de) and to the second professor of the module, using the email form for the type A exam from http://www-old.cs.uni-paderborn.de/en/students/examinations/registering-for-examinations.html.

Dates

Lecture: Thursday, 11-13, F1.110
Tutorials: Monday, 12-13, F2.211 (CANCELLED)
Thursday, 13-14, F1.110

Note: The tutorials start in the second week.

Lecture Notes, Slides, and Exercises

Lecture

Introduction (Printer-friendly)
k-Means (Printer-friendly) [Updated 06.11.15 - 15:00]
KLD-Clustering (Printer-friendly)
k-Means++ (Printer-friendly)
Constant Factor k-Means (Printer-friendly)
Agglomerative Clustering (Printer-friendly) [Updated 16.12.15 - 17:37]
DBSCAN (Printer-friendly)
Johnson-Lindenstrauss
SVD (Printer-friendly)
Mixture Models and the EM Algorithm [Updated 04.02.2016 - 10:35]

Exercise Sheets

Exercise 1
Exercise 2
Exercise 3
Exercise 4
Exercise 5
Exercise 6
Exercise 7
Exercise 8 [Updated Ex.2 12.01.16 - 09:05]
Exercise 9
Exercise 10
Exercise 11

Literature

David J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, or online: http://www.inference.phy.cam.ac.uk/itprnn/book.pdf
Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer Science+Business Media, 2006