Topics
This course studies clustering, an important tool for analyzing collected data. Clustering is the process of dividing data into useful or sensible groups. A sensible division should reflect the data's natural structure. Sometimes the goal is for each cluster to contain as many items of a similar kind as possible (for example, in data compression). Clustering is a very natural way to analyze and structure data. Especially in the natural sciences, we often work with data whose structure is unknown to us. One example is human DNA, which researchers are still trying to decode. Clustering can be a very powerful tool in such cases.
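To make the idea of "dividing data into sensible groups" concrete, here is a minimal sketch of Lloyd's iteration for the k-means problem, one of the topics listed in the lecture below. It is an illustration only, not the formulation used in the lecture: the function names `assign` and `lloyd`, the toy points, and the hand-picked starting centers are all assumptions made for this example.

```python
import math

def assign(points, centers):
    """For each point, return the index of its nearest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]

def lloyd(points, centers, rounds=10):
    """Alternate between assigning points to centers and recomputing centers.

    Stops early once the centers no longer change. `rounds` is an
    arbitrary cap chosen for this sketch.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(rounds):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                # New center is the coordinate-wise mean of the cluster.
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the old center if no point was assigned to it.
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

# Toy data (hypothetical): two visibly separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = lloyd(points, [(0, 0), (10, 10)])
```

On this toy input the first three points end up in one cluster and the last three in the other. In general the result depends on the starting centers, so this sketch should not be read as a guarantee of finding a good clustering.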
Module Information
- Module: MuA
- Course number: L.079.05721
- Contact time: 2 hours of lectures + 1 hour of tutorials per week (V2 + Ü1 SWS)
- 4 ECTS credits (workload)
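- Useful previous knowledge: Einführung in Berechenbarkeit, Komplexität und formale Sprachen (Introduction to Computability, Complexity, and Formal Languages), Datenstrukturen und Algorithmen (Data Structures and Algorithms), Wahrscheinlichkeitsrechnung (Probability Theory)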
Exams
The oral exams for Clustering Algorithms have to be scheduled individually. Send your request for an examination date to Claudia Jahn (claudia.jahn(at)upb.de) and to the second professor of the module, using the email form for the type A exam from http://www-old.cs.uni-paderborn.de/en/students/examinations/registering-for-examinations.html.
Dates
- Lecture: Thursday, 11-13, F1.110
- Tutorial: Monday, 12-13, F2.211 (CANCELLED)
- Tutorial: Thursday, 13-14, F1.110
Note: The tutorials start in the second week.
Lecture Notes, Slides, and Exercises
Lecture
- Introduction (Printer-friendly)
- k-Means (Printer-friendly) [Updated 06.11.15 - 15:00]
- KLD-Clustering (Printer-friendly)
- k-Means++ (Printer-friendly)
- Constant Factor k-Means (Printer-friendly)
- Agglomerative Clustering (Printer-friendly) [Updated 16.12.15 - 17:37]
- DBSCAN (Printer-friendly)
- Johnson-Lindenstrauss
- SVD (Printer-friendly)
- Mixture Models and the EM Algorithm [Updated 04.02.2016 - 10:35]
Exercise Sheets
- Exercise 1
- Exercise 2
- Exercise 3
- Exercise 4
- Exercise 5
- Exercise 6
- Exercise 7
- Exercise 8 [Updated Ex.2 12.01.16 - 09:05]
- Exercise 9
- Exercise 10
- Exercise 11
Literature
- David J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003; also available online: http://www.inference.phy.cam.ac.uk/itprnn/book.pdf
- Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer Science+Business Media, 2006