Clustering and Classification with Python
This course teaches clustering and classification techniques using scientific Python packages such as SciPy, scikit-learn and statsmodels.
Instructor: András Aszódi.
Topics
- Hierarchical clustering, k-means clustering.
- Gaussian mixtures, the expectation-maximization (EM) algorithm.
- Quadratic and linear discriminant analysis.
- Principal component analysis (PCA).
- Cross-validation techniques.
- Logistic regression and its connection with discriminant analysis. Multinomial regression.
Out of scope
"Machine learning" is a huge subject, and this course covers only the basics of clustering and classification.
Please note that we have no capacity to analyse private data sets.
Prerequisites
Please note that this is an advanced course. The following knowledge is required:
- Proficiency in Python 3. If you attended our Python programming course, you are well prepared.
- Familiarity with NumPy, pandas and a bit of Matplotlib. Attend our Python scientific data packages course to learn NumPy and pandas.
- Familiarity with linear algebra: vectors, matrices, scalar products, etc.
- Solid knowledge of probability theory and statistics.
Practical information
Number of participants: minimum 5, maximum 12.
Length: The course takes two half-days, from 09:00 to 13:00 on each day.