Computational Biology Trainings

by András Aszódi

Basic statistics with Python

"It's easy to lie with statistics. It's hard to tell the truth without it."
— Andrejs Dunkels, Latvian-Swedish mathematician

The aim of this course is to teach you how to perform basic statistical analysis using Python. First, we review the foundations (sampling theory, discrete and continuous distributions), then we focus on classical hypothesis testing. This course will improve your general knowledge of statistics. We cannot go into the specific data analysis problems of your particular project.


This course teaches the same statistical concepts as the Basic statistics with R training but uses the Python programming language.

Out of scope

This course will not teach you bioinformatics. In particular, no high-throughput sequencing data will be used because they are impractically large, and not everyone on campus is working with sequencing.

If you are interested in the statistical background of gene expression analysis with high-throughput sequencing, then please take our RNA-Seq data analysis course.


Python 3 knowledge is required. In particular, the following skills are necessary:

If you have attended our Python programming training, then you are well equipped to take this course. We will use NumPy, SciPy, Pandas and Matplotlib; the details needed to use these packages will be explained during the course.

Practical information

Number of participants: minimum 5, maximum 10.

Length: The course takes two half-days, each from 09:00 to 13:00, with two breaks.