Computational Biology Trainings

by András Aszódi

Statistics with R

The aim of this course is to teach you how to perform basic statistical analysis using R. First we review the foundations (sampling theory, discrete and continuous distributions), then we focus on classical hypothesis testing. This course will improve your general knowledge of statistics. We cannot go into the specific data analysis problems of your particular project.


This course teaches the same statistical concepts as the Basic statistics with Python training but uses the R programming language.

Out of scope

This course will not teach you bioinformatics. In particular, no high-throughput sequencing data will be used because they are impractically large, and not everyone on campus is working with sequencing.

If you are interested in the statistical background of gene expression analysis with high-throughput sequencing, then please take our RNA-Seq data analysis course.


Basic familiarity with R is required. In particular the following skills are necessary:

If you have attended our R as a programming language training then you are well equipped to take this course.

"Bring Your Own Data"

You can bring your own data to this course and run Student's t-test on it.

Please prepare a comma-separated values (CSV) file with UNIX line endings (\n) that contains two columns, one for each of the two groups of data whose means you would like to compare. The two groups need not be the same size.
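As a rough illustration of what such a file might look like and how it could be analysed, here is a minimal sketch in R. The column names (control, treated) and all numbers are made up for demonstration, and the file is written to a temporary location so that the example is self-contained. It assumes that the shorter column is padded with empty fields, which read.csv() turns into NA for numeric columns. Which variant of the t-test is used in the course is not stated here: t.test() performs Welch's test by default, and var.equal = TRUE requests the classical Student's two-sample test.

```r
# Hypothetical example file; in practice, set csv_path to your own CSV file.
csv_path <- tempfile(fileext = ".csv")

# Made-up data: two columns, one per group. The second group has one value
# fewer, so its last field is left empty.
writeLines(c("control,treated",
             "5.1,6.3",
             "4.8,6.9",
             "5.6,7.1",
             "5.0,"),
           csv_path)

# Empty fields in numeric columns are read as NA.
dat <- read.csv(csv_path)

# Extract the two groups and drop the padding NAs.
group1 <- dat[[1]][!is.na(dat[[1]])]
group2 <- dat[[2]][!is.na(dat[[2]])]

# Classical Student's two-sample t-test (assumes equal variances).
# Omit var.equal = TRUE to get Welch's test, which is R's default.
result <- t.test(group1, group2, var.equal = TRUE)
print(result)
```

If the equal-variance assumption is doubtful for your data, the default Welch variant is usually the safer choice.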

Practical information

Number of participants: minimum 5, maximum 10.

Length: The course runs over two half-days, each from 09:00 to 13:00 with two breaks.