Linguistic Data: Quantitative Analysis and Visualisation: linguistic theory

  • Instructors: Ilya Schurov and Ivan Pozdnyakov


Date | Topics | Links | Video
Jan 11 | Introduction. Quantitative linguistic research and data types. | R basic notebook | lecture, practice
Jan 18 | Measures of statistical dispersion. Variance and standard deviation. | script | lecture, practice
Jan 25 | Statistical hypothesis testing. Binomial test. | | lecture, practice
Feb 1 | Estimating the mean. Central limit theorem. | | video
Feb 8 | One-sample t-test. | seminar Rmd preview | video
Feb 15 | Review. | Rmd, preview | video
Feb 27 | Two-sample t-test. One-sided and two-sided alternatives. | | video
March 15 | Chi-squared test. | | video
April 5 | Multiple comparisons problem. ANOVA (analysis of variance). | | video
April 12 | Confidence intervals. Correlation. | | video
April 19 | Bivariate regression. | | video
April 26 | Multivariate regression and causal questions. | | video
May 15 | More on linear regressions. Categorical variables in regression models. | | video
May 17 | Non-linear dependence in linear regressions. Interactions. | | video
May 24 | Logistic regression. | | video
May 29 | Mixed effects models. | | video


Academic ethics policy: you have to do your homework by yourself. In case of academic cheating (e.g. if you copy someone else's work), your work will receive a grade of 0 and the program supervisor will be notified. If you feel that you are stuck with the homework, ask the instructors for advice and hints.

Late penalties: if you submit late, your grade will be multiplied by exp(-t / 86400), where t is the number of seconds elapsed since the due date. For example, if you submit one day late (t = 86400), your grade will be multiplied by exp(-1) ≈ 0.368.
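
To get a feel for how quickly the penalty grows, here is a minimal sketch of the formula above, written in Python for illustration (in R the same multiplier is simply exp(-t / 86400)). The function name late_multiplier and the sample delays are hypothetical and chosen only for this example. The treatment of on-time submissions (multiplier 1 when t is zero or negative) is an assumption, since the policy only describes late submissions.

```python
import math

SECONDS_PER_DAY = 86400  # the constant from the penalty formula


def late_multiplier(seconds_late: float) -> float:
    """Multiplier exp(-t / 86400) for a submission t seconds after the due date.

    Assumption (not stated in the policy): on-time or early submissions
    (t <= 0) keep the full grade, i.e. the multiplier is 1.
    """
    if seconds_late <= 0:
        return 1.0
    return math.exp(-seconds_late / SECONDS_PER_DAY)


# Illustrative delays: one hour, half a day, one day, two days.
for label, seconds in [("1 hour", 3600), ("12 hours", 43200),
                       ("1 day", 86400), ("2 days", 172800)]:
    print(f"{label:>8}: grade x {late_multiplier(seconds):.4f}")
```

Under this formula a one-hour delay costs roughly 4% of the grade, a twelve-hour delay roughly 39%, and a two-day delay roughly 86%.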

Extensions: you can ask for up to two extensions of homework due dates during the course. Each extension is one week long. Extensions granted for valid reasons (e.g. illness) do not count towards this limit.

Homework #1

Complete the following chapters in the DataCamp Introduction to R course:

Due date: 2021-01-31 23:00 MSK. Late penalties are applied to each chapter independently.

Homework #2

Due date: 2021-02-14 23:00 MSK.

Homework #3

Due date: 2021-03-16 23:00 MSK


During this course we will use R as a programming language and RStudio as a GUI.

How to install R and RStudio?

1. Download R (you can choose another mirror if you wish) and install it on your computer. Make sure you do this before installing RStudio.

2. Download RStudio (you need the RStudio Desktop version under the Open Source License) and install it on your computer. It is recommended to create a shortcut for RStudio during installation.

It is also possible to avoid installing anything on your PC by using an online version of RStudio.

For successful submission of assignments you should be able to create and save R code files (.R) and RMarkdown files (.Rmd).

Final project

Online course

We will use some parts of the DataCamp online course Introduction to R. (It will be made available to students free of charge.)


  • Gries, Stefan (2013). Statistics for Linguistics with R: A Practical Introduction (2nd revised edition). Berlin: De Gruyter Mouton. HSE library link
  • Levshina, Natalia (2015). How to Do Linguistics with R: Data Exploration and Statistical Analysis. Amsterdam: John Benjamins Publishing Company. HSE library link
  • Baayen, Harald (2008). Analyzing Linguistic Data: A Practical Introduction to Statistics. Cambridge: Cambridge University Press. link