NEWS
 News during the course will be updated here.
 2015.08.31 : The lecture on August 31 has been moved to September 1, 10-12, in BL33; see below or the changed schedule in TimeEdit.
 2015.08.24 : Lecture notes can be downloaded from the lecture plan below and will be uploaded at the latest the day before each lecture.
 2015.08.24 : A link to an English-Swedish statistics dictionary from Aalto University, Finland, can be found here.
 2015.08.17 : It is allowed to bring the course book to the written exam, but only the original version or the international edition of the book.
Aim
The aim of this course is to introduce statistical concepts and principles in enough detail to make it possible to perform statistical analyses in situations where standard textbook formulas do not apply. This requires a deeper and more mathematical understanding of probability and statistical inference. The focus will be on those parts of the theory that will be most useful for practical statistical work.
Intended audience
This course is given mainly to students in the third year of the Bachelor’s programme Statistik och Dataanalys. It is also offered to students on the Master’s programme Statistics and Data Mining who have had little previous exposure to probability theory and statistical inference above the basic level.
Outline
The first half of the course covers probability theory, with particular emphasis on univariate and multivariate random variables and their distributions. Additional concepts covered in this part are conditional distributions, distributions of functions of random variables, the law of large numbers and the central limit theorem, and simulation methods.
The second half of the course is concerned with statistical inference. Maximum likelihood and its properties are presented in detail. Bayesian inference is given an extensive treatment. Point and interval estimation, sampling distributions and hypothesis testing are also covered.
Organization
The course is organized into 14 lectures, 5 exercise sessions, 2 computer labs and 6 tutorials. The lectures include a presentation of the theory and its application in practical work. The theory is illustrated with problem-solving exercises. The computer labs give students an opportunity to deepen their understanding of the theory and its applications in a practical computer-aided setting. The tutorials encourage student-centred learning with a greater opportunity for learning by doing. A detailed plan of the lectures, exercise sessions, computer labs and tutorials is given below.
Literature
 Probability and Statistics by DeGroot and Schervish, Pearson, Fourth edition, 2011. The book’s web site can be found here.
 My Slides.
Lectures
What? | When? | Where? | Read? | Contents | Exercises
Lecture 1 | Tue Aug 25 13-15 | JvN | 1.1-1.11 and 2.1-2.3 | Review of basic probability calculus | 1.7.5, 1.7.7, 1.8.7, 2.3.4, 2.3.13
Lecture 2 | Wed Aug 26 13-15 | JvN | 3.1-3.3 | Univariate random variables, density and distribution functions | 3.1.6, 3.2.2, 3.2.8, 3.3.4, 3.3.5
Lecture 3 | Fri Aug 28 10-12 | JvN | 3.3 (The Quantile function), 3.8 (pages 167-169 and 172-173) | Quantiles. Functions of random variables. | 3.8.1, 3.8.2, 3.8.3, 3.8.6, 3.8.8
Tutorial 1 | Fri Aug 28 13-14 | JvN | Solve exercises from Sections 1-3 | Various exercises from Sections 1-3 |
Lecture 4 | Tue Sep 1 10-12 | BL33 | 3.4-3.7, 5.10 | Bivariate, marginal, conditional and multivariate distributions | 3.4.4, 3.5.3, 3.6.2, 3.6.4, 3.7.8
Exercise 1 | Wed Sep 2 13-15 | JvN | Solve exercises from Sections 1-3 | Various exercises from Sections 1-3 |
Lecture 5 | Thu Sep 3 10-12 | JvN | 4.1-4.7 | Mean, variance, moment generating function. Gauss approximation formulas. Conditional expectation and variance. | 4.1.1, 4.2.2, 4.2.4, 4.2.9, 4.3.1, 4.3.5, 4.4.6, 4.4.8, 4.5.3, 4.6.10, 4.7.7
Lecture 6 | Tue Sep 8 13-15 | JvN | 5.1-5.2, 5.4, 5.6-5.9 | Common discrete and continuous distributions | 5.2.6, 5.2.7, 5.4.8, 5.6.2, 5.6.6, 5.6.17, 5.7.1, 5.7.6, 5.8.3
Lecture 7 | Thu Sep 10 13-15 | JvN | 6.1-6.3 (skip “The Delta Method”) | Law of large numbers and central limit theorem | 6.2.2, 6.2.3, 6.2.5, 6.3.9
Tutorial 2 | Thu Sep 10 15-16 | JvN | Solve exercises from Sections 4-6 | Various exercises from Sections 4-6 |
Exercise 2 | Tue Sep 15 10-12 | JvN | Solve exercises from Sections 4-6 | Various exercises from Sections 4-6 |
Lecture 8 | Thu Sep 17 10-12 | JvN | 12.1-12.2, pages 170-171, 12.3 | Simulation | 3.8.11, 12.1.3, 12.3.4
Computer lab 1 | Tue Sep 22 13-15 | PC 35 | Lab 1 | Simulating from common distributions. Functions of variables. Central limit theorem. |
Lecture 9 | Thu Sep 24 10-12 | JvN | 7.1-7.3 | Statistical inference. Bayesian inference. |
Lecture 10 | Tue Sep 29 15-17 | JvN | 7.1-7.3 | Statistical inference. Bayesian inference. | 7.2.2, 7.2.10, 7.2.11, 7.3.10, 7.3.11, 7.3.19
Tutorial 3 | Wed Sep 30 13-14 | JvN | Solve exercises from Sections 7.2-7.3 | Various exercises from Sections 7.2-7.3 |
Lecture 11 | Thu Oct 1 10-12 | JvN | 7.4-7.6 | Point and interval estimation. Maximum likelihood. Method of moments. | 7.4.12, 7.5.6, 7.5.7, 7.5.9, 7.5.11, 7.6.2, 7.6.9, 7.6.18, 7.6.20, 7.6.23
Computer lab 2 | Fri Oct 2 10-12 | PC 35 | Maximum likelihood estimates and standard deviations from numerical optimization methods. | |
Tutorial 4 | Fri Oct 2 13-14 | JvN | Solve exercises from Section 7 | Various exercises from Section 7 |
Exercise 3 | Tue Oct 6 10-12 | JvN | Solve exercises from Section 7 | Various exercises from Section 7 |
Lecture 12 | Thu Oct 8 10-12 | JvN | 8.1-8.2 and 8.4 | Sampling distributions. Chi-squared and Student-t. | 8.1.9, 8.2.10
Lecture 13 | Fri Oct 9 13-15 | JvN | 8.5, 8.7-8.8 | Confidence intervals. Unbiased estimators. Fisher information. | 8.5.6, 8.5.7, 8.7.1, 8.8.3, 8.9.15
Lecture 14 | Tue Oct 13 10-12 | JvN | 9.1, 9.5 and 9.7 | Hypothesis testing | 9.1.3, 9.5.4, 9.7.7
Tutorial 5 | Tue Oct 13 13-14 | JvN | Solve exercises from Sections 8-9 | Various exercises from Sections 8-9 |
Exercise 4 | Thu Oct 15 10-12 | JvN | Solve exercises from Sections 8-9 | Various exercises from Sections 8-9 |
Preparation for exam | Fri Oct 16 10-12 | JvN | Various exercises | |
Tutorial 6 | Tue Oct 20 10-12 | AT | Preparation for exam | |
JvN = room John von Neumann
AT = room Alan Turing
Computer labs
There are 2 computer labs in the course. A bonus system of points is used for the lab assignments, where 1 point is added to the written exam for each passed lab assignment that was submitted on time. I suggest you use the open source programming language R to solve the lab problems, but you can use any program you like (e.g. SAS). In the time schedule above you will soon find links to the lab assignment instructions.
Lab assignment 1
 Last submission: September 29
 Corrected back: October 6
Lab assignment 2
 Last submission: October 9
 Corrected back: October 16
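As a rough illustration of the kind of exercise lab 1 covers (simulating from a common distribution and checking the central limit theorem), here is a minimal sketch. It is not the lab assignment itself. It is written in Python using only the standard library, although the course suggests R; the function names, the Exponential example, the sample sizes and the seed are all assumptions made for illustration.

```python
import math
import random
import statistics


def sample_exponential(rate, rng):
    """Inverse-transform sampling: if U ~ Uniform(0, 1),
    then -ln(1 - U) / rate follows an Exponential(rate) distribution."""
    u = rng.random()  # u lies in [0, 1), so 1 - u is never zero
    return -math.log(1.0 - u) / rate


def simulate_sample_means(rate, n, reps, seed=1):
    """Draw `reps` samples of size `n` and return the sample mean of each."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(sample_exponential(rate, rng) for _ in range(n))
        for _ in range(reps)
    ]


# Hypothetical settings: Exponential(1) data, samples of size 30.
means = simulate_sample_means(rate=1.0, n=30, reps=5000)

# By the central limit theorem the sample means are approximately normal
# with mean 1 / rate and standard deviation (1 / rate) / sqrt(n).
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
print("theoretical sd:      ", 1.0 / math.sqrt(30))
```

A histogram of `means` should look roughly bell-shaped even though the underlying Exponential distribution is strongly skewed.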
Exams
 Written exam. Here are the dates and registration info.
 Past exam, January 10, 2015 Exam
 Past exam, October 23, 2014 Exam  Solutions
 Past exam, March 22, 2014 Exam  Solutions
 Past exam, January 11, 2014 Exam  Solutions
 Past exam, October 24, 2013 Exam  Solutions
 Past exam, March 27, 2013 Exam  Solutions
 Past exam, December 19, 2012 Exam  Solutions
 Past exam, October 31, 2012 Exam  Solutions
Extra material
 Interactive spreadsheets for 22 common statistical distributions.
 Collection of Wikipedia articles on Statistical Distributions.
 Informative clickable chart with relations between distributions: http://www.johndcook.com/distribution_chart.html.
 Spreadsheet file for learning about the prior-to-posterior updating of the parameters in the Bernoulli model. Google Docs  Excel file
 Spreadsheet file for learning about the prior-to-posterior updating of the parameters in the Normal model. Google Docs  Excel file
 Spreadsheet file for learning about the prior-to-posterior updating of the parameters in the Poisson model. Google Docs
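The prior-to-posterior updating that the spreadsheets above explore can also be sketched in a few lines of code. The example below assumes the standard conjugate priors (a Beta prior for the Bernoulli model and a Gamma prior, in shape and rate form, for the Poisson model); whether the spreadsheets use exactly these parameterizations is an assumption. The function names and the data are hypothetical, and Python is used here although the course suggests R.

```python
def update_beta_bernoulli(a, b, data):
    """Beta(a, b) prior with Bernoulli observations (0 or 1).
    Posterior is Beta(a + successes, b + failures)."""
    successes = sum(data)
    failures = len(data) - successes
    return a + successes, b + failures


def update_gamma_poisson(alpha, beta, counts):
    """Gamma(alpha, rate=beta) prior with Poisson counts.
    Posterior is Gamma(alpha + sum of counts, beta + number of observations)."""
    return alpha + sum(counts), beta + len(counts)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def gamma_mean(alpha, beta):
    """Mean of a Gamma distribution with shape alpha and rate beta."""
    return alpha / beta


# Hypothetical data: 6 successes in 8 Bernoulli trials, uniform Beta(1, 1) prior.
a_post, b_post = update_beta_bernoulli(1, 1, [1, 0, 1, 1, 0, 1, 1, 1])
print("Bernoulli posterior: Beta(%d, %d), mean %.3f"
      % (a_post, b_post, beta_mean(a_post, b_post)))

# Hypothetical data: four Poisson counts, Gamma(2, 1) prior.
alpha_post, beta_post = update_gamma_poisson(2, 1, [3, 1, 4, 2])
print("Poisson posterior: Gamma(%d, %d), mean %.3f"
      % (alpha_post, beta_post, gamma_mean(alpha_post, beta_post)))
```

In both models the posterior stays in the same family as the prior, so updating amounts to adding sufficient statistics of the data to the prior parameters.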
Code
 R is a great free, open-source, easy-to-use programming language for statistical computations. There are thousands of packages with statistical routines for almost any imaginable field of statistics. Do this:
1. Download and install R from http://ftp.sunet.se/pub/lang/CRAN/
2. Download and install RStudio from www.rstudio.org. RStudio is a complete environment for R.
3. Read the intro to R: http://cran.r-project.org/doc/manuals/R-intro.pdf
4. Start writing code!
 An introduction to R commands and specialized R commands for computer labs 1 and 2 can be found here: Intro to R, lab 1 and lab 2.
 A link on generating random numbers in SAS can be found here: Generate random numbers in SAS.
 OptimizeSpam.R. Finding the posterior mode and approximate covariance matrix by numerical optimization methods. This code fits a logistic or probit regression model to the spam data from the book Elements of Statistical Learning. It's a good example since the optimization for the logistic model is very stable, but the probit is more problematic.
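To give a feel for what finding a posterior mode and an approximate covariance matrix by numerical optimization involves, here is a minimal sketch. It is not the OptimizeSpam.R code and does not use the spam data. It assumes a logistic regression with an intercept and one slope, independent N(0, tau^2) priors on both coefficients, Newton's method as the optimizer, and a small made-up data set. All names and values are hypothetical, and Python is used here although the course suggests R.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_and_neg_hessian(beta, x, y, tau):
    """Gradient and negative Hessian of the log posterior at beta = (b0, b1)."""
    b0, b1 = beta
    g0, g1 = -b0 / tau**2, -b1 / tau**2          # prior part of the gradient
    h00, h01, h11 = 1 / tau**2, 0.0, 1 / tau**2  # prior part of the -Hessian
    for xi, yi in zip(x, y):
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)


def invert_2x2(h00, h01, h11):
    """Inverse of the symmetric matrix [[h00, h01], [h01, h11]]."""
    det = h00 * h11 - h01 * h01
    return [[h11 / det, -h01 / det], [-h01 / det, h00 / det]]


def posterior_mode_logistic(x, y, tau=10.0, max_iter=50, tol=1e-10):
    """Newton's method for the posterior mode. Returns the mode and the
    inverse negative Hessian at the mode (approximate posterior covariance)."""
    beta = (0.0, 0.0)
    for _ in range(max_iter):
        (g0, g1), h = grad_and_neg_hessian(beta, x, y, tau)
        inv = invert_2x2(*h)
        step = (inv[0][0] * g0 + inv[0][1] * g1,
                inv[1][0] * g0 + inv[1][1] * g1)
        beta = (beta[0] + step[0], beta[1] + step[1])
        if abs(step[0]) + abs(step[1]) < tol:
            break
    _, h = grad_and_neg_hessian(beta, x, y, tau)
    return beta, invert_2x2(*h)


# Made-up toy data (not separable, so the mode is well defined).
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

mode, cov = posterior_mode_logistic(x, y)
std_devs = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
print("posterior mode:", mode)
print("approximate posterior standard deviations:", std_devs)
```

The inverse of the negative Hessian at the mode is what gives the normal approximation its covariance matrix; a general-purpose optimizer that returns a numerical Hessian could play the same role as the hand-written Newton loop here.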