The TA will have some material for the tutorial.  The handouts used for the tutorial are here.

Old term tests and exams are posted on the course OWL page (this is different from the page I used last year).

 

Handouts used in the lectures.

 

Jan 8,  2018 :  course outline and ch8-2intro.pdf

This uses data from the text, which comes with a CD.  The data sets are also available from http://www.stat.berkeley.edu/~rice/Book3ed/ where you click on the Data Sets link.  They come in various formats.  Often csv format is most useful, as many statistical packages, including R, can read it.  ASCII is also a useful version in the same sense.

BEESWAX.DAT

coal.txt

illinois60.txt  This is a data set that we use for some gamma fit examples later in the course.

Jan  10 :  Ch8-StatModels.pdf   This material is not from the text, but some background material to Chapters 8 and 9.

Jan 12 :  Continue with example from Jan 10 handout. 

brief introduction to empirical distribution function (edf) with an R example.   edf1.R

This R script is an Rgui file, but you can copy these commands to an RStudio script if you prefer to use RStudio.
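A minimal sketch of the empirical distribution function in base R. This is not the contents of edf1.R; the simulated exponential sample and the names x, Fn and t0 are assumptions made for illustration.

```r
# Simulated sample; the exponential model and n = 50 are illustrative assumptions
set.seed(1)
x <- rexp(50, rate = 1)

# ecdf() returns the EDF as a step function: Fn(t) = (number of x_i <= t) / n
Fn <- ecdf(x)

# The same value computed directly from the definition
t0 <- 1
direct <- mean(x <= t0)
c(ecdf = Fn(t0), direct = direct)

# Compare the EDF with the cdf used to generate the data
plot(Fn, main = "EDF of a simulated exponential sample")
curve(pexp(x, rate = 1), add = TRUE, lty = 2)
```

With real data the dashed curve is not available, which is why the EDF is used as an estimate of the unknown cdf.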

January 15, 2018 :

The student can work through the first script at home; the second we will look at in class. At this point only part of these R scripts can be used. It also includes some methods for finding the approximate variance, and its estimator, of

sqrt(n)*( theta.hat_n – theta)

which is based on the so-called delta method, to be discussed later in the course.

compare-estimators.r

TwoEstimatorsPoisson.r
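A rough sketch of how two estimators can be compared by simulation. This is not the posted script. It assumes a Poisson model, for which both the sample mean and the sample variance estimate lambda; the values of lambda, n and M are arbitrary choices.

```r
set.seed(2018)
lambda <- 3      # true parameter (assumed for the simulation)
n <- 50          # sample size
M <- 5000        # number of simulated samples

# replicate() returns a 2 x M matrix: one row per estimator
est <- replicate(M, {
  x <- rpois(n, lambda)
  c(mean = mean(x), var = var(x))
})

# Monte Carlo approximation to the variance of sqrt(n) * (theta.hat_n - theta)
scaled.var <- n * apply(est, 1, var)
scaled.var
```

For the sample mean the exact value of this scaled variance is lambda, so the first entry can be checked against the value used in the simulation.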

Jan 17, 2018 : continue with

ch8-3-estimator-prop.pdf    Some handout files are named to indicate they are based on or give examples for sections in the Rice text.  Here the handout refers to Chapter 8 section 3.

ch8-3-Consistency-ContinuousFunctions.pdf

 

Jan 22, 2018

The next two files give a review of two main methods of constructing estimators, the method of moments and maximum likelihood. In this course we also find how to obtain their approximate distributions.

Ch8-4MethodofMoments.pdf This is a review of a topic in the second year course Stat 2858

Ch8-5-MLE-I.pdf Most of this handout is also a review of material from Stat 2858

Jan 24, 2018

Continue with MLE-I handout from Jan 22

Ch8-5-MLE-II

FisherInfoComments.pdf relation between Fisher’s information for Binomial and iid Bernoulli experiments

gamma fit with method of moments R script

Some data sets for which a gamma model is appropriate

illinois60 , illinois61 , illinois62 , illinois63 , illinois64

The coal data set listed earlier on this page is also modelled by a gamma distribution.
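A minimal sketch of a gamma fit by the method of moments. It is not the posted R script, and it uses simulated data as a stand-in for a data set such as illinois60; the shape and rate used to simulate are assumptions.

```r
# Simulated stand-in data; the shape and rate below are assumed
set.seed(60)
x <- rgamma(200, shape = 2, rate = 0.5)

# Method of moments: mean = alpha / lambda, variance = alpha / lambda^2
xbar <- mean(x)
s2 <- mean((x - xbar)^2)      # second central sample moment (divisor n)
lambda.mom <- xbar / s2
alpha.mom <- xbar^2 / s2
c(alpha = alpha.mom, lambda = lambda.mom)

# Overlay the fitted density on a histogram
hist(x, freq = FALSE, breaks = 30, main = "Gamma fit by method of moments")
curve(dgamma(x, shape = alpha.mom, rate = lambda.mom), add = TRUE)
```

To use a real data set, replace the simulated x by the column read from the file.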

 January 31, 2018

Normal CI normalCI.r This file considers CI for the parameters for a normal model. It also introduces the idea of confidence interval coverage rates.
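The idea of a coverage rate can be sketched by simulation as follows. This is an illustration, not the contents of normalCI.r; the values of mu, sigma, n and M are assumptions.

```r
set.seed(131)
mu <- 10; sigma <- 2     # true parameters, assumed for the simulation
n <- 15                  # sample size
M <- 10000               # number of simulated samples
level <- 0.95

# For each simulated sample, check whether the t interval for mu contains mu
covered <- replicate(M, {
  x <- rnorm(n, mu, sigma)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  (mean(x) - half <= mu) && (mu <= mean(x) + half)
})

# Proportion of intervals containing mu; should be close to the nominal level
coverage <- mean(covered)
coverage
```

Replacing qt by qnorm in the sketch shows how the coverage rate drops below the nominal level for small n.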



Reminder: there is a test next week on February 7 during the tutorial time. It will be held in WSC 240 (tutorial room) and WSC 248 (since there are too many students for exam writing in WSC 240). About 20 students will write in WSC 248.

Material covered : The lecture material in the handouts up to today’s class, but not including the delta method. Material in the text Sections 8.2 , 8.3 , 8.4 , 8.5. There will be no questions on R coding or implementation, even though this is important for statistical applications.

The exam will include a formula sheet, which is given here formulae-Stat3858.pdf

It will also include the statistical tables, as needed, from the Appendix in Rice’s text.

Normal table



February 2, 2018 : In the previous couple of lectures we obtained a method to approximate the distribution of the normalized MLE in the regular case.



Generally for the method of moments estimators there is also a normal approximation for normalized estimators.



DeltaMethod.pdf This allows us to find a normal approximation for the distribution of method of moments estimators in many cases – look at this after Fisher’s information

 gamma method of moments R script examples
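A small sketch of the delta method checked by simulation. To keep it short it uses a Poisson model and the function g(lambda) = exp(-lambda) rather than the gamma example in the posted script; the values of lambda, n and M are assumptions.

```r
set.seed(22)
lambda <- 2; n <- 200; M <- 5000   # assumed values for the simulation

g <- function(t) exp(-t)           # g(lambda) = P(X = 0)
g.prime <- function(t) -exp(-t)

# Delta method: sqrt(n) * (g(xbar) - g(lambda)) is approximately
# N(0, g'(lambda)^2 * lambda), since Var(X) = lambda for the Poisson
delta.var <- g.prime(lambda)^2 * lambda

# Simulated values of the normalized estimator
z <- replicate(M, sqrt(n) * (g(mean(rpois(n, lambda))) - g(lambda)))

c(delta = delta.var, simulated = var(z))
```

The two numbers should be close; the agreement improves as n grows.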

 Some files for next week



 Sect 8.7 and 8.8 Cramer Rao  These sections begin to explain why the MLE is the best estimator, at least in the regular case.



February 9, 2018 : Continue with Sec 8.7-8 handout

Feb 12, 2018 :

There is one other method, based on simulation, to approximate the (sampling) distribution of some “test statistic”, i.e. the type needed to construct confidence intervals. These methods are bootstrap methods. For now we consider the parametric bootstrap, and later in the course the nonparametric bootstrap.

Bootstrap.pdf and review the later part of Ch8-5-MLE-II.pdf

Here is an R script example to describe this method, so we can see how it is related to simulation methods described earlier. This material is briefly described in Rice Section 8.5, pages 284-85.

parametricboot.r

coal.txt
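The steps of the parametric bootstrap can be sketched as follows. This is not parametricboot.r. To keep the MLE in closed form it assumes an exponential model and simulated stand-in data, not the gamma model and the coal data.

```r
set.seed(85)
x <- rexp(40, rate = 0.5)        # stand-in data; the exponential model is an assumption
n <- length(x)
rate.hat <- 1 / mean(x)          # MLE of the exponential rate

B <- 2000                        # number of bootstrap samples
# Step 1: simulate a sample of size n from the fitted model
# Step 2: recompute the estimator on each simulated sample
rate.star <- replicate(B, 1 / mean(rexp(n, rate = rate.hat)))

# Bootstrap approximation to the distribution of (rate.hat - rate)
delta.star <- rate.star - rate.hat
q <- quantile(delta.star, c(0.025, 0.975))

# Approximate 95% confidence interval for the rate
ci <- c(lower = rate.hat - q[[2]], upper = rate.hat - q[[1]])
ci
```

The only difference from the earlier simulation studies is that the unknown parameter is replaced by its estimate when simulating.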





February 12 and 14, 2018

Chapter 9.1 hypothesis testing

Neyman Pearson lemma

 Feb 26, 2018 :

Term test 1

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   15.0    29.5    37.5    35.7    42.8    50.0





February 16, 2018 and continue Feb 28, 2018

material related to chapter 9 Rice.

ExponentialLikelihoodRatio.pdf



MultinomialLikelihoodRatio.pdf



GLRHardyWeinberg.r This data is also discussed in the text Sec 8.3, 9.3
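A sketch of the GLR calculation for a Hardy-Weinberg model with three cells. It is not the posted script, and the counts below are illustrative placeholders, not values read from the course files.

```r
# Illustrative genotype counts (hypothetical placeholders)
obs <- c(AA = 342, Aa = 500, aa = 187)
n <- sum(obs)

# MLE of theta under Hardy-Weinberg, where the cell probabilities are
# (1 - theta)^2, 2 * theta * (1 - theta), theta^2
theta.hat <- (obs[["Aa"]] + 2 * obs[["aa"]]) / (2 * n)
p0 <- c((1 - theta.hat)^2, 2 * theta.hat * (1 - theta.hat), theta.hat^2)
expected <- n * p0

# GLR statistic -2 log(Lambda) and Pearson's chi-square statistic
G2 <- 2 * sum(obs * log(obs / expected))
X2 <- sum((obs - expected)^2 / expected)

# df = (3 - 1) free cell probabilities minus 1 estimated parameter = 1
p.value <- pchisq(G2, df = 1, lower.tail = FALSE)
c(theta.hat = theta.hat, G2 = G2, X2 = X2, p.value = p.value)
```

The two statistics are usually close when the expected counts are large.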



March 2, 2018

Changing the date of a midterm test from the course outline date requires unanimous agreement. Several students in the past couple of days said they do not agree with moving the test date from March 14 to March 21, so the date for term test 2 has to stay on March 14, 2018 as given in the original posted course outline.

Material covered for term test 2 :

The material will be the topics covered since the first term test, but may include some of the earlier material. For example, in the calculation of GLR statistics one needs to calculate the MLE under various restrictions or constraints. Some of the theorems used also have conditions or assumptions, such as the two regularity assumptions. For this test we will NOT include R coding, so the parametric bootstrap will not be on the test.

Material for text sections : 8.7, 8.8 , 9.1, 9.2, (9.3 will be discussed later so it is not on this term test), 9.4 (which also includes 13.3, 13.4).





As a direct application of multinomial models we skip for now to Chapter 13, sections 3 and 4.

ContingencyTables.pdf This has been reposted (March 11, 2018) with a few typos corrected



cold-eg.r a simple 2 by 2 contingency table example
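A sketch of the GLR and Pearson statistics for a 2 by 2 table. It is not the contents of cold-eg.r; the counts and the row and column labels are hypothetical.

```r
# Hypothetical 2 by 2 table of counts (rows: group, columns: outcome)
tab <- matrix(c(20, 80,
                35, 65), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("treated", "placebo"),
                              cold = c("yes", "no")))

# Expected counts under independence: (row total * column total) / n
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# GLR statistic and Pearson statistic, each approximately chi-square with 1 df
G2 <- 2 * sum(tab * log(tab / expected))
X2 <- sum((tab - expected)^2 / expected)

# chisq.test() without the continuity correction reproduces the Pearson statistic
X2.builtin <- chisq.test(tab, correct = FALSE)$statistic
c(G2 = G2, X2 = X2, p.G2 = pchisq(G2, df = 1, lower.tail = FALSE))
```

The degrees of freedom are (2 - 1)(2 - 1) = 1, the difference in the number of free parameters between the full multinomial model and the independence model.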

The Jane Austen data set and example is from Rice.

Sec13-3-JAustenData.txt

 

 JaneAusten-eg.r

 

 March 5, 2018 : Continue with contingency table application of GLR

Section 9.3 relation of hypothesis testing to confidence intervals



March 9, 2018

Reminder: the date of term test 2 and the material covered are given above under March 2, 2018. The test stays on March 14, 2018.

Continue with GLR two sample problems. This material is not on the term test, but gives another example of working with the GLR and manipulating the rejection region to put it in a form that is simpler to interpret.

Chapter 11.2 two independent normal samples

twosamp-normal.pdf



After this section we will study other more general tests of some model assumptions.

Ch9-Goodness-of-fit.pdf



Momapprox.pdf : this is a handout I have used in Stat 3657. It reviews moment approximations, and in particular the last few pages discuss the construction of QQ plots. Most students have used qqnorm in R or in some equivalent program or package.

qqplotsintro.r An R script with examples of constructing and implementing QQ plots
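A minimal sketch of how a normal QQ plot is constructed by hand, compared with qqnorm. It is not the contents of qqplotsintro.r; the simulated exponential data are an assumption chosen so that the plot shows a clear departure from normality.

```r
set.seed(9)
x <- rexp(100)                    # simulated data; deliberately not normal

n <- length(x)
p <- ppoints(n)                   # plotting positions, (i - 0.5) / n for n > 10
theo <- qnorm(p)                  # theoretical standard normal quantiles
samp <- sort(x)                   # sample quantiles (order statistics)

# Manual normal QQ plot; curvature suggests the normal model fits poorly
plot(theo, samp, xlab = "Normal quantiles", ylab = "Ordered data")

# qqnorm() computes the same points (returned in the original data order)
qq <- qqnorm(x, plot.it = FALSE)
```

Replacing qnorm by another quantile function, for example qexp, gives a QQ plot against a different model.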



March 16, 2018

Use qq plots to examine the students’ desk measurement data from an earlier class.

Desk2018.csv and DeskFeb2017.r (R script to help recognize an outlier, one part of data preprocessing)



Nonparametric bootstrap

boot-1.r simple code to implement nonparametric bootstrap, not using a package

boot-1b.r The data used in boot-1.r is generated from a particular model, but the model code is not needed or used in the bootstrap.  This allows us to compare the correct (but usually unknown) model distribution with the bootstrap distribution of a particular r.v. needed for a confidence interval calculation.

boot-1A.r another bootstrap example
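A short sketch of the nonparametric bootstrap without a package. It is not the code in boot-1.r; the simulated gamma data, the choice of the median as the statistic, and B are assumptions.

```r
set.seed(316)
x <- rgamma(30, shape = 2, rate = 1)   # stand-in data; in practice the model is unknown
n <- length(x)
B <- 2000

# Resample indices with replacement, i.e. simulate from the EDF of x
med.star <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  median(x[idx])
})

# Bootstrap standard error and a basic (pivot-type) 95% interval for the median
se.boot <- sd(med.star)
q <- quantile(med.star - median(x), c(0.025, 0.975))
ci <- c(lower = median(x) - q[[2]], upper = median(x) - q[[1]])
c(se = se.boot, ci)
```

The model used to generate x plays no role after the first line, which is the point of the method.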

 After this we look at some packages in R that implement the nonparametric bootstrap. Two commonly used R packages for the bootstrap are boot (by A Canty, McMaster University) and bootstrap (by Efron and Tibshirani, Stanford University). Tibshirani is a Canadian from Toronto. Each package allows the user to supply their own test statistic or statistic to use in the bootstrap. However each requires writing the function in a particular fashion for the package.

boot-2.r example using boot (this package explicitly uses the random indices to simulate from the EDF)

boot-ET.r an example using bootstrap. In this package the random indices are hidden from the user.

 boot-2samp.r a specific example using 2 samples
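A sketch of the form in which the boot package expects the statistic to be written. It is not the contents of boot-2.r; the stand-in data and the name mean.stat are assumptions.

```r
library(boot)   # recommended package, shipped with most R installations

set.seed(253)
x <- rgamma(30, shape = 2, rate = 1)   # stand-in data (assumed)

# boot() calls statistic(data, indices): it passes the resampled indices,
# and the function must apply them to the data itself
mean.stat <- function(data, indices) mean(data[indices])

b <- boot(data = x, statistic = mean.stat, R = 1000)
b$t0                       # statistic on the original data
sd(b$t[, 1])               # bootstrap standard error
boot.ci(b, type = "perc")  # percentile interval
```

The second argument of mean.stat is where the random indices appear explicitly to the user.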



 

March 26, 2018 : We now return to the material from Section 8.6 Bayes methods

BayesMethods.pdf

 

March 28, 2018

Term test 2 results :

summary(GR) (exclude one very low grade)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   9.00   25.00   30.00   29.67   34.00   48.00

Histogram March2018hist.pdf

The questions were given approximately equal weights, of 17, 17 and 16 marks.

 To compensate for the test being a bit long, the test will be graded out of 45. For students with grades higher than 45, their grades will be truncated, if needed, in the calculation of their final course grade.



 April 4, 2018

Some Bayes examples coded in R. There are some R packages that can be used, however these are coded directly.

Bayes-2.R

Bayes-3.R

Coin tipping example with a non conjugate prior

cointipping.csv data from some of Dr Murdoch’s classes over various years.

cointipping1.r
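A sketch of a posterior calculation with a non conjugate prior, using a grid approximation. It is not the contents of cointipping1.r; the counts y and n are hypothetical (not taken from cointipping.csv) and the triangular prior is an assumption made for illustration.

```r
# Hypothetical data: y successes in n trials
y <- 14; n <- 20

# Grid of theta values on (0, 1) with spacing h
theta <- seq(0.0005, 0.9995, by = 0.001)
h <- theta[2] - theta[1]

# A non conjugate prior (assumed for illustration): triangular, peaked at 0.5
prior <- 1 - abs(2 * theta - 1)
like <- dbinom(y, n, theta)

# Normalize prior * likelihood numerically so the posterior integrates to 1
post <- prior * like
post <- post / sum(post * h)

post.mean <- sum(theta * post * h)
post.mean
plot(theta, post, type = "l", ylab = "posterior density")
```

With a conjugate beta prior the normalizing constant is available in closed form; here it is replaced by a numerical sum over the grid.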



Note there is a correction to assignment 5 posted at 2:30 PM April 6, 2018.



April 9, 2018

Reminder : Exam April 16, 2018 from 7 – 10 PM.

The exam is cumulative, so all material covered will be exam topics. However, since students do not have access to computing, and in particular R, questions involving R will not be on the exam, beyond questions that ask for an algorithm or steps, such as for the bootstrap. The coverage includes all handouts. These correspond to the text material, plus some additions. The text sections are

Chapter 8 : 8.1 and 8.2 are just introductory comments.

8.3, 8.4, 8.5, 8.6 (except 8.6.3), 8.7, 8.8

Chapter 9 : 9.1, 9.2 (skip 9.2.3), 9.3, 9.4, 9.8 QQ plots, 9.9 will not be on the test

Chapter 11 : 11.2.1 (11.2.4 is also a Bayesian approach so it is just an example of section 8.6)

Chapter 13 : 13.3 , 13.4 (these are just applications of GLR section 9.4)

Topics such as the nonparametric bootstrap and Bayes methods have not been tested yet, so they have a higher chance of being on an exam. Last year’s final is posted on OWL.