SS459/559b Winter 2000
Assignment 3
Note: In addition to the usual hardcopy of your assignment, you must email your Splus code (script files) to hyu@stats.uwo.ca.
1. Implement qqplot and ppplot functions for one of the following distributions:
Gamma or Weibull, by following the steps below. The principle of the qqplot was
discussed in class (a scatter plot of Qn(F(x)) against x=Q(p)). Include some
error checking on the data x in your code.
a) Write a qqplot function with three arguments:
1) x: a vector (data)
2) shape: the shape parameter
3) rate (scale): the rate parameter for Gamma (the scale parameter for Weibull)
The qqplot function should then produce a scatter plot of the empirical quantile
function against the theoretical quantile function. Compare your qqplot with
the trellis function qqmath using an example.
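The computation behind a) can be sketched as follows. This is only an illustration in Python rather than Splus, for the Weibull case, and it returns the plotting coordinates instead of drawing them. The names `weibull_quantile` and `qq_points` and the plotting positions (i - 0.5)/n are assumptions made for the sketch, not something prescribed in class.

```python
import math


def weibull_quantile(p, shape, scale):
    """Theoretical Weibull quantile Q(p) = scale * (-log(1 - p))**(1/shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def qq_points(x, shape, scale):
    """Return (theoretical quantile, empirical quantile) pairs for a QQ plot."""
    # Basic error checking on the data and the parameters.
    if len(x) == 0:
        raise ValueError("x must contain at least one observation")
    if any((not math.isfinite(v)) or v <= 0 for v in x):
        raise ValueError("Weibull data must be finite and strictly positive")
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")

    n = len(x)
    empirical = sorted(x)  # order statistics = empirical quantiles
    # Assumed plotting positions p_i = (i - 0.5) / n, one common convention.
    theoretical = [
        weibull_quantile((i - 0.5) / n, shape, scale) for i in range(1, n + 1)
    ]
    return list(zip(theoretical, empirical))
```

If the data really come from the stated Weibull distribution, the returned points should lie close to the line y = x.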
b) Write a ppplot function with three arguments:
1) x: a vector (data)
2) shape: the shape parameter
3) rate (scale): the rate parameter for Gamma (the scale parameter for Weibull)
The ppplot function should then produce a step function plot of Fn(Q(p))
against p, where Fn is the empirical distribution function and Q is the
theoretical quantile function. Since all points lie in the box from (0,0) to
(1,1), the plot should include a line from (0,0) to (1,1) for comparison with
the empirical step function. Show that your code works using an example.
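A minimal sketch of the quantity plotted in b), again in Python and for the Weibull case only. It evaluates Fn(Q(p)) on an evenly spaced grid of p values and leaves the drawing (step function plus the reference line from (0,0) to (1,1)) to the plotting system. The function names and the `n_grid` argument are hypothetical.

```python
import bisect
import math


def weibull_quantile(p, shape, scale):
    """Theoretical Weibull quantile Q(p) = scale * (-log(1 - p))**(1/shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def pp_points(x, shape, scale, n_grid=200):
    """Return (p, Fn(Q(p))) pairs on the grid p = 1/n_grid, ..., (n_grid-1)/n_grid."""
    if len(x) == 0 or any(v <= 0 for v in x):
        raise ValueError("x must be non-empty and strictly positive")
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")

    xs = sorted(x)
    n = len(xs)
    points = []
    for j in range(1, n_grid):
        p = j / n_grid
        q = weibull_quantile(p, shape, scale)
        # Empirical distribution function: Fn(q) = #{x_i <= q} / n.
        fn = bisect.bisect_right(xs, q) / n
        points.append((p, fn))
    return points
```

The second coordinate is non-decreasing in p and takes only the values 0, 1/n, ..., 1, which is what gives the plot its step shape.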
c) Write a function to estimate the two parameters by the method of moments.
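One possible approach to c) for the Weibull case is sketched below in Python. The squared coefficient of variation of a Weibull distribution depends only on the shape, so the shape can be found by matching it to the sample value, and the scale then follows from the mean. The bisection bounds `lo` and `hi` are arbitrary assumptions for this sketch. For the Gamma case no root finding is needed, since shape = mean^2/variance and rate = mean/variance.

```python
import math
import statistics


def weibull_cv2(shape):
    """Squared coefficient of variation of a Weibull with the given shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return g2 / (g1 * g1) - 1.0


def weibull_mom(x, lo=0.05, hi=50.0, tol=1e-10):
    """Method-of-moments estimates (shape, scale) for Weibull data."""
    if len(x) < 2 or any(v <= 0 for v in x):
        raise ValueError("need at least two strictly positive observations")
    m = statistics.mean(x)
    v = statistics.variance(x)  # sample variance with divisor n - 1
    target = v / (m * m)
    if not (weibull_cv2(hi) < target < weibull_cv2(lo)):
        raise ValueError("sample CV is outside the range covered by [lo, hi]")

    # weibull_cv2 decreases as the shape grows, so bisect on the shape.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv2(mid) > target:
            lo = mid  # CV still too large: the shape must be bigger
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    # Mean of a Weibull is scale * Gamma(1 + 1/shape).
    scale = m / math.gamma(1.0 + 1.0 / shape)
    return shape, scale
```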
d) Using the results of c) as initial values, write a function to compute the
maximum likelihood estimates (MLE) of the two parameters. The function should
have only one argument, x, and return the two estimates.
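For the Weibull case, the likelihood equations reduce to a single equation in the shape, which can be solved by Newton's method; the scale then has a closed form. The Python sketch below illustrates this under those assumptions. Unlike the function asked for in d), it takes the starting value as a second, hypothetical argument `shape0` (for example a method-of-moments estimate) so that the block stays self-contained.

```python
import math
import random


def weibull_mle(x, shape0, tol=1e-9, max_iter=100):
    """Maximum likelihood estimates (shape, scale) for Weibull data.

    Solves g(k) = sum(x^k log x) / sum(x^k) - 1/k - mean(log x) = 0 for the
    shape k by Newton's method, starting from shape0.
    """
    if len(x) < 2 or any(v <= 0 for v in x):
        raise ValueError("need at least two strictly positive observations")
    if shape0 <= 0:
        raise ValueError("shape0 must be positive")

    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(x)
    k = shape0
    for _ in range(max_iter):
        w = [v ** k for v in x]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        g = s1 / s0 - 1.0 / k - mean_log
        dg = s2 / s0 - (s1 / s0) ** 2 + 1.0 / (k * k)  # derivative, always > 0
        k_new = k - g / dg
        if k_new <= 0:
            k_new = k / 2.0  # safeguard: keep the shape positive
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    # Given the shape, the MLE of the scale is (mean of x^k)^(1/k).
    scale = (sum(v ** k for v in x) / len(x)) ** (1.0 / k)
    return k, scale
```

A quick check is to simulate data with known parameters, for instance with `random.weibullvariate(scale, shape)`, and confirm that the estimates land near the true values.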
e) Rewrite a) and b) with one argument, x, only. The parameter values are taken
from d).
f) Run some simulations. For example, simulate two sets of data, one from Gamma
and one from Weibull (n=100), then apply your qqplot and ppplot. Comment on the
similarities and differences between the two plots.
2. Write a function to run a simulation that computes the relative efficiency
(RE) of the median relative to the mean under a contaminated normal distribution.
a) Use (or create) a function to simulate from the contaminated normal
distribution discussed in class.
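A minimal Python sketch of such a simulator is given below. It assumes the usual scale-contamination model, in which each observation comes from N(mu, sig^2) with probability 1 - eps and from N(mu, (k*sig)^2) with probability eps; if the definition from class differs, the sketch needs adjusting. The function name `rcontam_normal` is hypothetical.

```python
import random


def rcontam_normal(n, mu=0.0, sig=1.0, k=3.0, eps=0.05, rng=random):
    """Draw n values from (1 - eps) * N(mu, sig^2) + eps * N(mu, (k*sig)^2)."""
    if n < 0 or sig <= 0 or k <= 0 or not (0.0 <= eps <= 1.0):
        raise ValueError("need n >= 0, sig > 0, k > 0 and 0 <= eps <= 1")
    sample = []
    for _ in range(n):
        # With probability eps the observation is contaminated (inflated sd).
        sd = k * sig if rng.random() < eps else sig
        sample.append(rng.gauss(mu, sd))
    return sample
```

With eps = 0 this reduces to an ordinary normal sample, which gives a simple sanity check.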
b) Then create a function that loops a) m times to simulate m sets of data.
Use these datasets to compute the mean and the median of each set, and then
estimate the variances of the median and of the mean. The function can then
return the estimated RE of the median relative to the mean. (If you like, you
can try combining a) and b) to avoid looping.)
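The loop in b) might look like the following Python sketch. It assumes the convention RE = Var(mean) / Var(median), so that values below 1 favour the mean, and it assumes the same scale-contamination model as above; both the convention and the function names are assumptions of this sketch.

```python
import random
import statistics


def rcontam_normal(n, mu, sig, k, eps):
    """Assumed model: (1 - eps) * N(mu, sig^2) + eps * N(mu, (k*sig)^2)."""
    return [
        random.gauss(mu, k * sig if random.random() < eps else sig)
        for _ in range(n)
    ]


def re_median_mean(n, m, mu=0.0, sig=1.0, k=3.0, eps=0.0):
    """Estimate the RE of the median relative to the mean from m samples of size n."""
    means = []
    medians = []
    for _ in range(m):
        sample = rcontam_normal(n, mu, sig, k, eps)
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    # Assumed convention: RE = Var(mean) / Var(median).
    return statistics.variance(means) / statistics.variance(medians)
```

For uncontaminated normal data the large-sample value of this ratio is 2/pi, about 0.64, which is a useful reference point when checking the output for eps = 0.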
c) Write a function that repeats b) l times and returns the mean and standard
deviation of the RE values, together with a histogram of the RE values.
d) Run a set of simulations with n=100, m=100, l=10, mu=0, sig=1, k=3 for
eps=0%, 1%, 5%, 10%. Report your findings.