## 3.4.3 The Error in Estimating the Variance

Measurable Outcome 3.8, Measurable Outcome 3.11, Measurable Outcome 3.12

The variance of \(y\) is denoted \(\sigma_y^2\) and is defined as

\[ \sigma_y^2 \equiv E\left[(y - \mu_y)^2\right], \]

where \(\mu_y = E[y]\) is the mean of \(y\).

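We saw in the previous unit that an unbiased estimator of \(\sigma_y^2\) is the sample variance \(s_y^2\), that is:

\[ E\left[s_y^2\right] = \sigma_y^2, \qquad s_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(y_i - \overline{y}\right)^2. \]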

Note, you should try proving this result.
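Before attempting the proof, it can help to check the claim numerically. The following is a minimal sketch, assuming normally distributed samples and arbitrary illustrative values for the sample size, seed, and true variance (none of which come from the notes). It averages the \(N-1\)-normalized and \(N\)-normalized variance estimates over many repeated experiments.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily for repeatability
N = 5                            # samples per experiment (assumed value)
trials = 200_000                 # number of repeated experiments (assumed value)
sigma2 = 4.0                     # true variance of y (assumed value)

# Each row is one experiment consisting of N independent samples of y.
y = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(trials, N))

s2_unbiased = y.var(axis=1, ddof=1)  # divides by N - 1
s2_biased = y.var(axis=1, ddof=0)    # divides by N

mean_unbiased = s2_unbiased.mean()
mean_biased = s2_biased.mean()

print("true variance:             ", sigma2)
print("average of (N-1) estimator:", mean_unbiased)
print("average of N estimator:    ", mean_biased)
```

With these settings, the average of the \(N-1\) estimator should sit close to the true variance, while the \(N\) estimator should come out low by roughly the factor \((N-1)/N\).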

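To quantify the uncertainty in this estimator, we would like to determine its standard error,

\[ \sigma_{s_y^2} \equiv \left\{ E\left[\left(s_y^2 - \sigma_y^2\right)^2\right] \right\}^{1/2}. \]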

Unfortunately, this standard error is not known for general distributions of \(y\). However, if \(y\) has a normal distribution, then

\[ \sigma_{s_y^2} = \sigma_y^2 \sqrt{\frac{2}{N-1}}. \]

Under the assumption that \(y\) is normally distributed, the distribution of \(s_y^2\) is related to the chi-squared distribution. Specifically, \((N-1)s_y^2/\sigma_y^2\) has a chi-squared distribution with \(N-1\) degrees of freedom.

Note that the requirement that \(y\) be normally distributed is much more restrictive than the requirements for the mean error estimates to hold. For the mean error estimates, the standard error, \(\sigma_{\overline{y}} = \sigma_y/\sqrt{N}\), is exact regardless of the distribution of \(y\) (given independent samples and a finite variance). The application of the central limit theorem, which gives that \(\overline{y}\) is approximately normally distributed, only requires that the number of samples is large; it does not constrain the distribution of \(y\) itself beyond requiring that its mean and variance are finite.
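The contrast between the two error estimates can be illustrated with a small Monte Carlo experiment. The sketch below compares the observed spread of \(\overline{y}\) and \(s_y^2\) against the predictions \(\sigma_y/\sqrt{N}\) and the normal-theory value \(\sigma_y^2\sqrt{2/(N-1)}\). It does so once for normal samples and once for exponential samples with the same \(\sigma_y\). The exponential distribution, the sample sizes, the seed, and the helper name `monte_carlo_errors` are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed, chosen arbitrarily
N = 20                           # samples per experiment (assumed value)
trials = 200_000                 # number of repeated experiments (assumed value)
sigma = 2.0                      # true standard deviation of y (assumed value)


def monte_carlo_errors(samples):
    """Return the observed spread of the sample mean and of s^2 across experiments.

    Hypothetical helper for this illustration; each row of `samples` is one experiment.
    """
    ybar = samples.mean(axis=1)
    s2 = samples.var(axis=1, ddof=1)
    return ybar.std(), s2.std()


# Case 1: y is normally distributed with standard deviation sigma.
normal_samples = rng.normal(0.0, sigma, size=(trials, N))
se_mean_normal, se_var_normal = monte_carlo_errors(normal_samples)

# Case 2: y is exponential with scale sigma, which also has standard deviation sigma.
expo_samples = rng.exponential(scale=sigma, size=(trials, N))
se_mean_expo, se_var_expo = monte_carlo_errors(expo_samples)

# Predictions: the first holds for any distribution, the second assumes normal y.
pred_mean = sigma / np.sqrt(N)
pred_var = sigma**2 * np.sqrt(2.0 / (N - 1))

print("predicted std. error of mean:", pred_mean)
print("  observed, normal:          ", se_mean_normal)
print("  observed, exponential:     ", se_mean_expo)
print("predicted std. error of s^2 (normal theory):", pred_var)
print("  observed, normal:          ", se_var_normal)
print("  observed, exponential:     ", se_var_expo)
```

In this setup the standard error of the mean should match its prediction in both cases. The standard error of \(s_y^2\) should match only for the normal samples, and it comes out noticeably larger for the exponential ones.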

## Standard Deviation

Typically, the standard deviation of \(y\) is estimated using \(s_y\), i.e. the square root of the variance estimator. This estimate, however, is biased:

\[ E\left[s_y\right] \neq \sigma_y. \]

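The standard error for this estimate is only known exactly when \(y\) is normally distributed. In that case, the standard deviation of \(s_y\) is

\[ \sigma_{s_y} = \sigma_y\sqrt{1 - c_4^2}, \qquad c_4 = \sqrt{\frac{2}{N-1}}\,\frac{\Gamma(N/2)}{\Gamma((N-1)/2)}, \]

which is well approximated by \(\sigma_y/\sqrt{2(N-1)}\) for large \(N\).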
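The bias and spread of \(s_y\) can also be examined numerically. The sketch below assumes normal samples and uses the standard normal-theory factor \(c_4\), for which \(E[s_y] = c_4\,\sigma_y\). The symbol \(c_4\), the small sample size, and the seed are choices made for this illustration rather than values taken from the notes.

```python
import math

import numpy as np

rng = np.random.default_rng(2)   # fixed seed, chosen arbitrarily
N = 5                            # small sample size so the bias is visible (assumed value)
trials = 200_000                 # number of repeated experiments (assumed value)
sigma = 1.0                      # true standard deviation of y (assumed value)

# Each row is one experiment; s holds one standard deviation estimate per experiment.
y = rng.normal(0.0, sigma, size=(trials, N))
s = y.std(axis=1, ddof=1)

# Normal-theory bias factor: E[s_y] = c4 * sigma_y.
c4 = math.sqrt(2.0 / (N - 1)) * math.gamma(N / 2) / math.gamma((N - 1) / 2)

mean_s = s.mean()                                  # Monte Carlo estimate of E[s_y]
spread_s = s.std()                                 # Monte Carlo spread of s_y
spread_exact = sigma * math.sqrt(1.0 - c4**2)      # exact spread for normal y
spread_approx = sigma / math.sqrt(2.0 * (N - 1))   # large-N approximation

print("true sigma_y:        ", sigma)
print("average of s_y:      ", mean_s, "(theory:", c4 * sigma, ")")
print("spread of s_y:       ", spread_s)
print("  exact normal theory:", spread_exact)
print("  large-N approx.:    ", spread_approx)
```

The average of \(s_y\) should fall below \(\sigma_y\), which is the bias in question. The gap shrinks as \(N\) grows, since \(c_4\) tends to 1.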