\documentclass[12pt]{article}
\begin{document}
\newcommand{\bd}{\displaystyle}
\begin{center}
{\bf \large Problem Set 7 Solution}\\
17.872 TA Jiyoon Kim\\
Nov. 26, 2003\\
\end{center}
{\bf Bulmer Exercise 9.1}\\
Out of 929 plants bred from the purple $\times$ white cross, Mendel observed 705 purple-flowered and 224 white-flowered plants. The observed
proportion of purple flowers is $705/929 = .7589$ and the expected probability is .75. To test the hypothesis that the
probability of a purple-flowered plant is .75, we first need the $z$-score (or $d$ in Bulmer's book). \\
\[
z = \bd \frac{|.7589 - .75|}{\sqrt{.7589(1-.7589)/929}} \approx .63
\]
If you look at the $z$-table, the two-tailed probability that $|z|$
exceeds .63 is about .53; therefore, the deviation is not significant.\\
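As a check on the table lookup, the two-tailed probability can be written out step by step. This is only a sketch, assuming the standard normal table value $\Phi(.63) \approx .7357$, where $\Phi$ denotes the standard normal cumulative distribution function (notation not used elsewhere in this solution):
\[
\Pr\left[ |z| > .63 \right] = 2 \left( 1 - \Phi(.63) \right) \approx 2 (1 - .7357) = .5286
\]
This rounds to the value .53 used in the solution.\\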
\\
{\bf Bulmer Exercise 9.6}\\
(a) If $P = \frac{1}{2}$, the expected number of successes per
throw is $12 \times \frac{1}{2} = 6$. The observed mean is not the same. First, I calculate
the relative frequency of each outcome by dividing each observed count
by 4096. Using $E(x) = \sum xP(x)$, the observed mean is
6.139. The variance of the number of successes ($\sigma^2$) is $12 \times p\times
(1-p) = 3$. Since we are comparing the sample mean with the
theoretical mean, we need to divide the variance by the sample
size of 4096. Taking the square root of this result yields the standard
error, .027. Then, \\
\[
d = \bd \frac{6.139 - 6}{.027} \approx 5.14
\]
This is a very large value of $d$ and it is statistically significant.
Therefore, we can reject the hypothesis that $P = \frac{1}{2}$.\\
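The standard error in the denominator can be worked out explicitly from the variance of 3 and the 4096 throws; the arithmetic below is a sketch using only those two numbers:
\[
\sqrt{\bd \frac{3}{4096}} = \bd \frac{\sqrt{3}}{64} \approx \bd \frac{1.732}{64} \approx .0271
\]
Dividing the deviation $6.139 - 6 = .139$ by this standard error gives a value of $d$ a little above 5, in line with the 5.14 reported.\\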
\\
(b) To answer this question, calculate the binomial
distribution based on the newly estimated probability $p = .5116$ of
success. Once that is done, you will have the observed frequencies given
in the question and the expected frequencies based on the
binomial distribution with $p = .5116$.
Then use the $\chi^2$ goodness-of-fit test. \\
\[
\chi^2 = \sum \bd \frac{(O - E)^2}{E}
\]
The result is $\chi^2 = 8.44$ with 11 degrees of freedom. (The
book combined the first and second cells and the eleventh and
twelfth cells, since there are no observations in the first and last
cells. In that case, $\chi^2 = 7$ with 8 degrees of
freedom.) In either case, the result is not significant.\\
\\
{\bf S \& C 4.8.1}\\
(i) The probability that the sample estimate $\bar{X}$ is within $\pm \$ 500$ of $\mu$ is \\
\[
\begin{array}{llllll}
\Pr \left[ |\bar{X} - \mu| \leq 500 \right] &=& \Pr \left[ \bd
\frac{|\bar{X} -
\mu|}{\sigma/\sqrt{n}} \leq \bd \frac{500}{\sigma/\sqrt{n}} \right]\\
\\
&=& \Pr\left[ |z| \leq \bd \frac{500}{1800/\sqrt{40}} \right] = \Pr\left[ |z| \leq 1.757 \right]\\
\end{array}
\]
The probability that $|z|$ is less than 1.757 is approximately
.921.\\
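The two-sided probability follows from the one-sided table value. As a sketch, assuming the interpolated table value $\Phi(1.757) \approx .9605$ for the standard normal cumulative distribution function $\Phi$:
\[
\Pr\left[ |z| \leq 1.757 \right] = 2\,\Phi(1.757) - 1 \approx 2 \times .9605 - 1 = .921
\]
The same identity, $2\Phi(c) - 1$, gives the probability of falling within $\pm c$ standard errors for any cutoff $c$.\\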
\\
(ii) Usually, we require a probability of 95 percent, for which the
critical value of $z$ is 1.96. In order to meet
this critical value,
\[
\begin{array}{lllll}
\Pr \left[ \bd \frac{|\bar{X} -
\mu|}{\sigma/\sqrt{n}} \leq \bd \frac{500}{1800/\sqrt{n}} \right] \geq .95\\
\\
\bd \frac{500\sqrt{n}}{1800} \geq 1.96\\
\\
\sqrt{n} \geq 1.96 \times \bd \frac{1800}{500} = 7.056\\
\\
n \geq 7.056^2 = 49.787\\
\end{array}
\]
Therefore, $n$ should be at least 50 to guarantee the stipulation at
the 95 percent level.\\
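A quick check that 50 is the smallest sample size that works, using the same ratio $500\sqrt{n}/1800$ with the two neighbouring values of $n$:
\[
\begin{array}{lll}
n = 49: & \bd \frac{500\sqrt{49}}{1800} = \frac{3500}{1800} \approx 1.944 < 1.96\\
\\
n = 50: & \bd \frac{500\sqrt{50}}{1800} \approx \frac{3535.5}{1800} \approx 1.964 \geq 1.96\\
\end{array}
\]
So $n = 49$ falls just short of the critical value and $n = 50$ meets it.\\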
\\
{\bf S \& C 4.8.3}\\
Given $\mu = 30, \bar{X} = 29.823, \sigma = 10$ and
$n = 5110$, \\
(i) The 95\% confidence interval is \\
\[
29.823 \pm 1.96 \times 10/\sqrt{5110} = 29.549, 30.097
\]
\\
(ii) Since $\mu$ is 30 and $29.549 < 30 < 30.097$, the confidence interval does cover
$\mu$.\\
\\
{\bf S \& C 4.8.8}\\
Given $\mu = 0$ and $\sigma = .29$, \\
\[
\begin{array}{llll}
\Pr \left[ 100|error| \leq 5 \right] &=& \Pr \left[ \bd
\frac{100|error|}{.29/\sqrt{100}} \leq \bd \frac{5}{.29/\sqrt{100}} \right]\\
\\
&=& \Pr\left[\bd \frac{|error|}{.29/\sqrt{100}} \leq \bd
\frac{.5}{.29} \right] = \Pr\left[|z| \leq 1.724 \right]\\
\end{array}
\]
The probability that $|z|$ is less than 1.724 can be looked up in
the $z$-table. Interpolating in the table, the two-sided probability at
this $z$-score is approximately $p = .915$.\\
\\
{\bf S \& C 4.8.9}\\
First, $\sigma = 60$ and $\alpha = .05$. The error is supposed to be
within $\pm \$ 20$.\\
\[
\begin{array}{lllll}
\Pr\left[|\bar{X} - \mu| \leq 20 \right] &=& .95\\
\\
\Pr\left[\bd \frac{|\bar{X} - \mu|}{60/\sqrt{n}} \leq \bd \frac{20}{60/\sqrt{n}} \right] &=& .95
\end{array}
\]
The right-hand side of the inequality inside the brackets needs to be at least 1.96; therefore,\\
\[
\begin{array}{llll}
1.96 \leq \bd \frac{20}{60/\sqrt{n}}\\
\\
\bd \frac{\sqrt{n}}{3} \geq 1.96\\
\\
n \geq (1.96 \times 3)^2 = 34.57
\end{array}
\]
Therefore, $n$ should be at least 35. (It would be 36 if you use
2 for simplicity instead of 1.96.)\\
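The steps above amount to a general sample-size rule. As a sketch, writing $L$ for the allowed margin of error and $z$ for the critical value (both symbols are introduced here for illustration and are not used in the solution itself):
\[
\begin{array}{lll}
n \geq \left( \bd \frac{z \sigma}{L} \right)^2, & z = 1.96: & \left( \bd \frac{1.96 \times 60}{20} \right)^2 = 5.88^2 \approx 34.57\\
\\
 & z = 2: & \left( \bd \frac{2 \times 60}{20} \right)^2 = 6^2 = 36\\
\end{array}
\]
Checking the neighbouring integers with $z = 1.96$: for $n = 34$, $\sqrt{34}/3 \approx 1.944 < 1.96$, while for $n = 35$, $\sqrt{35}/3 \approx 1.972 \geq 1.96$.\\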
\\
{\bf S \& C 5.5.1}\\
$\sigma = 20, \bar{X} = 56.53$ and $\mu = 50$. \\
(i) When $n = 25$,\\
\[
\begin{array}{llll}
z &=& \bd \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \bd \frac{56.53 -
50}{20/\sqrt{25}}\\
\\
&=& 1.6325\\
\end{array}
\]
Rounding to $z = 1.63$, $\Pr[|z| \leq 1.63] \approx .8968$. Thus, the
probability of getting a larger deviation in either direction is $1 - .8968 =
.1032$.\\
\\
(ii) When $n = 64$,\\
\[
\begin{array}{llll}
z &=& \bd \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \bd \frac{56.53 -
50}{20/\sqrt{64}}\\
\\
&=& 2.612\\
\end{array}
\]
Now, $\Pr[|z| \leq 2.612] \approx .991$ and the probability of getting a larger
deviation is $1 - .991 = .009$.\\
\\
{\bf S \& C 5.5.2}\\
(i) Confidence interval with $n = 25$ is\\
\[
\bar{X} \pm 1.96 \sigma/\sqrt{n} = 56.53 \pm 1.96 \times 20/\sqrt{25} = 48.69, 64.37\\
\]
\\
(ii) Confidence interval with $n = 64$ is\\
\[
\bar{X} \pm 1.96 \sigma/\sqrt{n} = 56.53 \pm 1.96 \times 20/\sqrt{64} = 51.63, 61.43\\
\]
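These intervals can be compared with the hypothesized mean $\mu = 50$ from S \& C 5.5.1. The comparison below is a sketch of the usual correspondence between a 95\% confidence interval and a two-tailed test at the 5\% level:
\[
\begin{array}{lll}
n = 25: & 48.69 < 50 < 64.37 & \mbox{(interval covers 50; two-tailed probability } .1032 > .05)\\
\\
n = 64: & 50 < 51.63 & \mbox{(interval excludes 50; two-tailed probability } .009 < .05)\\
\end{array}
\]
In both cases the interval and the tail probability lead to the same decision about $\mu = 50$.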
\\
{\bf S \& C 5.5.4}\\
$\mu = 500, \sigma = 100, \bar{X} = 472$ and $n = 25$.\\
\[
\begin{array}{lllll}
H_0 : \mu \geq 500\\
H_1 : \mu < 500\\
\end{array}
\]
First, calculate $z$.\\
\[
\begin{array}{llll}
z &=& \bd \frac{472 - 500}{100/\sqrt{25}} = -1.4\\
\end{array}
\]
At $z = -1.4$, the one-tailed probability is $p = .5 - .4192 = .0808 \approx .081$. Since this exceeds .05, $H_0$ is not rejected at the 5\% level.\\
\\
{\bf S \& C 5.6.1}\\
(i) Since $\phi = 2$ at $\alpha = .05$, $Z_1 = -2 -1.96 = -3.96$ and $Z_2 = -2 + 1.96 = -.04$. $\Pr(Z < Z_1) \approx 0$ and
$\Pr(Z > Z_2) = .5 + .016 = .516$. Therefore, the power is .516.\\
(ii) With $\phi = 3.5$ and $\alpha = .01$, we only need to change the numbers. \\
$Z_1 = -3.5 - 2.576 = -6.076$ and $Z_2 = -3.5 + 2.576 = -.924$. $\Pr(Z < Z_1) \approx 0$ and $\Pr(Z > Z_2) = .5 + .322
= .822$.\\
(iii) Now it is a one-tailed test. $Z_1 = - 3 + 1.645 = -1.355$ and $\Pr(Z > Z_1) = .5 + .4115 = .9115$.\\
(iv) $Z_1 = - 1.5 + 2.326 = .826$ and $\Pr(Z > Z_1) = .5 - .294 = .206$.\\
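The four parts follow one pattern, which can be sketched in general form. Here $z_{\alpha/2}$ and $z_{\alpha}$ denote the two-tailed and one-tailed critical values (1.96 and 2.576, or 1.645 and 2.326, in the parts above); this notation is introduced for illustration and $\phi$ is taken as given in the problem:
\[
\begin{array}{lll}
\mbox{two-tailed:} & \mbox{power} = \Pr\left( Z < -\phi - z_{\alpha/2} \right) + \Pr\left( Z > -\phi + z_{\alpha/2} \right)\\
\\
\mbox{one-tailed:} & \mbox{power} = \Pr\left( Z > -\phi + z_{\alpha} \right)\\
\end{array}
\]
In the two-tailed cases the first term is negligible, so the power is essentially the second term alone.\\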
\\
{\bf S \& C 5.12.1}\\
For a 90\% confidence interval for $\sigma$, you need
$\chi^2_{.95}$ and $\chi^2_{.05}$, between which 90\% of the probability
lies. In the book, $\chi^2_{.05} = 79.08$ and
$\chi^2_{.95}= 43.19$ with 60 degrees of freedom ($n-1 = 61 - 1 = 60$). Assume that $s = 10.2$.\\
\[
\begin{array}{lllll}
\chi^2 &=& \bd \frac{S^2}{\sigma^2} = \bd
\frac{s^2(n-1)}{\sigma^2}\\
\\
&=& \bd \frac{10.2^2 \times 60}{\sigma^2}\\
\end{array}
\]
\[
\begin{array}{lllll}
\chi^2_{.95} < \chi^2 < \chi^2_{.05} &\Rightarrow& 43.19 < \chi^2 < 79.08\\
\\
&\Rightarrow& 43.19 < \bd \frac{10.2^2 \times 60}{\sigma^2} < 79.08\\
\end{array}
\]
Then,
\[
\begin{array}{llll}
\sigma^2 > \bd \frac{10.2^2 \times 60}{79.08}\\
\\
\sigma^2 < \bd \frac{10.2^2 \times 60}{43.19}\\
\end{array}
\]
Since $10.2^2 \times 60 = 6242.4$, these inequalities give
$78.94 < \sigma^2 < 144.53$. Taking square roots, and keeping only the
positive roots because a standard deviation cannot
be less than zero, the 90\% confidence interval is $8.88 < \sigma <
12.02$.\\
\\
{\bf S \& C 5.12.6}\\
Goodness-of-fit test. $\sum \frac{(O - E)^2}{E}$ gives
$\chi^2$ with 5 degrees of freedom. Here $\chi^2 = 6.042$, and with
5 degrees of freedom the critical value at the 5\% significance level
is 11.07. Since 6.042 is less than 11.07, we fail to
reject the null hypothesis. \\
\\
{\bf S \& C 7.4.2}\\
The probability that a train arrives on time is .95. \\
(i) All three arrive on time: $.95^3 = .857$.\\
(ii) One of the three is late: I think about this the other way
around, treating being late as a success and being on time as a failure. Using
the binomial probability, we can calculate $C_{3,1} \times .05^1 \times .95^2
= .135$.\\
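For completeness, the whole binomial distribution of the number of late trains can be listed, under the same assumption as above that the three trains are independent with a .05 chance of being late:
\[
\begin{array}{lll}
\Pr(0 \mbox{ late}) &=& .95^3 = .857375\\
\Pr(1 \mbox{ late}) &=& 3 \times .05 \times .95^2 = .135375\\
\Pr(2 \mbox{ late}) &=& 3 \times .05^2 \times .95 = .007125\\
\Pr(3 \mbox{ late}) &=& .05^3 = .000125\\
\end{array}
\]
The four probabilities add to 1, which confirms the two values used in (i) and (ii).\\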
\\
{\bf S \& C 7.5.1}\\
(i) Both events 1 and 3 happen : .11 + .04 = .15\\
(ii) Neither event 1 nor 3 happens : .1 + .4 = .5\\
(iii) At least one of events 1 and 2 happens : 1 - .6 = .4\\
(iv) At least one of events 1,2 and 3 happens : 1 - .4 = .6\\
(v) Verify that in (iv) the probability is less than $p_1 + p_2 +
p_3$ : \\
$p_1 = .11 + .03 + .04 + .04 = .22$\\
$p_2 = .11 + .03 + .08 + .1 = .32$\\
$p_3 = .11 + .04 + .08 + .2 = .43$\\
$p_1 + p_2 + p_3 = .97$, and this is larger than .6.\\
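The gap between .97 and .6 is the overlap that the simple sum counts more than once. As a sketch, assuming the regions implied by the sums above (.11 for all three events together, .03 for events 1 and 2 only, .04 for events 1 and 3 only, .08 for events 2 and 3 only; this reading is inferred from the arithmetic, not stated in the solution), inclusion-exclusion gives
\[
\begin{array}{lll}
\Pr(1 \cup 2 \cup 3) &=& p_1 + p_2 + p_3 - \Pr(1 \cap 2) - \Pr(1 \cap 3) - \Pr(2 \cap 3) + \Pr(1 \cap 2 \cap 3)\\
\\
&=& .97 - .14 - .15 - .19 + .11 = .60\\
\end{array}
\]
which agrees with the answer to (iv).\\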
\\
{\bf S \& C 7.5.2}\\
(i) If $p_i$ is the probability that event $i$ happens, $q_i = 1 - p_i$ is the probability that it does
not happen. The probability that all four events happen is at least 1 minus the sum of the probabilities that
each event fails to happen. With the same $q_i$ for every event, requiring this lower bound to be at least .6
gives \\
\[
\begin{array}{llll}
1 - \sum_{i=1}^4 q_i = 1 - 4q_i \geq .6\\
.4 \geq 4q_i \Leftrightarrow .1 \geq q_i\\
.1 \geq 1 - p_i
\end{array}
\]
Therefore, $p_i \geq .9$.\\
(ii) If the events are independent, the probability that all four
events happen is $p^4 = .9^4 = .6561$.\\
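The two parts can be compared directly. As a sketch with $p_i = .9$ and $q_i = .1$ for each of the four events:
\[
1 - 4 \times .1 = .6 \ \leq \ .9^4 = .6561
\]
The value .6 is a lower bound that holds whatever the dependence among the events, and the independent case lies above it, as it must.\\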
\\
{\bf S \& C 7.6.3}\\
$n = 96$ and $p = .4$.\\
(i) The standard deviation of the number of successes is $\sigma = \sqrt{.4 \times
.6 \times 96} = 4.8$\\
(ii) The standard deviation of the proportion of successes is $ \sigma
= \sqrt{\frac{.4 \times .6}{96}} = .05$, that is, 5 \%.\\
\\
{\bf S \& C 7.6.4}\\
Since $p = .1$ and $\sigma = .01$, we can set up the equation
that\\
\[
.01 = \sqrt{\frac{.1 \times .9}{n}}
\]
Therefore, $n = 900$.\\
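The step from the equation to $n = 900$ is to square both sides and solve for $n$:
\[
.01^2 = \bd \frac{.1 \times .9}{n} \quad \Rightarrow \quad n = \bd \frac{.1 \times .9}{.01^2} = \frac{.09}{.0001} = 900
\]
As a check, $\sqrt{.09/900} = \sqrt{.0001} = .01$.\\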
\\
{\bf S \& C 7.6.5}\\
For $n = 10, p = .5$,\\
(i) Using binomial distribution, you count from 0 to 4 successes.
\\
\[
C_{10,0} \times .5^0 \times .5^{10} + C_{10,1} \times .5^1 \times .5^9 + C_{10,2} \times .5^2 \times .5^8 +
C_{10,3} \times .5^3 \times .5^7 + C_{10,4} \times .5^4 \times .5^6 = .377\\
\]
(ii) The normal approximation corrected for continuity is \\
\[
z_c = (|r - np| - .5)/\sqrt{npq} = (|4-5| - .5)/1.581 = .5/1.581 =
.316\\
\]
$\Pr(z_c > .316)$ is the same as $\Pr(z_c < -.316)$,
and it is $.5 - \Pr(0 < z_c < .316) \approx .5 - .124 = .376$.\\
(iii) The normal approximation uncorrected for continuity is\\
\[
z = (|r - np|)/\sqrt{npq} = (|4-5|)/1.581 = 1/1.581 = .6325\\
\]
$\Pr(z > .6325) = .5 - .237 = .263$.
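\\
To see how the three answers compare, the exact binomial sum can be written with a common denominator, since every term has the factor $.5^{10} = 1/1024$:
\[
\bd \frac{1 + 10 + 45 + 120 + 210}{1024} = \frac{386}{1024} \approx .377
\]
The continuity-corrected approximation (.376) differs from this exact value by about .001, while the uncorrected approximation (.263) is off by about .114, so the correction matters a good deal at $n = 10$.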
\\
\end{document}