# File:   mit18_05_s22_studio6-grader.r 
# Authors: Jeremy Orloff and Jennifer French
#
# MIT OpenCourseWare: https://ocw.mit.edu
# 18.05 Introduction to Probability and Statistics
# Spring 2022
# For information about citing these materials or our Terms of Use, visit:
# https://ocw.mit.edu/terms.
#
# Studio 6 grading script
# Expected output in studio6-grader.html
# If this file changes, studio*-grader.html needs to be rebuilt.

# Use 'File > Compile report...' to create an R Markdown report from this.
# Because this opens a new session, it doesn't see the environment.
# So we need the following line, which should be commented out when using the grading script for grading.

 source('mit18_05_s22_studio6-solutions.r')  ### COMMENT OUT FOR GRADING
 cat("WARNING: make sure source('mit18_05_s22_studio*-solutions.r') is commented out before grading\n")
## WARNING: make sure source('mit18_05_s22_studio*-solutions.r') is commented out before grading
# For grading, open this file and set working directory to source file location
studio6_problem_0()
## 
## ----------------------------------
## Problem 0: Averaging normal distributions

## 0. Both histograms are centered near the true mean mu. The histogram of the averaged data is much narrower, i.e. less spread out from this mean.
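# A minimal sketch of the claim in Problem 0 (not the course solution code).
# The parameter values and every sketch_* name are assumptions chosen for
# illustration. It compares the spread of single normal draws with the spread
# of averages of sketch_n_avg draws; theory says the averages have standard
# deviation sigma/sqrt(n).
set.seed(1805)                       # fixed seed, only to make the sketch reproducible
sketch_mu = 3                        # assumed true mean
sketch_sigma = 2                     # assumed standard deviation
sketch_n_avg = 25                    # number of draws in each average
sketch_n_trials = 10000              # number of simulated values
sketch_single = rnorm(sketch_n_trials, sketch_mu, sketch_sigma)
sketch_draws = matrix(rnorm(sketch_n_trials * sketch_n_avg, sketch_mu, sketch_sigma),
                      nrow = sketch_n_trials)
sketch_avg = rowMeans(sketch_draws)  # one average per row
cat("sd of single draws:", sd(sketch_single), "\n")
cat("sd of averages:    ", sd(sketch_avg),
    " (theory:", sketch_sigma / sqrt(sketch_n_avg), ")\n")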
studio6_problem_1a()
## 
## ----------------------------------
## Problem 1a. Formula for Cauchy pdf
## 1a. f(x | theta) = 1/(pi*(1+(x-theta)^2))
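# A quick check of the formula in 1a, assuming base R's dcauchy() with
# scale = 1 uses the same parameterization (theta as the location). The names
# sketch_cauchy_pdf, sketch_x and sketch_theta are hypothetical, not part of
# the solutions file.
sketch_cauchy_pdf = function(x, theta) 1 / (pi * (1 + (x - theta)^2))
sketch_x = seq(-5, 5, by = 0.5)      # points at which to compare
sketch_theta = 1.5                   # assumed location, for illustration
print(all.equal(sketch_cauchy_pdf(sketch_x, sketch_theta),
                dcauchy(sketch_x, location = sketch_theta)))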
studio6_problem_1b()
## -----
## 1b. Plot pdf of Cauchy and standard normal

## 1b. See plots
studio6_problem_1c()
## -----
## 1c. Explain fat tails
## 1c. Answer: The tails are the left and right ends of the graph. In its tails the Cauchy graph is much higher than the normal, i.e. the tails of the Cauchy are much fatter than those of the normal.
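# A numeric companion to the fat-tails answer in 1c, as a sketch only: it
# tabulates the two-sided tail probability P(|X| > c) for the standard Cauchy
# and the standard normal. The cutoffs and the sketch_* names are assumptions
# chosen for illustration.
sketch_cut = c(2, 4, 6)
sketch_tails = data.frame(
  cutoff = sketch_cut,
  cauchy = 2 * pcauchy(sketch_cut, lower.tail = FALSE),  # symmetric, so double one tail
  normal = 2 * pnorm(sketch_cut, lower.tail = FALSE)
)
print(sketch_tails)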
studio6_problem_1d()
## -----
## 1d. Average of Cauchy distributions

## Notice that averaging the Cauchy distribution does not change the spread of the histogram. Averaging does not help us estimate the location of the pdf!
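# A minimal sketch of the observation in 1d (not the course solution code).
# The sample sizes and the sketch_* names are assumptions. Because the Cauchy
# distribution has no finite variance, spread is measured here by the
# interquartile range (IQR) rather than the standard deviation. For a standard
# Cauchy the quartiles are -1 and 1, so the IQR should be close to 2 both for
# single draws and for averages.
set.seed(1805)                       # fixed seed, only to make the sketch reproducible
sketch_c_n_avg = 25                  # number of draws in each average
sketch_c_trials = 10000              # number of simulated values
sketch_c_single = rcauchy(sketch_c_trials)
sketch_c_avg = rowMeans(matrix(rcauchy(sketch_c_trials * sketch_c_n_avg),
                               nrow = sketch_c_trials))
cat("IQR of single Cauchy draws:", IQR(sketch_c_single), "\n")
cat("IQR of Cauchy averages:    ", IQR(sketch_c_avg), "\n")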
grader_data_csv = 'mit18_05_s22_studio6_grader_data_frame.csv'
studio6_problem_2a(grader_data_csv)
## 
## ----------------------------------
## Problem 2a. Load and plot data.
##   X    position
## 1 1 -1.23242170
## 2 2  0.70216071
## 3 3 -0.08696137
## 4 4  1.98923714
## 5 5  0.45431845
## 6 6 14.27953900

## 2a. See plot
studio6_problem_2b(grader_data_csv)
## -----
## 2b. Discretized Bayesian updates and MAP estimates.

## [1] "2b(iv) Final map estimate:  0.960000000000001"

## 2b(vi). The posterior has most of its probability for theta between -1 and 1. This is significantly less distance to cover than the initial -10 to 10. I would go to the MAP estimate of 0.96 and search near there. I would also keep collecting data and updating my posterior.
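# A minimal sketch of a discretized Bayesian update of the kind used in 2b.
# This is not the solution code: the grid step of 0.01, the flat prior, and all
# sketch_* names are assumptions, and only the six positions printed in 2a are
# used, so the result need not match the final MAP estimate reported for 2b.
sketch_theta_grid = seq(-10, 10, by = 0.01)          # hypotheses for theta
sketch_posterior = rep(1 / length(sketch_theta_grid),
                       length(sketch_theta_grid))    # flat prior on the grid
sketch_positions = c(-1.23242170, 0.70216071, -0.08696137,
                     1.98923714, 0.45431845, 14.27953900)
for (sketch_obs in sketch_positions) {
  # Cauchy likelihood of this observation under every hypothesis
  sketch_likelihood = dcauchy(sketch_obs, location = sketch_theta_grid)
  # Bayes numerator, then normalize so the posterior sums to 1
  sketch_posterior = sketch_posterior * sketch_likelihood
  sketch_posterior = sketch_posterior / sum(sketch_posterior)
}
sketch_map = sketch_theta_grid[which.max(sketch_posterior)]
cat("MAP estimate from the first six positions:", sketch_map, "\n")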
studio6_problem_2c()
## -----
## Problem 2c (OPTIONAL). Explanation

## 2c (OPTIONAL). Here is the explanation.
## 
## Look at the explanatory diagram.
## The angle alpha is random and uniform between -pi/2 and pi/2.
## The figure shows that x-theta = tan(alpha)
## This is a transformation from alpha to x, so we can use it to find the pdf for x.
## As usual, we work with the cdf and take a derivative at the end
## 
## Let X be the random variable whose values x are the random (in time) position measurements. Let A be the random variable whose values alpha are the random (in time) angles.
## Then, F_X(x) is the cdf of X and F_A(alpha) is the cdf of A
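## The figure shows that X-theta = tan(A)
## Since tan is increasing on (-pi/2,pi/2), the inequality tan(A) < x-theta is the same as A < arctan(x-theta).
## So, F_X(x|theta) = P(X < x | theta)
##        = P(tan(A) < x-theta | theta)
##        = P(A < arctan(x-theta) | theta)
##        = (1/pi)*(arctan(x-theta) + pi/2)
##        = (1/pi)*arctan(x-theta) + 1/2
## The last two equalities are because A is uniform on [-pi/2,pi/2], so F_A(alpha) = (alpha + pi/2)/pi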
## Now all we have to do is remember how to differentiate arctan in order to find f(x|theta):
## f(x|theta) = (1/pi)*1/(1+(x-theta)^2)  QED
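# A simulation sketch of the construction in 2c, under stated assumptions: the
# location sketch_theta0, the sample size, and all sketch_* names are
# hypothetical. It draws angles uniformly on (-pi/2, pi/2), maps them to
# positions theta + tan(alpha), and compares the empirical cdf with base R's
# pcauchy() at a few points.
set.seed(1805)                       # fixed seed, only to make the sketch reproducible
sketch_theta0 = 2                    # assumed location
sketch_alpha = runif(20000, -pi / 2, pi / 2)
sketch_xsim = sketch_theta0 + tan(sketch_alpha)
sketch_check = c(-3, 0, 2, 4, 7)     # points at which to compare the cdfs
sketch_empirical = sapply(sketch_check, function(q) mean(sketch_xsim < q))
sketch_exact = pcauchy(sketch_check, location = sketch_theta0)
print(round(cbind(point = sketch_check,
                  empirical = sketch_empirical,
                  pcauchy = sketch_exact), 3))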