# France and America Lecture
#
# Today we run a VERY basic Hypothesis Test. Though contrary to the book's tests, we assume
# that our defendant ends up charged in both America and in France for the same crime!
# In other words, first our defendant is assumed innocent and found "guilty" or "not guilty",
# and then our defendant is assumed guilty and found "innocent" or "not innocent".
# In particular, in our situation we will have exactly two hypotheses. Namely, we are given
# Roulette Data, but we don't know whether the wheel is American or French. In America a
# roulette wheel has 38 pockets: 18 red, 18 black, and two green. In France the wheel has only
# 37 pockets, namely 18 red, 18 black, and a single green. (This is true throughout Europe,
# and, not so surprisingly, Roulette is much more popular in Europe!) We want to run a test
# which indicates whether we should believe our data comes from an American roulette wheel or
# a French roulette wheel. Our data will be the results of a bet on red, hence we may phrase
# these hypotheses as:
#
#   H0: p = p0 = 18/38  (American)
#   H1: p = p1 = 18/37  (European)
#
# Using this notation, we are viewing the claim that the wheel is American as the Null
# Hypothesis, while the French alternative is being viewed as the Alternate Hypothesis.
# (We will discuss this more carefully next week - but, as in this case, this distinction is
# sometimes a rather arbitrary one.) After looking at the data we must decide whether to
# accept p=p0 or p=p1, and we let pd be the value of p that we decide to accept.
# (Important Exercise: compare this discussion with the book's discussion of when to use the
# term "accept".) We'd like to know the chance of errors:
#
#   Type 1: alpha = P(pd=p1 | p=p0)
#   Type 2: beta  = P(pd=p0 | p=p1)
#
# Notice this test's Power to reject H0 when it is false is 1-beta.
#
# Once we have chosen N and alpha we can derive beta. But in this setting one can approach the
# problem in a more intelligent way. Namely, we can decide just how much relative risk we are
# willing to take for each type of error. For simplicity, we will assume equal risks, and
# hence we'll choose our test to satisfy alpha=beta. In this setting, such a symmetry belief
# would follow from the belief that both alternatives appear roughly equally likely to us
# before we run the experiment. But in general we might say this...
#
# SYMMETRY BELIEF: We would be comfortable with a test where the risk associated to rejecting
# p=p0 when p=p0 is true is the SAME as the risk associated to rejecting p=p1 when p=p1 is
# true.
#
# Under the SYMMETRY BELIEF we can compute the critical parameter pstar as follows (assuming
# the CLT has kicked in): under H0, phat is approximately Normal with mean p0 and standard
# deviation sqrt(p0*q0/N), while under H1 it has mean p1 and standard deviation sqrt(p1*q1/N).
# So alpha=beta means pstar sits the same number z of standard deviations above p0 as it sits
# below p1, i.e.
#   pstar = p0 + z*sqrt(p0*q0/N) = p1 - z*sqrt(p1*q1/N).
# Solving for z gives z = sqrt(N)*(p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)), and hence
#   pstar = p0 + ((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))*sqrt(p0*q0).

p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
pstar=p0+((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))*sqrt(p0*q0)

# If you reverse which of p0 and p1 is bigger you get the same answer. Rewriting pstar as
# follows makes this a little clearer:
((p1+p0)/2)*(1 - ((p1-p0)*(sqrt(p1*q1)-sqrt(p0*q0)))/((p1+p0)*(sqrt(p0*q0)+sqrt(p1*q1))))
pstar
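# A quick numerical check (an added aside, not part of the original lecture): recompute the
# cutoff with the roles of p0 and p1 reversed and confirm it agrees with pstar above. The name
# pstar.swapped is introduced here just for this check.
pstar.swapped = p1 + ((p0-p1)/(sqrt(p1*q1)+sqrt(p0*q0)))*sqrt(p1*q1)
pstar.swapped
all.equal(pstar, pstar.swapped)   # should be TRUE, up to floating point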
# Now N will determine the risk alpha=beta, and the risk determines N. We need to choose one,
# and so we can use the following graph to explore our choice.

p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
N0=40000
Npict=seq(1,N0,by=100)
zpict=sqrt(Npict)*((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))
betapict=pnorm(-zpict,0,1)
plot(Npict,betapict,type="l",main="Prob(Error) as a function of Sample Size",xlab="Sample Size",ylab="Probability")
points(c(1,N0),c(.05,.05),type="l",col="red")
points(c(1,N0),c(.1,.1),type="l",col="red")
points(c(1,N0),c(.01,.01),type="l",col="red")

# For each N we plot the situation:
N=100000
p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
pstar=p0+((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))*sqrt(p0*q0)
a=.45
b=.55
X=seq(a,b,by=0.001)
plot(X,dnorm(X,p0,sqrt(p0*q0/N)),type="l",main=paste("For N=",N," we see our hypotheses"),xlab="phat",ylab="Probability Density")
points(X,dnorm(X,p1,sqrt(p1*q1/N)),type="l",col="red")
m=max(dnorm(X,p0,sqrt(p0*q0/N)))
points(c(pstar,pstar),c(0,m),type="l",col="blue",lwd=2)
points(c(p0,p0),c(0,m),type="l",col="black")
points(c(p1,p1),c(0,m),type="l",col="red")

#############################
# To get the exact value of beta=alpha for a given N, we use
N=10000
p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
z=sqrt(N)*((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))
beta=pnorm(-z,0,1)
beta

# While to get the exact N for a given beta=alpha, we use
beta=0.05
p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
N=(qnorm(beta,0,1)*(sqrt(p0*q0)+sqrt(p1*q1))/(p1-p0))^2
ceiling(N)

#############################
# Typical procedure: N = price and alpha = beta = risk. Find the price corresponding to the
# maximal risk that you are willing to take. If this price is too "expensive", cancel the
# experiment and devise a new one. If not, then ask yourself "How much can I afford?" and see
# how much this shrinks your risk. Somewhere in the middle you will find a test to run that
# balances your risk and price in an acceptable way.
#
# Exercise: Choose a good N in our case.

###### Okay, let's do it!
N=10000
p0=18/38
p1=18/37
q0=1-p0
q1=1-p1
pstar=p0+((p1-p0)/(sqrt(p0*q0)+sqrt(p1*q1)))*sqrt(p0*q0)
pstar

# I'll pick a test value for p, call it pT, and actually run the experiment:
pT=p1
Successes=rbinom(1,N,pT)
pThat=Successes/N
pThat

# Exercise: State your conclusion. In particular:
# What is pd's actual value, and what is the P-Value = P(phat > pThat | p=p0)?
P=1-pnorm((pThat-p0)/(sqrt((p0*q0)/N)),0,1)
P
# Why is P(phat
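# A minimal sketch of the decision step described at the top (an added illustration, not part
# of the original lecture): accept p1 exactly when pThat falls above the cutoff pstar, and
# accept p0 otherwise; pd is the decided value of p, as in the discussion above.
pd = if (pThat > pstar) p1 else p0
pd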