# Today we explore confidence intervals, and the magical role of of the square root. # Example: "The hat check problem". Involves the procedure of picking "Hats" people who put their hats in a bucket, which then get mixed up, and are finally returned randomly to these "Hats" people. We can simulate the problem and count the number of people who get their own hat. HatCheck<-function(Hats){ rbind(1:Hats,sample(1:Hats,Hats,replace=FALSE)) } Hats=100 HC1=HatCheck(Hats) HC1 sum(HC1[1,]==HC1[2,]) # What we like to do is take these "Hats" worth of people and then perform the hat check procedure for "Trials" number of times. Let us call an experiment a success if NOBODY get their own hat back. Our program will spit out the number of successes. ManyHatChecks<-function(Trials,Hats){ T=0 for (i in 1:Trials){ HC1=HatCheck(Hats) tot=(1.0)*(sum(HC1[1,]==HC1[2,])) #print(tot) T=T+1.0*(tot==0) } T/Trials } Trials=100 Hats=21 hatp=ManyHatChecks(Trials,Hats) hatp # We can then find the C% confidence interval about the point. C=95 hatq=1-hatp SE=sqrt((hatp*hatq)/Trials) SE zstar=abs(qnorm((1-(C/100))/2,0,1)) zstar c(hatp-zstar*SE,hatp,hatp+zstar*SE) # We now would like to verify that roughly C% of the confidence interval contain the true value. This requires exploring taking "TrialsOfTrials" worth of trials of trials. You should dwell on this! Conf<-function(TrialsOfTrials,Trials,Hats,C){ H=c() for (i in 1:TrialsOfTrials){ hatp=ManyHatChecks(Trials,Hats) hatq=1-hatp SE=sqrt((hatp*hatq)/Trials) zstar=abs(qnorm((1-(C/100))/2,0,1)) H<-rbind(H,c(hatp-zstar*SE,hatp,hatp+zstar*SE)) } H } TrialsOfTrials=100 Trials=100 Hats=21 C=95 Confs=Conf(TrialsOfTrials,Trials,Hats,C) Confs # Finally we should look at this list and see how many contain the true probability of success. #left and right allow us to pick the window in which we watch the following action. left=0 right=1 #left=.1 #right=.6 plot(c(left,left,right,right),c(TrialsOfTrials+1,0,0,TrialsOfTrials+1),type="l",main=paste(TrialsOfTrials," Conf. Ints. each using",Trials,"Trials. Hats=",Hats), ylab="Trials",xlab="Ratios of One") for (j in 1:TrialsOfTrials){ points(Confs[j,c(1,3)],c(j,j),type="l",col="red") points(Confs[j,2],c(j),pch=18) #points(Confs[j,1],c(j+1),pch=40) #points(Confs[j,3],c(j+1),pch=41) } ############### # Hard Exercise: Compute the true probability of success (The solution is sketched below, and the answer is very nearly 1/e (1/e plus or minus 1/((Hats+1)!)).) Easy Question: Why is this remarkable. Dwell on this! #We can now see in our above graph the fraction of our TrialsOfTrials confidence interval that actually contain the true value of p=1/e. p=1/exp(1) Sd=sqrt((p*(1-p))/Trials) zstar=abs(qnorm((1-(C/100))/2,0,1)) points(c(p,p),c(0,TrialsOfTrials+1),type="l",col="blue") points(c(p-zstar*Sd,p-zstar*Sd),c(0,TrialsOfTrials+1),type="l",col="green") points(c(p+zstar*Sd,p+zstar*Sd),c(0,TrialsOfTrials+1),type="l",col="green") ############ #Solution to challenging Exercise: The probability that none gets their hat back is 1-P(Someone Does). This probability is the {1st person gets their own hat} or {2 nd person gets their own hat} or...or {Hats th person gets their own hat}. That k specified people receive their own hats is 1/(N*(N-1)*...*(N-(k-1)), the we have letting N="Hats" the need probability is # N (1/N) - choose(N,2)*1/(N*N-1)+...+(-1)^N choose(N,2)*1/(N*N-1*...*N-(N-1)) #=1-1/(2!)+..+(1-)^N!(1/N!) approx -exp(-1)+1=1-1/e # Here we use the inclusion exclusion principle (look it up!) and the fact from calculus that exp(x)= sum x^k/factorial(k). For negative x it alternates, so the difference between the need probability and 1/e is less than 1/factorial(N+1). Notice 1/factorial(N+1) is VERY small even for N=21.