LIS 4273 Adv Statistics: Module 7 Confidence Interval Estimation and Hypothesis Testing

Part 1. Confidence Interval Estimation Questions when the standard deviation (σ) is known

Equation 1: x̄ – Z(σ / sqrt(n)) < μ < x̄ + Z(σ / sqrt(n))
Pulled from assignment page

1. If x̄ = 85, σ = 8, and n = 64, set up a 95% confidence interval estimate of the population mean μ.

Using the given equation and plugging in for the appropriate values, the following results:

85 – 1.96(8 / sqrt(64)) < μ < 85 + 1.96(8 / sqrt(64))

85 – 1.96 < μ < 85 + 1.96

83.04 < μ < 86.96

The 95% confidence interval with these values is (83.04, 86.96), so we can be 95% confident that the population mean lies between 83.04 and 86.96. In other words, intervals built this way capture the true mean in about 95% of repeated samples.
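The calculation above can be reproduced with a short script. This is a minimal sketch, not part of the assignment: the function name z_confidence_interval is made up for illustration, and the critical value is computed with Python's standard-library NormalDist instead of being read from a z table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for the population mean when sigma is known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Question 1: x-bar = 85, sigma = 8, n = 64, 95% confidence.
lower, upper = z_confidence_interval(85, 8, 64, confidence=0.95)
print(f"({lower:.2f}, {upper:.2f})")  # prints (83.04, 86.96)
```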

2. If x̄ = 125, σ = 24, and n = 36, set up a 99% confidence interval estimate of the population mean μ.

Using the given equation and plugging in for the appropriate values, the following results:

125 – 2.58(24 / sqrt(36)) < μ < 125 + 2.58(24 / sqrt(36))

125 – 10.32 < μ < 125 + 10.32

114.68 < μ < 135.32

The 99% confidence interval with these values is (114.68, 135.32), so we can be 99% confident that the population mean lies between 114.68 and 135.32.
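The z-score of 2.58 is a rounded table value. As a rough check of how much the rounding matters, the sketch below compares it with the value computed by the standard-library NormalDist (about 2.5758). The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

confidence = 0.99
z_exact = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.5758
z_table = 2.58  # rounded value used in the worked answer

# Question 2: x-bar = 125, sigma = 24, n = 36.
standard_error = 24 / sqrt(36)  # sigma / sqrt(n) = 4.0

for label, z in (("table", z_table), ("exact", z_exact)):
    margin = z * standard_error
    print(f"{label}: z = {z:.4f}, "
          f"interval = ({125 - margin:.2f}, {125 + margin:.2f})")
```

With the computed critical value the endpoints move by less than 0.02, so the table value is adequate at this precision.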

3. The manager of a supply store wants to estimate the actual amount of paint contained in 1-gallon cans purchased from a nationally known manufacturer. It is known from the manufacturer’s specification sheet that the standard deviation of the amount of paint is equal to 0.02 gallon. A random sample of 50 cans is selected, and the sample mean amount of paint per 1-gallon can is 0.99 gallon.

3a. Set up a 99% confidence interval estimate of the true population mean amount of paint contained in a 1-gallon can.

The sample mean x̄ is 0.99 gallon, σ is 0.02 gallon, and n is 50. For a 99% confidence interval we use a z-score of 2.58:

0.99 – 2.58(0.02 / sqrt(50)) < μ < 0.99 + 2.58(0.02 / sqrt(50))

0.99 – 0.0073 < μ < 0.99 + 0.0073

0.9827 < μ < 0.9973

The 99% confidence interval with these values is (0.9827, 0.9973), so we can be 99% confident that the population mean lies between 0.9827 and 0.9973 gallon.

3b. On the basis of your results, do you think that the manager has a right to complain to the manufacturer? Why?

The manager has grounds to complain to the manufacturer that the paint cans are underfilled. The 99% confidence interval (0.9827, 0.9973) lies entirely below 1 gallon, so the data suggest that the average can holds less than the labeled amount. However, I would recommend more trials and a larger sample size. There is also a chance that the manager got a bad batch of cans that were slightly underfilled while other cans from that manufacturer had been filled properly.
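One way to make the comparison concrete is to check whether the labeled amount of 1 gallon falls inside the interval from 3a. The helper name interval_contains below is hypothetical, and the sketch uses the same rounded z-score of 2.58 as the worked answer.

```python
from math import sqrt


def interval_contains(value, sample_mean, sigma, n, z):
    """Return (contained?, (lower, upper)) for a z-based interval."""
    margin = z * sigma / sqrt(n)
    lower, upper = sample_mean - margin, sample_mean + margin
    return lower <= value <= upper, (lower, upper)


# Question 3: x-bar = 0.99, sigma = 0.02, n = 50, z = 2.58 (99% confidence).
contains, (lower, upper) = interval_contains(1.0, 0.99, 0.02, 50, z=2.58)
print(f"interval = ({lower:.4f}, {upper:.4f}), contains 1.0: {contains}")
# prints interval = (0.9827, 0.9973), contains 1.0: False
```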

Part 2. Confidence Interval Estimation for the mean when σ is unknown

4. A stationery store wants to estimate the mean retail value of greeting cards that it has in its inventory. A random sample of 20 greeting cards indicates an average value of $1.67 and a standard deviation of $0.32.

4a. Assuming a normal distribution, set up a 95% confidence interval estimate of the mean value of all greeting cards stored in the store’s inventory.

Since the population standard deviation is unknown in this case, we use the sample standard deviation in its place. Because this adds uncertainty and the sample is small, the critical value comes from the t-distribution with n – 1 = 19 degrees of freedom rather than from the normal distribution. So the values to be used are a sample mean of $1.67, a sample standard deviation of $0.32, and an n of 20. For a 95% confidence interval with 19 degrees of freedom we use a t-value of 2.093. The equation is otherwise the same as before, with s in place of σ and t in place of Z.

1.67 – 2.093(0.32 / sqrt(20)) < μ < 1.67 + 2.093(0.32 / sqrt(20))

1.67 – 0.15 < μ < 1.67 + 0.15

1.52 < μ < 1.82

The 95% confidence interval with these values is (1.52, 1.82), so we can be 95% confident that the mean value of all the greeting cards in the stationery store’s inventory is between $1.52 and $1.82.
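To see how much the choice of critical value matters at n = 20, this sketch compares a z-based margin with a t-based margin for the same sample. Both critical values are hardcoded from standard tables as an assumption (1.96 for the normal distribution, 2.093 for Student's t with 19 degrees of freedom); nothing here computes them.

```python
from math import sqrt

# Question 4: sample mean, sample standard deviation, sample size.
sample_mean, s, n = 1.67, 0.32, 20
standard_error = s / sqrt(n)

# Critical values taken from standard tables (assumed, not computed here).
z_crit = 1.96    # normal distribution, 95% two-sided
t_crit = 2.093   # Student's t, 95% two-sided, df = n - 1 = 19

for label, crit in (("z", z_crit), ("t", t_crit)):
    margin = crit * standard_error
    print(f"{label}: margin = {margin:.2f}, "
          f"interval = ({sample_mean - margin:.2f}, {sample_mean + margin:.2f})")
# prints
# z: margin = 0.14, interval = (1.53, 1.81)
# t: margin = 0.15, interval = (1.52, 1.82)
```

The t-based interval is slightly wider, which reflects the extra uncertainty from estimating the standard deviation with a small sample.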

4b. How might the result obtained in (a) be useful in assisting the store owner to estimate the mean value of all greeting cards in the store’s inventory?

The result obtained in (a) gives the store owner a range in which the true mean value of all greeting cards in the inventory is likely to fall. In most cases a census of the whole population is too difficult, so a sample must be taken instead. Because only a sample was taken, population parameters such as the population standard deviation are not available and must be estimated from the sample. If the store owner wants a quick answer for the mean value of all greeting cards in the inventory, the interval from this sample provides a good enough one. For a more complete answer, more samples can be taken and their intervals compared. If the intervals overlap, there is more reason to believe that the population mean lies within them.

Part 3. Determining sample size 

Equation 2: n = ((Z * σ) / E)^2
Formula for finding the minimum sample size for a given standard deviation, confidence level, and acceptable error.
Pulled from assignment page

Outline for the sample size formula:
Z = Z value (e.g. 1.96 for 95% confidence level)
σ = standard deviation, can be found through small samples or assumed
E = acceptable sampling error (margin of error)

5. If you want to be 95% confident of estimating the population mean to within a sampling error of ± 5, and the standard deviation is assumed to be equal to 15, what sample size is required?

n = ((1.96 * 15) / 5)^2

n = (29.4 / 5)^2

n = 5.88^2 = 34.57

Because a sample size must be a whole number and rounding down would not meet the error requirement, we round up: a sample size of 35 is required.
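The same arithmetic can be sketched in a few lines. The function name required_sample_size is made up for illustration. The result is rounded up with ceil because rounding down would give a margin of error larger than E.

```python
from math import ceil


def required_sample_size(z, sigma, error):
    """Return (n rounded up, unrounded value) for n = ((z * sigma) / error)^2."""
    raw = (z * sigma / error) ** 2
    return ceil(raw), raw


# Question 5: z = 1.96 (95% confidence), sigma = 15, E = 5.
n, raw = required_sample_size(1.96, 15, 5)
print(f"raw = {raw:.2f}, n = {n}")  # prints raw = 34.57, n = 35
```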

Part 4. Hypothesis Statement 

6. Generate your own null and alternative hypothesis statements and provide rationale for your selection.

My null hypothesis (H0) is that an increase in studying has no effect on the probability of me passing a course.

My alternative hypothesis (H1) is that an increase in studying increases the probability of me passing a course.

The null hypothesis is the default position of no effect: passing is unrelated to how much I study. The alternative hypothesis is the claim I actually want to find evidence for, that more studying improves my chance of passing. A test then asks whether the data are strong enough to reject the null hypothesis in favor of the alternative.