LIS 4273 Adv Statistics: Module 6 Sampling and Confidence Interval Estimation

In this week’s blog post I attempt to answer the questions my professor assigned on sampling and confidence interval estimation. The R code I used is linked at the end of the article.

Question 1

A publishing company has just published a new textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 25 comparable textbooks and collected information on their prices. This information produced a mean price of $145 for this sample. It is known that the standard deviation of the prices of all such textbooks is $35 and the population of such prices is normal.

A. What is the point estimate of the mean price of all such textbooks?

The sample size is 25, the sample mean is $145, and the population standard deviation is $35. The point estimate of the mean price of all such textbooks is the sample mean, $145. The standard deviation is not part of the point estimate; it is used later to measure how precise that estimate is.

B. Construct a 90% confidence interval for the mean price of all such textbooks.

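Our population mean is unknown, but the population standard deviation is known and the population is normal, so a z-interval applies. With a 90% confidence level, our alpha is 0.10, and alpha divided by two is 0.05. The critical value is therefore qnorm(1 - 0.05), that is, qnorm(0.95). Multiplying it by the standard error, std_dev / sqrt(n), gives the margin of error, which is subtracted from and added to the sample mean to get the 90% confidence interval. This is summarized in the following R code.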

# Construct a 90% confidence interval for the population mean
# Givens
# Assume the population is normal with a known standard deviation
n <- 25
sample_mean <- 145
std_dev <- 35

margin_err <- qnorm(0.95) * (std_dev / sqrt(n))

sample_mean - margin_err
sample_mean + margin_err

The lower and upper bounds of the interval from these calculations are $133.486 and $156.514 respectively. From this we can say with 90% confidence that the average price of all such textbooks is between $133.49 and $156.51.
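To get a feel for what "90% confidence" means, here is a small simulation sketch. It assumes, purely for illustration, that the true mean price really is $145 (in practice this value is unknown). It draws many samples of 25 prices from a normal population with a standard deviation of $35 and counts how often the interval built as above captures that mean. The names true_mean, n_sims, covered, and coverage are my own, and the seed is arbitrary.

```r
# Coverage check for the 90% z-interval (illustrative simulation only)
set.seed(42)            # arbitrary seed so the run is reproducible
true_mean <- 145        # ASSUMPTION: pretend the population mean is known
std_dev <- 35           # known population standard deviation
n <- 25                 # sample size
n_sims <- 10000         # number of simulated samples

margin_err <- qnorm(0.95) * (std_dev / sqrt(n))

# For each simulated sample, check whether its interval contains true_mean
covered <- replicate(n_sims, {
  sample_mean <- mean(rnorm(n, mean = true_mean, sd = std_dev))
  (sample_mean - margin_err <= true_mean) && (true_mean <= sample_mean + margin_err)
})

coverage <- mean(covered)
coverage   # proportion of intervals that captured the true mean
```

The proportion should land close to 0.90. That is the sense in which this is a 90% interval: the procedure captures the true mean in about 90% of repeated samples.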

Question 2

According to Mobes Services Inc., an individual checking his/her account at major U.S. banks via cellphone costs the banks between $350 and $650. A recent random sample of 600 such checking accounts produced a mean annual cost of $500 to major U.S. banks. Assume that the standard deviation of annual costs to major U.S. banks of all such checking accounts is $40. Make a 99% confidence interval for the current mean annual cost to major banks of all such checking accounts.

Here a large sample of 600 is given, with a sample mean of $500 and a known standard deviation of $40. With a 99% confidence level, alpha is 0.01, so alpha over 2 is 0.005. So for the qnorm function, I plug in 0.995. This is used in essentially the same R script as before, outlined below.

# Construct a 99% confidence interval for the population mean
# Givens
# Assume a known population standard deviation
n <- 600
sample_mean <- 500
std_dev <- 40

margin_err <- qnorm(0.995) * (std_dev / sqrt(n))

sample_mean - margin_err
sample_mean + margin_err

The lower and upper bounds output by this script are $495.794 and $504.206 respectively. Thus we can say with 99% confidence that the mean annual cost of such checking accounts to major U.S. banks is between $495.79 and $504.21.
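Apart from the givens, the only thing that changes between the two questions is the critical value. As a quick sketch, the conversion from a confidence level to the probability passed to qnorm() can be tabulated for a few common levels. The names conf_levels, alpha, and z_crit are my own.

```r
# Critical z values for a few common confidence levels
conf_levels <- c(0.90, 0.95, 0.99)
alpha <- 1 - conf_levels
z_crit <- qnorm(1 - alpha / 2)   # cutoff that leaves alpha/2 in the upper tail

round(data.frame(confidence = conf_levels, alpha = alpha, z = z_crit), 3)
```

The critical values come out to roughly 1.645, 1.960, and 2.576, so a higher confidence level gives a wider interval for the same data.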

Extra stuff for fun

Being the coder that I am, for giggles I have gone ahead and generalized the code above into a function that returns the interval as a vector for any input values. This function assumes that the population standard deviation is known and that the population is normal. A different function would have to be written for the other cases.

confidence_interval <- function(confidence, n, mean, std_dev) {
  # Assumes the population is normal and its standard deviation is known
  # Returns a vector with the low and high bounds of the confidence interval
  # confidence = confidence level as a proportion, e.g. 0.90
  # n = sample size
  # mean = sample mean
  # std_dev = population standard deviation
  
  margin_err <- qnorm(1 - ((1 - confidence) / 2)) * (std_dev / sqrt(n))
  
  low <- mean - margin_err
  high <- mean + margin_err
  
  return(c(low, high))
}
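As a quick check, the function can be called with the givens from both questions. The block below repeats the function definition in a slightly shortened form so that it runs on its own; the variable names q1 and q2 are my own.

```r
# Same calculation as the function above, repeated so this block is standalone
confidence_interval <- function(confidence, n, mean, std_dev) {
  margin_err <- qnorm(1 - ((1 - confidence) / 2)) * (std_dev / sqrt(n))
  c(mean - margin_err, mean + margin_err)
}

# Question 1: 90% interval, n = 25, sample mean 145, standard deviation 35
q1 <- confidence_interval(0.90, 25, 145, 35)
round(q1, 2)

# Question 2: 99% interval, n = 600, sample mean 500, standard deviation 40
q2 <- confidence_interval(0.99, 600, 500, 40)
round(q2, 2)
```

These calls should reproduce the bounds found earlier: about $133.49 to $156.51 for Question 1 and $495.79 to $504.21 for Question 2.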

Links

GitHub: https://github.com/SimonLiles/LIS4273AdvStatistics/blob/master/LIS4273Mod6.R