Below are practice problems tagged for Lecture 19 (rendered directly from the original exam/quiz sources).
You order 25 large pizzas from your local pizzeria. The pizzeria claims that these pizzas are 16 inches in diameter, but you’re not so sure. You measure each pizza’s diameter and collect a dataset of 25 actual pizza diameters. You want to run a hypothesis test to determine whether the pizzeria’s claim is accurate.
What would your Null Hypothesis be?
What would your Alternative Hypothesis be?
Answer:
Null Hypothesis: The mean pizza diameter at the local pizzeria is 16 inches.
Alternative Hypothesis: The mean pizza diameter at the local pizzeria is not 16 inches.
The null hypothesis states that there is no significant difference from some claim. In this case, the claim of interest is the pizzeria’s statement that its pizzas are 16 inches in diameter.
The alternative hypothesis is a statement that contradicts the null hypothesis. In this case, that statement is that the mean pizza diameter at the local pizzeria is not 16 inches (i.e., the other option to the null hypothesis).
The average score on this problem was 35%.
What test statistic would you use?
Answer: Mean Pizza Diameter, or other valid statistics such as the (absolute) difference between each pizza’s diameter and 16 (expected value)
Looking at the null and alternative hypotheses, we can see that we are directly interested in the mean pizza diameter, so it is most likely the best choice of test statistic. The main idea is that the statistic should capture how far the observed pizza diameters are from the claimed 16 inches.
The average score on this problem was 70%.
Explain how you would do your hypothesis test and how you would draw a conclusion from your results.
Answer: Answers vary, but should include the following:
Generate confidence interval for population mean (or equivalent) by bootstrapping (or by calculating directly with the sample).
Correctly describe how to reject or fail to reject the null hypothesis (depending on whether interval contains 16, for example).
To conduct the hypothesis test, we first construct a 95% confidence interval for the true mean pizza diameter, either by bootstrapping or by calculating it directly from the sample. The next step is to define the rejection criterion: we fail to reject the null hypothesis if 16 is within the 95% confidence interval, since we believe with 95% confidence that the true population mean lies within this range, and we reject the null hypothesis if 16 is not within the interval. Note that this assumes you used the mean pizza diameter as your test statistic.
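The procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the 25 diameters are made-up values (the problem gives no data), and the helper name bootstrap_mean_ci and its parameters are hypothetical, not part of the original exam.

```python
import numpy as np

def bootstrap_mean_ci(sample, n_reps=10_000, level=95, seed=0):
    """Percentile bootstrap confidence interval for the population mean."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    boot_means = np.empty(n_reps)
    for i in range(n_reps):
        # Resample with replacement, using the same size as the original sample.
        resample = rng.choice(sample, size=len(sample), replace=True)
        boot_means[i] = resample.mean()
    tail = (100 - level) / 2
    return np.percentile(boot_means, tail), np.percentile(boot_means, 100 - tail)

# Hypothetical diameters in inches, for illustration only (not real data).
diameters = np.array([15.6, 15.9, 16.1, 15.4, 15.8, 16.0, 15.7, 15.5, 15.9, 16.2,
                      15.3, 15.8, 15.6, 16.0, 15.7, 15.9, 15.5, 15.8, 16.1, 15.6,
                      15.4, 15.7, 15.9, 15.8, 15.6])

left, right = bootstrap_mean_ci(diameters)
# Reject the null only if the claimed value 16 falls outside the interval.
reject_null = not (left <= 16 <= right)
print(left, right, reject_null)
```

With these made-up values the interval sits below 16, so the sketch rejects the null; a different sample could just as easily fail to reject it.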
The average score on this problem was 38%.
In our sample, we have data on 1210 medals for the sport of gymnastics. Of these, 126 were awarded to American gymnasts, 119 were awarded to Romanian gymnasts, and the remaining 965 were awarded to gymnasts from other nations.
We want to do a hypothesis test with the following hypotheses.
Null: American and Romanian gymnasts win an equal share of Olympic gymnastics medals.
Alternative: American gymnasts win more Olympic gymnastics medals than Romanian gymnasts.
Which test statistic could we use to test these hypotheses?
total variation distance between the distribution of medals by country and the uniform distribution
proportion of medals won by American gymnasts
difference in the number of medals won by American gymnasts and the number of medals won by Romanian gymnasts
absolute difference in proportion of medals won by American gymnasts and proportion of medals won by Romanian gymnasts
Answer: difference in the number of medals won by American gymnasts and the number of medals won by Romanian gymnasts
To test this pair of hypotheses, we need a test statistic that is large when the data suggests that we reject the null hypothesis, and small when we fail to reject the null. Now let’s look at each option:
- Option 1: Total variation distance across all the countries won’t tell us about the difference in medals between Americans and Romanians. It only tells us how different the proportions across all countries are from the uniform distribution, in which every country wins the same proportion of medals.
- Option 2: This test statistic doesn’t take into account the number of medals Romanians won. Imagine a situation where Romanians won half of all the medals and Americans won the other half, and no other country won any medals. Here, the two countries won the same number of medals and the test statistic would be 1/2. Now imagine that Americans won half the medals, some other country won the other half, and Romanians won no medals. In this case, the Americans won a lot more medals than the Romanians, but the test statistic is still 1/2. A good test statistic should point to one hypothesis when it’s large and the other hypothesis when it’s small. With this test statistic, 1/2 points to both hypotheses, making it a bad test statistic.
- Option 3: With this test statistic, when Americans win the same number of medals as Romanians, the test statistic is 0, a very small number. When Americans win way more medals than Romanians, the test statistic is large, suggesting that we reject the null hypothesis in favor of the alternative. You might notice that when Romanians win way more medals than Americans, the test statistic is negative, suggesting that we fail to reject the null hypothesis that they won equal medals. But recall that failing to reject the null doesn’t necessarily mean we think the null is true; it just means that, given our null and alternative hypotheses, the null is plausible. The important thing is that the test statistic points to the alternative hypothesis when it’s large, and points to the null hypothesis when it’s small. This test statistic does just that, so Option 3 is the correct answer.
- Option 4: Since this statistic is an absolute value, large values signify a large difference between the two proportions, while small values signify a small difference. However, it doesn’t tell us which country wins more, because a large value could mean that either country has a higher proportion of medals than the other. This test statistic would only be helpful if our alternative hypothesis were that the numbers of medals the Americans and Romanians win are different.
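The scenarios in the option-by-option discussion can be checked numerically. The sketch below is an illustration only: the function name candidate_statistics and the three 100-medal tables are hypothetical, chosen to mirror the thought experiments in the explanation.

```python
def candidate_statistics(us, romania, other):
    """Compute the Option 2, 3, and 4 statistics for one medal table."""
    total = us + romania + other
    return {
        "american_proportion": us / total,                          # Option 2
        "medal_difference": us - romania,                           # Option 3
        "abs_prop_difference": abs(us / total - romania / total),   # Option 4
    }

# Hypothetical medal tables (US, Romania, other), 100 medals each.
tied = candidate_statistics(50, 50, 0)            # US and Romania split everything
us_ahead = candidate_statistics(50, 0, 50)        # US wins half, Romania wins none
romania_ahead = candidate_statistics(0, 50, 50)   # Romania wins half, US wins none

# The observed data from the problem statement.
observed = candidate_statistics(126, 119, 965)

for name, stats in [("tied", tied), ("us_ahead", us_ahead),
                    ("romania_ahead", romania_ahead), ("observed", observed)]:
    print(name, stats)
```

Option 2 gives the same value for the tied and us_ahead tables, and Option 4 gives the same value for us_ahead and romania_ahead, while Option 3 separates all three cases.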
The average score on this problem was 73%.
Below are four different ways of testing these hypotheses. In each
case, fill in the calculation of the observed statistic in the variable
observed, such that p_val represents the
p-value of the hypothesis test.
Way 1:
many_stats = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(245, [0.5, 0.5]) / 245
    many_stats = np.append(many_stats, result[0] - result[1])
observed = __(a)__
p_val = np.count_nonzero(many_stats >= observed)/len(many_stats)
Way 2:
many_stats = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(245, [0.5, 0.5]) / 245
    many_stats = np.append(many_stats, result[0] - result[1])
observed = __(b)__
p_val = np.count_nonzero(many_stats <= observed)/len(many_stats)
Way 3:
many_stats = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(245, [0.5, 0.5]) / 245
    many_stats = np.append(many_stats, result[0])
observed = __(c)__
p_val = np.count_nonzero(many_stats >= observed)/len(many_stats)
Way 4:
many_stats = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(245, [0.5, 0.5]) / 245
    many_stats = np.append(many_stats, result[0])
observed = __(d)__
p_val = np.count_nonzero(many_stats <= observed)/len(many_stats)
Answer: Way 1: 126/245 - 119/245 or 7/245
First, let’s look at what this code is doing. The call np.random.multinomial(245, [0.5, 0.5]) makes an array of length 2, whose two elements are the simulated numbers of medals, out of 245 total, won by American gymnasts and Romanian gymnasts respectively. We then divide this array by 245 to turn the counts into proportions out of 245 (which is the sum 126 + 119). This array of proportions is then assigned to result. For example, one of our 10000 repetitions could assign np.array([124/245, 121/245]) to result.
The following line, many_stats = np.append(many_stats, result[0] - result[1]), appends the difference between the first proportion in result and the second proportion in result to many_stats. Using our example, this would append 124/245 - 121/245 (which equals 3/245) to many_stats.
To determine how we calculate the observed statistic, we have to consider how we are calculating the p-value. In order to calculate the p-value, we need to determine how frequent it is to see a result as extreme as our observed statistic, or more extreme in the direction of the alternative hypothesis. The alternative hypothesis states that American gymnasts win more medals than Romanian gymnasts, meaning that we are looking for results in many_stats where the difference is equal to or greater than (more extreme than) 126/245 - 119/245 (which equals 7/245). That is, the final line of code in Way 1 uses np.count_nonzero to count the differences in many_stats that are greater than or equal to 7/245. Therefore, observed must equal 7/245.
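For reference, here is Way 1 with blank (a) filled in, as a runnable sketch. The fixed seed and the n_medals variable are additions made here for reproducibility and readability; everything else follows the code in the question.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the sketch is reproducible

n_medals = 126 + 119  # medals won by American or Romanian gymnasts
many_stats = np.array([])
for i in np.arange(10000):
    # Simulate how 245 medals split between the two countries under the null.
    result = np.random.multinomial(n_medals, [0.5, 0.5]) / n_medals
    many_stats = np.append(many_stats, result[0] - result[1])

# Blank (a): written as a difference of proportions, the same way the
# simulated statistics are computed, so the >= comparison is consistent.
observed = 126 / 245 - 119 / 245
p_val = np.count_nonzero(many_stats >= observed) / len(many_stats)
print(p_val)
```

Because the p-value comes from a random simulation, its exact value depends on the seed.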
The average score on this problem was 56%.
Answer: Way 2: 119/245 - 126/245 or -7/245
The only difference between Way 2 and Way 1 is that in Way 2, the >= is switched to a <=. This causes a result as extreme or more extreme than our observed statistic to now be represented as anything less than or equal to our observed statistic. To account for this, we need to consider the first proportion in result as the proportion of medals won by Romanian gymnasts, and the second proportion as the proportion of medals won by American gymnasts. This flips the sign of every difference. So instead of calculating our observed statistic as 126/245 - 119/245, we now calculate it as 119/245 - 126/245 (which equals -7/245).
The average score on this problem was 53%.
Answer: Way 3: 126/245
The difference between Way 1 and Way 3 is that Way 3 uses result[0] as its test statistic instead of result[0] - result[1]. Here, result[0] represents the proportion of the 245 medals won by American gymnasts. This means that the observed statistic should be the proportion of medals won by American gymnasts in the given sample: (# of American medals) / (# of American medals + # of Romanian medals) = 126/245.
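One way to see that Way 3 measures the same thing as Way 1 is to note that the two entries of result always sum to 1. The short derivation below uses p_A for result[0] and p_R for result[1]; this notation is introduced here for illustration and does not appear in the original problem.

```latex
\begin{align*}
p_A + p_R &= 1 \\
p_A - p_R &= p_A - (1 - p_A) = 2p_A - 1 \\
p_A - p_R \ge \tfrac{7}{245}
  &\iff 2p_A - 1 \ge \tfrac{7}{245}
  \iff p_A \ge \tfrac{126}{245}
\end{align*}
```

So a simulated difference is at least 7/245 exactly when the simulated American proportion is at least 126/245.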
The average score on this problem was 52%.
Answer: Way 4: 119/245
Since the comparison is swapped from >= in Way 3 to <= in Way 4, result[0] now represents the proportion of medals won by Romanian gymnasts. The alternative hypothesis states that American gymnasts win more medals than Romanian gymnasts, so results in the direction of the alternative are those where the Romanian proportion is small. The observed statistic is therefore 119/245.
The average score on this problem was 50%.
The four p-values calculated in Ways 1 through 4 are:
exactly the same
similar, but not necessarily exactly the same
not necessarily similar
Answer: similar, but not necessarily exactly the same
The four ways use different test statistics and different comparison directions, but they all test the same null and alternative hypotheses, so they lead to the same conclusion. Each p-value is estimated from its own random simulation, so the four p-values will be similar but will not necessarily match exactly.
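To see this in practice, the four ways can be run side by side. The sketch below is a vectorized variant rather than the exam's loop-based code: it draws all 10000 repetitions at once with a seeded NumPy Generator, and the function name four_p_values is hypothetical.

```python
import numpy as np

def four_p_values(seed, reps=10_000):
    """Estimate the p-value four ways, each from its own simulation."""
    rng = np.random.default_rng(seed)

    def simulate():
        # One row per repetition: simulated proportions for the two countries.
        return rng.multinomial(245, [0.5, 0.5], size=reps) / 245

    r1, r2, r3, r4 = simulate(), simulate(), simulate(), simulate()
    # np.mean of a boolean array is the fraction of True values, which is
    # equivalent to np.count_nonzero(...) / len(...).
    return {
        "way1": np.mean(r1[:, 0] - r1[:, 1] >= 126 / 245 - 119 / 245),
        "way2": np.mean(r2[:, 0] - r2[:, 1] <= 119 / 245 - 126 / 245),
        "way3": np.mean(r3[:, 0] >= 126 / 245),
        "way4": np.mean(r4[:, 0] <= 119 / 245),
    }

p_values = four_p_values(seed=0)
for way, p in p_values.items():
    print(way, round(float(p), 3))
```

Each way uses a separate batch of random draws, so the printed values should be close to one another without being identical.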
The average score on this problem was 71%.