Fall 2024 Quiz 4

This quiz was administered in-person. It was closed-book and closed-note; students were not allowed to use the DSC 10 Reference Sheet. Students had 20 minutes to work on the quiz.

This quiz covered Lectures 13, 15-18 of the Fall 2024 offering of DSC 10.

Problem 1

We plan to collect a sample of movies and use this sample to estimate the proportion of all movies with the genre "Musical", a population parameter.

Problem 1.1

If we want to create a 95% confidence interval that is at most 0.08 wide, which of the expressions below represents the smallest sample size we should collect?

\left(\dfrac{1}{0.02}\right)^2
\left(\dfrac{1}{0.04}\right)^2
\left(\dfrac{1}{0.08}\right)^2
\left(\dfrac{1}{0.16}\right)^2

Answer: \left(\dfrac{1}{0.04}\right)^2
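One way to arrive at this answer, sketched under the usual assumptions that a 95% confidence interval is about 4 standard deviations of the sample proportion wide and that the SD of a 0/1 population is at most 0.5: the width is at most 4 \cdot 0.5 / \sqrt{n} = 2 / \sqrt{n}, and requiring this to be at most 0.08 gives \sqrt{n} \geq 1/0.04. The helper name `min_sample_size` below is hypothetical and is used only for illustration.

```python
import math

def min_sample_size(max_width, max_sd=0.5):
    """Smallest n such that 4 * max_sd / sqrt(n) <= max_width.

    Assumes a 95% CI spans about 4 SDs of the sample proportion and
    that the population SD of a 0/1 variable is at most 0.5.
    """
    raw = (4 * max_sd / max_width) ** 2
    # Round first to guard against tiny floating-point overshoot before ceil.
    return math.ceil(round(raw, 6))

n = min_sample_size(0.08)
print(n)                 # same value as (1 / 0.04) ** 2
```

Using the worst-case SD of 0.5 is what makes the bound hold no matter what the true proportion of "Musical" movies turns out to be.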

Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 41%.

Problem 1.2

Let W represent the maximum width of a 95% confidence interval obtained from a sample that is twice as big as the sample size you found in Problem 1.1. Which of the following is true?

0 < W < 0.04
W = 0.04
0.04 < W < 0.08
W \geq 0.08

Answer: 0.04 < W < 0.08
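A quick numerical check of this answer, assuming the same worst-case SD of 0.5 and the "4 SDs wide" rule of thumb for a 95% interval. Because the width is proportional to 1 / \sqrt{n}, doubling the sample size divides the maximum width by \sqrt{2} rather than by 2. The function name `max_width` is hypothetical.

```python
import math

n = 625        # sample size from Problem 1.1, i.e. (1 / 0.04) ** 2
max_sd = 0.5   # largest possible SD of a 0/1 population

def max_width(sample_size):
    # A 95% CI spans about 4 SDs of the sample proportion.
    return 4 * max_sd / math.sqrt(sample_size)

W = max_width(2 * n)
print(round(W, 4))   # roughly 0.08 / sqrt(2)
```

To halve the width to 0.04, the sample would need to be four times as large, not twice as large.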

Difficulty: ⭐️⭐️⭐️

The average score on this problem was 56%.

Problem 2

Consider the following pair of hypotheses:

Null: The proportion of "Comedy" movies with an average rating above 9 equals the proportion of "Action" movies with a rating above 9.
Alternative: The proportion of "Comedy" movies with an average rating above 9 is greater than the proportion of "Action" movies with a rating above 9.

Let C_9 be the proportion of "Comedy" movies with a rating above 9 and let A_9 be the proportion of "Action" movies with a rating above 9. Which of the following are valid test statistics to test these hypotheses? Select all that apply.

Answer: C_9 - A_9 and A_9 - C_9

Difficulty: ⭐️⭐️

The average score on this problem was 82%.

Problem 3

The table below shows how many movies in each of four genres Jack and Eric have seen. What is the total variation distance (TVD) between Jack and Eric’s distribution of movies by genre?

	`"Musical"`	`"Comedy"`	`"Action"`	`"Horror"`	Total
Jack	2	14	2	2	20
Eric	55	15	15	15	100

Answer: 0.55
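The calculation can be sketched as follows, assuming the counts are first converted to proportions (Jack's out of 20, Eric's out of 100) and that the TVD is half the sum of the absolute differences between the two distributions. The variable names are chosen for illustration only.

```python
import numpy as np

# Counts in the order Musical, Comedy, Action, Horror (from the table above).
jack_counts = np.array([2, 14, 2, 2])
eric_counts = np.array([55, 15, 15, 15])

# Convert counts to proportions so the two distributions are comparable.
jack_dist = jack_counts / jack_counts.sum()   # 0.1, 0.7, 0.1, 0.1
eric_dist = eric_counts / eric_counts.sum()   # 0.55, 0.15, 0.15, 0.15

# TVD: half the sum of the absolute differences between the proportions.
tvd = np.abs(jack_dist - eric_dist).sum() / 2
print(round(tvd, 2))
```

A common mistake is to compare the raw counts directly; since Jack and Eric have seen different numbers of movies, the counts must be normalized first.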

Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 25%.

Problem 4

Suppose that in movies, the average rating of "Action" movies is 7.4 and the average rating of "Horror" movies is 7.2. Based on this data, we decide to test the following hypotheses:

Null: The ratings of "Action" and "Horror" movies come from the same distribution.
Alternative: On average, "Action" movies have a higher rating than "Horror" movies.

We’ll use as our test statistic the mean rating of "Action" movies minus the mean rating of "Horror" movies.

Problem 4.1

Fill in the blanks so the code below generates 5000 simulated values of this test statistic and calculates the p-value of our test.

def one_stat(df):
    group_means = df.groupby("New").mean().get("Rating")
    return group_means.loc[__(a)__] - group_means.loc[__(b)__]

action_horror = movies[(movies.get("Genre") == "Action") | 
                       (movies.get("Genre") == "Horror")]
diffs = np.array([])
for i in np.arange(5000):
    new_df = action_horror.assign(New = __(c)__)
    diffs = np.append(diffs, __(d)__)

p_value = np.count_nonzero( __(e)__ ) / 5000

(a): "Action"
(b): "Horror"
(c): np.random.permutation(action_horror.get("Genre"))
(d): one_stat(new_df)
(e): diffs >= 0.2
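With the blanks filled in, the whole procedure can be sketched end to end. This sketch makes several assumptions that are not part of the quiz: it uses plain pandas rather than the course's DataFrame library, the `movies` table is a small made-up stand-in (not real data), the observed statistic is computed from that made-up table instead of being the 0.2 given in the problem, and the number of repetitions is reduced to 1000 to keep it quick.

```python
import numpy as np
import pandas as pd

np.random.seed(10)  # only so the sketch is reproducible

# Hypothetical stand-in for `movies`: 60 made-up rows, not real data.
movies = pd.DataFrame({
    "Genre": ["Action"] * 20 + ["Horror"] * 20 + ["Comedy"] * 20,
    "Rating": np.round(np.random.uniform(5, 10, 60), 1),
})

def one_stat(df):
    # Mean "Rating" per shuffled label; selecting the column first
    # keeps pandas from trying to average the text column "Genre".
    group_means = df.groupby("New")["Rating"].mean()
    return group_means.loc["Action"] - group_means.loc["Horror"]

# Keep only the two genres being compared.
action_horror = movies[(movies.get("Genre") == "Action") |
                       (movies.get("Genre") == "Horror")]

# Observed statistic, using the true (unshuffled) genre labels.
observed = one_stat(action_horror.assign(New=action_horror.get("Genre")))

repetitions = 1000  # the quiz uses 5000
diffs = np.array([])
for i in np.arange(repetitions):
    # Shuffle the genre labels, which simulates the null hypothesis.
    shuffled = np.random.permutation(action_horror.get("Genre"))
    new_df = action_horror.assign(New=shuffled)
    diffs = np.append(diffs, one_stat(new_df))

# The alternative says "Action" is higher, so large differences are extreme.
p_value = np.count_nonzero(diffs >= observed) / repetitions
print(p_value)
```

The comparison uses `>=` because the test statistic is "Action" minus "Horror" and the alternative hypothesis predicts a larger value; in the quiz, the observed value is 7.4 - 7.2 = 0.2.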

Difficulty: ⭐️⭐️⭐️

The average score on this problem was 74%.

Problem 4.2

Suppose that p_value evaluates to 0.14. Using the standard p-value cutoff of 0.05, which of the two hypotheses is better supported by the data?

Null
Alternative

Answer: Null

Difficulty: ⭐️⭐️

The average score on this problem was 84%.

Problem 4.3

What kind of hypothesis test did we perform in this question?

Standard hypothesis test
Permutation test

Answer: Permutation test

Difficulty: ⭐️⭐️

The average score on this problem was 84%.
