# Winter 2024 Quiz 4

This quiz was administered in person. It was closed-book and closed-note; students were not allowed to use the DSC 10 Reference Sheet. Students had 20 minutes to complete the quiz.

This quiz covered Lectures 13-17 of the Winter 2024 offering of DSC 10.

## Problem 1

Suppose we’ve imported the scipy module. To the nearest 0.5, what does the following expression evaluate to?

    scipy.stats.norm.cdf(-2) * 100

Answer: 2.5
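The expression gives the percentage of a standard normal distribution that lies more than 2 standard deviations below the mean. About 95% of the area lies within 2 standard deviations of the mean, leaving roughly 5% split evenly between the two tails. The sketch below checks this numerically; it uses `statistics.NormalDist` from the Python standard library as a stand-in for `scipy.stats.norm`, and the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal CDF at z = -2: the area to the left of 2 SDs below the mean.
# scipy.stats.norm.cdf(-2) computes the same quantity.
area_left = NormalDist(mu=0, sigma=1).cdf(-2)
percent = area_left * 100  # about 2.275

# Round to the nearest 0.5 by rounding twice the value to the nearest integer.
nearest_half = round(percent * 2) / 2

print(round(percent, 3), nearest_half)
```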

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 57%.

## Problem 2

Select all the true statements below.

• The average of the deviations from the mean is a meaningful measure of the spread of the data.

• It is possible for the standard deviation of a dataset to equal zero.

• It is possible for the standard deviation of a dataset to be negative.

• Given the standard deviation of a dataset, we can determine its mean.

• Given the standard deviation of a dataset, we can determine its variance.

Answer: Option 2 and Option 5
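A small numeric check can make each statement concrete. The sketch below uses made-up data and the standard library's `statistics` module; `pstdev` is the population standard deviation, which matches the default behavior of `np.std`.

```python
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
deviations = [x - mean(data) for x in data]

# Deviations from the mean always average to 0, so this says nothing about spread.
avg_deviation = mean(deviations)

sd = pstdev(data)          # a square root of an average of squares, so never negative
variance = pvariance(data)  # the variance is the SD squared

# A dataset whose values are all equal has no spread, so its SD is 0.
sd_constant = pstdev([3, 3, 3, 3])

# Shifting every value changes the mean but not the SD,
# so the SD alone cannot determine the mean.
shifted = [x + 10 for x in data]
sd_shifted = pstdev(shifted)

print(avg_deviation, sd, variance, sd_constant, mean(shifted), sd_shifted)
```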

##### Difficulty: ⭐️⭐️

The average score on this problem was 80%.

## Problem 3

The Oscars, or Academy Awards, are the highest awards in the film industry, awarded each year to the best movies of that year. The oscars DataFrame contains a row for each movie that has ever been nominated for an Oscar. The "name" column contains the name of the movie and the "rating" column contains a rating of the movie on a 0 to 100 scale. This number incorporates many factors, but we won’t worry about how it is computed.

### Problem 3.1

Fill in the blanks below to collect a simple random sample of 400 movies from the oscars DataFrame, then calculate 10,000 bootstrapped sample mean ratings.

    my_sample = __(x)__
    n_resamples = 10000
    boot_means = np.array([])
    for i in np.arange(n_resamples):
        resample = __(y)__
        mean = __(z)__
        boot_means = np.append(boot_means, mean)

Answer (x): oscars.sample(400)

##### Difficulty: ⭐️⭐️

The average score on this problem was 85%.

Answer (y): my_sample.sample(400, replace=True)

##### Difficulty: ⭐️⭐️

The average score on this problem was 87%.

Answer (z): resample.get("rating").mean()
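The three blanks can be seen working together in a self-contained sketch. It makes several assumptions: the ratings are made-up numbers rather than the real `oscars` data, and the standard library's `random.sample` and `random.choices` stand in for the DataFrame method `.sample` without and with `replace=True`.

```python
import random

random.seed(10)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for oscars.get("rating"): 5,000 made-up ratings in [0, 100].
population_ratings = [min(100, max(0, random.gauss(60, 15))) for _ in range(5000)]

# (x) Simple random sample of 400 movies, drawn without replacement.
my_sample = random.sample(population_ratings, 400)

n_resamples = 10000
boot_means = []
for _ in range(n_resamples):
    # (y) Resample from the sample, with replacement, at the original sample size.
    resample = random.choices(my_sample, k=400)
    # (z) The statistic: the mean rating of this resample.
    boot_means.append(sum(resample) / len(resample))

sample_mean = sum(my_sample) / len(my_sample)
center_of_boot_means = sum(boot_means) / len(boot_means)
print(sample_mean, center_of_boot_means)
```

The resample must be drawn with replacement and must have the same size as the original sample. Without replacement, every resample of size 400 would contain exactly the same 400 movies, so every bootstrapped mean would be identical.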

##### Difficulty: ⭐️

The average score on this problem was 96%.

### Problem 3.2

In each blank, circle the word that correctly fills in the sentence.

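A histogram of boot_means shows a(n) probability / empirical distribution of a statistic / parameter.

Answer: empirical; statistic. The values in boot_means come from simulation rather than theory, and each one is computed from a sample, not from the population.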

##### Difficulty: ⭐️⭐️

The average score on this problem was 77%.

### Problem 3.3

Suppose we use the array boot_means to calculate a 90% confidence interval for the mean rating of Oscar-nominated movies. Select all correct conclusions we can draw about this interval.

• There is a 90% chance that the true mean rating of all Oscar-nominated movies falls within this interval.

• The sample mean rating is within 90% of the true mean rating of all Oscar-nominated movies.

• If we looked at the ratings of many Oscar-nominated movies, about 90% of them would fall within this range.

• None of the above.

Answer: None of the above
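A bootstrapped confidence interval describes plausible values for the population mean, not a range that holds most individual ratings. The sketch below illustrates this with made-up ratings (not the real `oscars` data). The nearest-rank percentile rule is an illustrative stand-in for `np.percentile`, and the number of resamples is reduced to keep the example quick.

```python
import random

random.seed(10)  # arbitrary seed so the sketch is reproducible

# Hypothetical sample of 400 ratings (made-up data).
my_sample = [min(100, max(0, random.gauss(60, 15))) for _ in range(400)]
sample_mean = sum(my_sample) / len(my_sample)

boot_means = []
for _ in range(2000):  # fewer resamples than the quiz's 10,000
    resample = random.choices(my_sample, k=400)
    boot_means.append(sum(resample) / 400)

# Middle 90% of the bootstrapped means: cut off 5% in each tail.
ordered = sorted(boot_means)
left = ordered[int(0.05 * len(ordered))]
right = ordered[int(0.95 * len(ordered)) - 1]

# Fraction of individual ratings in the sample that land inside the interval.
inside = sum(left <= r <= right for r in my_sample) / len(my_sample)
print(left, right, inside)
```

The interval is only a few rating points wide, so far fewer than 90% of individual ratings fall inside it.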

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 74%.

### Problem 3.4

Suppose both of the following expressions evaluate to True.

• my_sample.get("rating").mean() == 61.25

• np.std(my_sample.get("rating")) == 15

What are the left and right endpoints of a 95% CLT-based confidence interval for the mean rating of Oscar-nominated movies?

Answer: left endpoint: 59.75, right endpoint: 62.75
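By the Central Limit Theorem, the distribution of the sample mean is roughly normal, centered near the sample mean, with standard deviation equal to the sample SD divided by the square root of the sample size. About 95% of a normal distribution lies within 2 standard deviations of its center. The arithmetic is sketched below, with variable names chosen for illustration and the sample size of 400 taken from Problem 3.1.

```python
import math

sample_mean = 61.25  # given in the problem
sample_sd = 15       # given in the problem
n = 400              # sample size from Problem 3.1

# SD of the distribution of the sample mean, per the CLT: 15 / 20 = 0.75.
sd_of_sample_mean = sample_sd / math.sqrt(n)

# Go 2 of those SDs to either side of the sample mean.
left = sample_mean - 2 * sd_of_sample_mean
right = sample_mean + 2 * sd_of_sample_mean

print(left, right)  # prints 59.75 62.75
```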

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 54%.