# Winter 2024 Quiz 5

This quiz was administered in-person. It was closed-book and closed-note; students were not allowed to use the DSC 10 Reference Sheet. Students had 20 minutes to work on the quiz.

This quiz covered Lectures 18-21 of the Winter 2024 offering of DSC 10.

## Problem 1

### Problem 1.1

Suppose you want to estimate the proportion of UCSD students that prefer sunny days over rainy days. You plan to survey 900 students, then construct a 95% confidence interval for this proportion. What is the widest possible width for the resulting confidence interval? Give your answer as a fully simplified fraction.

##### Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 38%.
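One way to reason about this is sketched below. It assumes the CLT-based 95% interval used in the course, which spans 2 standard deviations of the sample mean on each side of the sample proportion (a convention assumed here, in place of the more precise 1.96). The interval is widest when the sample is split evenly, because the standard deviation of a collection of 0s and 1s, sqrt(p * (1 - p)), peaks at p = 0.5. The variable names are illustrative only.

```python
from fractions import Fraction
import math

n = 900  # planned sample size

# SD of a 0/1 sample is sqrt(p * (1 - p)), which is largest (1/2) at p = 0.5.
max_sample_sd = Fraction(1, 2)

# Width of the interval = 4 * (sample SD) / sqrt(n), assuming 2 SDs per side.
max_width = 4 * max_sample_sd / math.isqrt(n)

print(max_width)  # 1/15
```

Under these assumptions, the widest possible interval has width 1/15.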

### Problem 1.2

If you decide to survey 450 students instead of 900 students for your sample, the maximum possible width of your 95% confidence interval would:

• double

• increase by more than double

• increase by less than double

Answer: increase by less than double

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 60%.
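A quick numeric check of this answer, again assuming the interval width is 4 * (sample SD) / sqrt(n) with the sample SD at its maximum of 0.5. The helper `max_width` is a hypothetical name introduced for this sketch.

```python
import math

def max_width(n):
    """Widest 95% CI width for a proportion, assuming width = 4 * 0.5 / sqrt(n)."""
    return 4 * 0.5 / math.sqrt(n)

# Halving the sample size scales the width by sqrt(900 / 450) = sqrt(2).
ratio = max_width(450) / max_width(900)
print(ratio)
```

Because the width depends on 1 / sqrt(n), halving the sample size multiplies the width by sqrt(2), roughly 1.41, which is an increase of less than double.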

## Problem 2

Bill claims that San Diego is sunny 60% of the year, but you think the percentage is higher. You decide to test the validity of Bill’s claim by running a hypothesis test.

### Problem 2.1

Both your null and alternative hypotheses will take this form:

The percentage of sunny days in San Diego is ______ 60%.

(i) For the null hypothesis, how should we fill in the blank?

• equal to

• not equal to

• greater than

• less than

(ii) For the alternative hypothesis, how should we fill in the blank?

• equal to

• not equal to

• greater than

• less than

Answer (i): equal to

##### Difficulty: ⭐️⭐️

The average score on this problem was 89%.

Answer (ii): greater than

##### Difficulty: ⭐️⭐️

The average score on this problem was 75%.

### Problem 2.2

Say that in 2023, San Diego had 235 sunny days. Using the number of sunny days per year as the test statistic, fill in the following code to run the hypothesis test and store the p-value of your test in p_val. We’ll assume that all years have 365 days.

    observed_stat = ___(w)___
    results = np.array([])

    for i in np.arange(10000):
        result = np.___(x)___(365, ___(y)___)[0]
        results = np.append(results, result)

    p_val = np.count_nonzero(results __(z)__ observed_stat) / 10000

Answer (w): 235

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 58%.

Answer (x): random.multinomial

##### Difficulty: ⭐️

The average score on this problem was 93%.

Answer (y): [0.6, 0.4]

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 65%.

Answer (z): >=

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 67%.
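For reference, here is the simulation with all four blanks filled in using the answers above. The `import` line and the fixed seed are additions made for this sketch so that it runs on its own and gives repeatable output; they are not part of the original quiz code.

```python
import numpy as np

np.random.seed(42)  # assumption: fixed seed, added only for reproducibility

observed_stat = 235  # sunny days observed in 2023
results = np.array([])

for i in np.arange(10000):
    # Simulate one 365-day year under the null: each day is sunny with probability 0.6.
    # The first entry of the result is the number of sunny days.
    result = np.random.multinomial(365, [0.6, 0.4])[0]
    results = np.append(results, result)

# The alternative says the sunny percentage is higher than 60%, so larger counts are more extreme.
p_val = np.count_nonzero(results >= observed_stat) / 10000
print(p_val)
```

The comparison is `>=` because the alternative hypothesis points toward more sunny days than the null predicts, so simulated years with at least 235 sunny days count as at least as extreme as the observation.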

### Problem 2.3

Suppose the p-value of this hypothesis test is 0.1. Which of the following statements is a correct interpretation of this p-value?

• If the null hypothesis is true, there is a 10% chance of obtaining a test statistic equal to the observed statistic.

• If the null hypothesis is true, there is a 90% chance of obtaining a test statistic equal to the observed statistic.

• If the null hypothesis is true, there is a 10% chance of obtaining a test statistic equal to, or at least as extreme as, the observed statistic.

• If the alternative hypothesis is true, there is a 90% chance of obtaining a test statistic equal to, or at least as extreme as, the observed statistic.

Answer: If the null hypothesis is true, there is a 10% chance of obtaining a test statistic equal to, or at least as extreme as, the observed statistic.

##### Difficulty: ⭐️⭐️

The average score on this problem was 76%.

## Problem 3

Charlie spends 100 days in each of three California cities (San Diego, San Francisco, and Los Angeles). He records the primary weather for each day in each location. The table below shows these values as proportions. For example, 43 of the 100 days that Charlie spent in San Diego had primarily sunny weather.

| Weather | San Diego | San Francisco | Los Angeles |
| --- | --- | --- | --- |
| sunny | 0.43 | 0.51 | 0.34 |
| partly cloudy | 0.24 | 0.18 | 0.22 |
| cloudy | 0.22 | 0.15 | 0.26 |
| rainy | 0.11 | 0.16 | 0.18 |

### Problem 3.1

What is the total variation distance between the San Diego and San Francisco weather distributions? Give your answer as an exact decimal.

##### Difficulty: ⭐️⭐️⭐️

The average score on this problem was 74%.
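The total variation distance is half the sum of the absolute differences between the two distributions. The sketch below applies that definition to the San Diego and San Francisco columns of the table. The function name `tvd` and the variable names are chosen for this illustration.

```python
import numpy as np

def tvd(dist_a, dist_b):
    """Total variation distance: half the sum of absolute differences."""
    return np.abs(np.array(dist_a) - np.array(dist_b)).sum() / 2

# Order of categories: sunny, partly cloudy, cloudy, rainy.
san_diego = [0.43, 0.24, 0.22, 0.11]
san_francisco = [0.51, 0.18, 0.15, 0.16]

sd_sf_tvd = tvd(san_diego, san_francisco)
print(round(sd_sf_tvd, 2))  # 0.13
```

By hand, the absolute differences are 0.08, 0.06, 0.07, and 0.05, which sum to 0.26, so the TVD is 0.13.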

### Problem 3.2

Charlie reconsiders his weather categories and decides to count any “partly cloudy” days as “cloudy” days, thereby combining these two categories into one. If he recalculates the TVD between any pair of cities, what will he find?

• The new TVD is less than or equal to the old TVD.

• The new TVD is greater than or equal to the old TVD.

• The new TVD is equal to the old TVD.

• There is not enough information to determine the relationship.

Answer: The new TVD is less than or equal to the old TVD.

##### Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 39%.
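The reasoning here is that merging two categories replaces two terms |a| + |b| in the sum with the single term |a + b|, and |a + b| is never larger than |a| + |b|. The sketch below checks this on the three cities from the table. The helpers `tvd` and `merge_cloudy` are hypothetical names written for this illustration.

```python
from itertools import combinations
import numpy as np

def tvd(dist_a, dist_b):
    """Total variation distance: half the sum of absolute differences."""
    return np.abs(np.array(dist_a) - np.array(dist_b)).sum() / 2

def merge_cloudy(dist):
    """Combine 'partly cloudy' and 'cloudy' into one category."""
    sunny, partly_cloudy, cloudy, rainy = dist
    return [sunny, partly_cloudy + cloudy, rainy]

# Order of categories: sunny, partly cloudy, cloudy, rainy.
cities = {
    "San Diego": [0.43, 0.24, 0.22, 0.11],
    "San Francisco": [0.51, 0.18, 0.15, 0.16],
    "Los Angeles": [0.34, 0.22, 0.26, 0.18],
}

comparison = {}
for city_a, city_b in combinations(cities, 2):
    old_tvd = tvd(cities[city_a], cities[city_b])
    new_tvd = tvd(merge_cloudy(cities[city_a]), merge_cloudy(cities[city_b]))
    comparison[(city_a, city_b)] = (old_tvd, new_tvd)
    print(city_a, "vs", city_b, round(old_tvd, 2), round(new_tvd, 2))
```

For San Diego and Los Angeles the TVD drops from 0.11 to 0.09, because the "partly cloudy" and "cloudy" differences have opposite signs and partially cancel once merged. For the other two pairs the differences share a sign, so the TVD is unchanged.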