Spring 2025 Quiz 3



This quiz was administered in-person. It was closed-book; students were not allowed to use the DSC 10 Reference Sheet. Students had 20 minutes to work on the quiz.

This quiz covered Lectures 13-16 of the Spring 2025 offering of DSC 10.


Problem 1

The DataFrame concerts contains data on a sample of concerts held in 2024. For each concert, we have the name of the performing "artist", the "date" and "location" of the concert, and the total "attendance" as an integer. The first few rows of concerts are shown below:


Problem 1.1

You are interested in estimating the average "attendance" at concerts using the data in concerts. Fill in the blanks to define a function estimate_attendance that takes as input a number of estimates to produce, and returns an array with that number of bootstrapped estimates of the mean concert attendance.

def estimate_attendance(how_many):
    estimates = np.array([])
    for i in __(a)__:
        resample = __(b)__
        estimates = np.append(estimates, __(c)__.mean())
    return estimates

(a): np.arange(how_many)
(b): concerts.sample(concerts.shape[0], replace=True)
(c): resample.get("attendance")
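As a rough check, the sketch below runs the filled-in function on a small made-up DataFrame. It assumes pandas in place of babypandas (the calls used here, `.sample`, `.get`, and `.shape`, exist in both), and the `concerts` values are hypothetical, not the quiz's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the quiz's `concerts` DataFrame (values are made up).
concerts = pd.DataFrame({
    "artist": ["A", "B", "C", "D", "E"],
    "attendance": [18000, 21000, 17500, 19500, 19000],
})

def estimate_attendance(how_many):
    estimates = np.array([])
    for i in np.arange(how_many):
        # Resample as many rows as the original sample has, with replacement.
        resample = concerts.sample(concerts.shape[0], replace=True)
        # Record the mean attendance of this resample.
        estimates = np.append(estimates, resample.get("attendance").mean())
    return estimates

boot_means = estimate_attendance(1000)
```

Each resample has the same number of rows as the original sample, so every bootstrapped mean falls between the smallest and largest observed attendance.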


Difficulty: ⭐️⭐️

The average score on this problem was 82%.


Problem 1.2

Now, fill in the blanks to compute an 85% confidence interval for the average concert attendance based on 10000 bootstrapped estimates.

boot_attendance = estimate_attendance(__(a)__)
ci_low = np.percentile(boot_attendance, __(b)__)
ci_high = np.percentile(boot_attendance, __(c)__)
concert_interval = [ci_low, ci_high]

(a): 10000
(b): 7.5
(c): 92.5
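The percentiles come from splitting the remaining 15% evenly between the two tails of the bootstrap distribution. A minimal sketch of that arithmetic follows. Here `boot_attendance` is simulated from a normal distribution purely for illustration; in the quiz it would come from `estimate_attendance(10000)`.

```python
import numpy as np

confidence = 85
lower_pct = (100 - confidence) / 2   # 7.5
upper_pct = 100 - lower_pct          # 92.5

# Hypothetical stand-in for the 10000 bootstrapped means.
rng = np.random.default_rng(0)
boot_attendance = rng.normal(19000, 300, size=10000)

ci_low = np.percentile(boot_attendance, lower_pct)
ci_high = np.percentile(boot_attendance, upper_pct)
concert_interval = [ci_low, ci_high]
```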


Difficulty: ⭐️

The average score on this problem was 90%.


Problem 1.3

Suppose concert_interval comes out to [18500, 19500]. Which of the following statements are valid interpretations of this interval? Select all that apply.

Answer: Option 3 only.


Difficulty: ⭐️⭐️

The average score on this problem was 84%.


Problem 1.4

You are told that the data in the "attendance" column of concerts has a mean of 19000 and a standard deviation of 3000. Find the endpoints of the smallest interval which is guaranteed to contain at least 15/16 of the data. Both endpoints should be given as integers.

Answer: [7000, 31000]
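This answer follows from Chebyshev's inequality, which guarantees that at least 1 - 1/z^2 of any dataset lies within z standard deviations of the mean. Setting that proportion to 15/16 gives z = 4. The arithmetic can be sketched as follows (the variable names are ours, not the quiz's):

```python
import math

mean, sd = 19000, 3000
proportion = 15 / 16

# Chebyshev: proportion = 1 - 1/z**2, so z = sqrt(1 / (1 - proportion)).
z = math.sqrt(1 / (1 - proportion))   # 4.0

interval = [int(mean - z * sd), int(mean + z * sd)]
```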


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 74%.


Problem 1.5

You are now told that the data in the "attendance" column is normally distributed. Approximately what percentage of the data is included in the interval you gave above? Give your answer to the nearest integer.

Answer: 100
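The interval from the previous part extends 4 standard deviations on either side of the mean. For a normal distribution, the proportion of data within z standard deviations of the mean equals erf(z / sqrt(2)), which can be checked with the standard library alone:

```python
import math

z = 4  # [7000, 31000] is 19000 plus or minus 4 * 3000
proportion_within = math.erf(z / math.sqrt(2))
percentage = round(100 * proportion_within)
```

The proportion is just above 0.9999, so it rounds to 100 percent. Chebyshev's guarantee of 15/16 is only a lower bound that holds for any distribution.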


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 43%.



Problem 2

After this year’s Sun God Festival, the UCSD administration wants to estimate how much students would be willing to pay for a ticket to future Sun God Festivals. They (somehow) take 500 simple random samples of 100 students each, asking them this question. They then plot a histogram showing the distribution of the mean response from each sample.


Problem 2.1

Which of the following statements are true? Select all that apply.

Answer: Options 1 and 3


Difficulty: ⭐️⭐️

The average score on this problem was 88%.



Problem 2.2

Describe in one word how the histogram would be different if it were instead based on 500 simple random samples of 1000 students each.

Answer: narrower
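The reasoning is that the standard deviation of the distribution of sample means is approximately the population standard deviation divided by the square root of the sample size, so a tenfold increase in sample size shrinks the spread by a factor of roughly sqrt(10), or about 3.16. The simulation below is a sketch under stated assumptions: the population of willingness-to-pay values is made up (its size, exponential shape, and scale are illustrative only).

```python
import numpy as np

rng = np.random.default_rng(10)
# Hypothetical population of willingness-to-pay values, in dollars.
population = rng.exponential(scale=40, size=40000)

def sample_means(sample_size, repetitions=500):
    # Mean of each simple random sample (drawn without replacement).
    return np.array([
        rng.choice(population, size=sample_size, replace=False).mean()
        for _ in range(repetitions)
    ])

sd_100 = np.std(sample_means(100))
sd_1000 = np.std(sample_means(1000))
ratio = sd_100 / sd_1000   # expected to be roughly sqrt(10)
```

Both histograms would be centered near the population mean; only the spread changes.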


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 70%.


