Winter 2025 Quiz 2



This quiz was administered in-person. Students were allowed a cheat sheet. Students had 20 minutes to work on the quiz.

This quiz covered Lectures 13-17 of the Winter 2025 offering of DSC 10.


Problem 1

A (a) provides insight into how much a (b) might vary across samples and is key for estimating the corresponding (c). This type of distribution is an example of a (d) because it is based on trials of an experiment, as opposed to a (e), which is based on theory.

The five terms that fill in the blanks are parameter, statistic, probability distribution, empirical distribution, and bootstrap distribution. Determine which term corresponds to each lettered blank.

Answer (a): bootstrap distribution

Answer (b): statistic

Answer (c): parameter

Answer (d): empirical distribution

Answer (e): probability distribution
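
To make the contrast between blanks (d) and (e) concrete, here is a minimal sketch using a fair six-sided die. The die is a hypothetical example and not part of the quiz: the probability distribution comes from theory, while the empirical distribution comes from simulated trials.

```python
import numpy as np

rng = np.random.default_rng(10)  # arbitrary seed, for reproducibility only

# Probability distribution of a fair die: fixed by theory, no trials needed.
faces = np.arange(1, 7)
probability_dist = np.full(6, 1 / 6)

# Empirical distribution: observed proportions from repeated trials.
rolls = rng.choice(faces, size=10_000)
empirical_dist = np.array([np.mean(rolls == face) for face in faces])

for face, theory, observed in zip(faces, probability_dist, empirical_dist):
    print(face, round(theory, 3), round(observed, 3))
```

A bootstrap distribution is empirical in the same sense: it is built from repeated resampling trials rather than derived from theory.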


Difficulty: ⭐️⭐️

The average score on this problem was 82%.


Problem 2

The DataFrame geisel contains a row for each day of this academic year. The "Students" column contains the number of students who visited Geisel Library on that day, as an int.


Problem 2.1

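Fill in the blanks below so that the code prints the endpoints of a 95% bootstrapped confidence interval for the mean of the "Students" column, based on a random sample of 50 days.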

    geisel_sample = __(a)__
    y = np.array([])
    for i in np.arange(10000):
        x = __(b)__
        y = np.append(y, x)
    print(np.percentile(y, 2.5), np.percentile(y, 97.5))

Answer (a): geisel.sample(50) or geisel.sample(50, replace = False)

Answer (b): geisel_sample.sample(50, replace = True).get("Students").mean() or equivalent
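
The filled-in procedure can be sketched end to end as follows. This is a rough illustration, not the quiz's own code: it assumes plain NumPy arrays in place of the geisel DataFrame, and the visitor counts are made up.

```python
import numpy as np

rng = np.random.default_rng(2025)  # arbitrary seed, for reproducibility only

# Hypothetical stand-in for geisel.get("Students"): made-up daily counts.
students = rng.integers(2_000, 12_000, size=365)

# (a) One sample of 50 days, drawn without replacement from the population.
geisel_sample = rng.choice(students, size=50, replace=False)

# (b) Resample the sample with replacement and record each resample's mean.
boot_means = np.array([])
for i in np.arange(10_000):
    resample_mean = rng.choice(geisel_sample, size=50, replace=True).mean()
    boot_means = np.append(boot_means, resample_mean)

# The middle 95% of the bootstrapped means forms the confidence interval.
left = np.percentile(boot_means, 2.5)
right = np.percentile(boot_means, 97.5)
print(left, right)
```

Note that the original sample is drawn without replacement, while every resample is drawn with replacement and has the same size as the original sample.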


Difficulty: ⭐️⭐️

The average score on this problem was 75%.


Problem 2.2

Which of the following is the best description of x and y in the code above?

Answer: Option 3: x represents the sample statistic from a single resample, and y represents the distribution of those statistics across multiple resamples.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 72%.


Problem 2.3

Select all true statements below.

Answer:

Option 4: It would have also been appropriate to generate a confidence interval for this parameter using the Central Limit Theorem.

Option 5: The standard deviation of geisel_sample should be approximately the same as the standard deviation of geisel.
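
Options 4 and 5 fit together: because the statistic is a sample mean, the Central Limit Theorem applies, and the sample's standard deviation can stand in for the unknown population standard deviation. A minimal sketch, assuming a made-up sample of 50 days and the common rule that about 95% of a normal distribution lies within 2 standard deviations of its mean:

```python
import numpy as np

rng = np.random.default_rng(17)  # arbitrary seed, for reproducibility only

# Hypothetical sample of 50 daily visitor counts (not real Geisel data).
geisel_sample = rng.integers(2_000, 12_000, size=50)

sample_mean = geisel_sample.mean()
sample_sd = np.std(geisel_sample)

# CLT: SD of the sample mean is roughly (population SD) / sqrt(sample size),
# with the sample SD used as an estimate of the population SD.
sd_of_sample_mean = sample_sd / np.sqrt(len(geisel_sample))

clt_left = sample_mean - 2 * sd_of_sample_mean
clt_right = sample_mean + 2 * sd_of_sample_mean
print(clt_left, clt_right)
```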


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 69%.



Problem 3

Your friend at SDSU records the number of students who visit their library each day, for 100 days. They tell you that the average is 6,000 and that the standard deviation is 500.


Problem 3.1

Without knowing anything about the distribution of your friend’s data, find the endpoints of the smallest interval which is guaranteed to contain at least 75% of your friend’s data. Both endpoints should be given as integers.

Answer: [5000, 7000]
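
This answer follows from Chebyshev's inequality, which guarantees that at least 1 - 1/z^2 of any dataset lies within z standard deviations of the mean. Setting 1 - 1/z^2 = 0.75 gives z = 2. The arithmetic can be sketched as below; the variable names are illustrative only.

```python
import math

mean, sd = 6000, 500
target = 0.75  # proportion of data the interval must be guaranteed to contain

# Solve 1 - 1/z**2 = target for z, the number of SDs on each side of the mean.
z = math.sqrt(1 / (1 - target))

left = int(mean - z * sd)
right = int(mean + z * sd)
print([left, right])  # prints [5000, 7000]
```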


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 64%.



Problem 3.2

If you then learn that your friend’s data is normally distributed, approximately what percentage of the data is actually contained in the interval you found above? Give your answer as an integer.

Answer: 95%
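
The interval [5000, 7000] spans 2 standard deviations on either side of the mean. For a normal distribution, the proportion within z standard deviations of the mean is erf(z / sqrt(2)), which can be checked with Python's standard library. This is a quick verification sketch, not part of the quiz.

```python
import math

z = 2  # [5000, 7000] is mean ± 2 SDs, since (7000 - 6000) / 500 = 2

# Proportion of a normal distribution within z SDs of its mean.
proportion_within = math.erf(z / math.sqrt(2))
print(round(proportion_within * 100))  # prints 95
```

Chebyshev's guarantee of at least 75% is a lower bound that holds for any distribution; for normally distributed data, the same interval actually contains about 95%.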


Difficulty: ⭐️⭐️

The average score on this problem was 75%.


