This quiz was administered in-person. Students were allowed a cheat sheet. Students had 20 minutes to work on the quiz.
This quiz covered Lectures 13-17 of the Winter 2025 offering of DSC 10.
A (a) provides insight into how much a (b) might vary across samples and is key for estimating the corresponding (c). This type of distribution is an example of a (d) because it is based on trials of an experiment, as opposed to a (e), which is based on theory.
The five terms that should fill in the blanks are parameter, statistic, probability distribution, empirical distribution, and bootstrap distribution. Determine which term corresponds to each lettered blank.
Answer (a): bootstrapped distribution
Answer (b): statistic
Answer (c): parameter
Answer (d): empirical distribution
Answer (e): probability distribution
The average score on this problem was 82%.
The DataFrame geisel contains a row for each day of this academic year. The "Students" column contains the number of students who visited Geisel Library on that day, as an int.
Fill in the blanks below such that geisel_sample is a simple random sample of 50 rows of geisel, and the code prints the endpoints of a 95% bootstrapped confidence interval for the mean number of students at Geisel Library, based on the data in geisel_sample.
geisel_sample = __(a)__
y = np.array([])
for i in np.arange(10000):
    x = __(b)__
    y = np.append(y, x)
print(np.percentile(y, 2.5), np.percentile(y, 97.5))
Answer (a): geisel.sample(50) or geisel.sample(50, replace = False)
Answer (b): geisel_sample.sample(50, replace = True).get("Students").mean() or equivalent
The average score on this problem was 75%.
Which of the following is the best description of x and y in the code above?
x represents the original sample, and y represents many resamples.
x represents the average student count for a single day, and y represents the student count on all days.
x represents the sample statistic from a single resample, and y represents the distribution of those statistics across multiple resamples.
x represents the sample statistic for the original sample, and y represents a distribution of statistics across multiple resamples.
x represents the population parameter, and y represents a bootstrapped distribution of sample statistics.
Answer: Option 3: x represents the sample statistic from a single resample, and y represents the distribution of those statistics across multiple resamples.
The average score on this problem was 72%.
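As a rough illustration of the roles x and y play, the sketch below runs the completed bootstrap on made-up data. It assumes pandas in place of the course's babypandas (both support .sample and .get on a DataFrame), and the contents of geisel here are synthetic values invented for the example, not real library counts. The row count of 280 is also an assumption.

```python
import numpy as np
import pandas as pd

np.random.seed(10)  # seed only so the sketch is repeatable

# Hypothetical stand-in for geisel: synthetic daily student counts.
geisel = pd.DataFrame({"Students": np.random.randint(2000, 12000, size=280)})

# Simple random sample of 50 rows (drawn without replacement by default).
geisel_sample = geisel.sample(50)

y = np.array([])
for i in np.arange(10000):
    # x is the statistic (mean) computed from ONE resample of the sample.
    x = geisel_sample.sample(50, replace=True).get("Students").mean()
    # y accumulates those statistics, forming the bootstrap distribution.
    y = np.append(y, x)

# The middle 95% of the bootstrap distribution gives the interval endpoints.
left, right = np.percentile(y, 2.5), np.percentile(y, 97.5)
print(left, right)
```

Each pass through the loop overwrites x with a single number, while y grows to 10,000 numbers, which is why Option 3 is the best description.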
Select all true statements below.
If geisel_sample had instead had 60 rows, the resulting 95% confidence interval would have been wider.
If we made 100 95% confidence intervals based on geisel_sample, about 95 of them would contain the population mean.
On about 95% of days, the number of students at Geisel Library falls between the endpoints of our confidence interval.
It would have also been appropriate to generate a confidence interval for this parameter using the Central Limit Theorem.
The standard deviation of geisel_sample should be approximately the same as the standard deviation of geisel.
The data in geisel_sample is roughly normally distributed.
Answer:
Option 4: It would have also been appropriate to generate a confidence interval for this parameter using the Central Limit Theorem.
Option 5: The standard deviation of geisel_sample should be approximately the same as the standard deviation of geisel.
The average score on this problem was 69%.
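To make Option 4 concrete, here is a minimal sketch of a Central Limit Theorem based interval for a mean. The array sample_counts is a hypothetical, synthetic stand-in for the "Students" column of geisel_sample, and the interval uses the common approximation that about 95% of a normal distribution lies within 2 standard deviations of its center.

```python
import numpy as np

# Hypothetical sample of 50 daily student counts (synthetic, illustration only).
rng = np.random.default_rng(10)
sample_counts = rng.integers(2000, 12000, size=50)

sample_mean = sample_counts.mean()

# By the CLT, the SD of the sample mean is roughly (sample SD) / sqrt(sample size).
sd_of_sample_mean = np.std(sample_counts) / np.sqrt(len(sample_counts))

# Approximate 95% interval: 2 SDs of the sample mean on either side.
left = sample_mean - 2 * sd_of_sample_mean
right = sample_mean + 2 * sd_of_sample_mean
print(left, right)
```

Because the sample size sits in the denominator, a sample of 60 rows would tend to give a narrower interval rather than a wider one, which is why the first statement is false.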
Your friend at SDSU records the number of students who visit their library each day, for 100 days. They tell you that the average is 6,000 and that the standard deviation is 500.
Without knowing anything about the distribution of your friend’s data, find the endpoints of the smallest interval which is guaranteed to contain at least 75% of your friend’s data. Both endpoints should be given as integers.
Answer: [5000, 7000]
The average score on this problem was 64%.
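The answer follows from Chebyshev's inequality, which guarantees that at least 1 - 1/z^2 of any dataset lies within z standard deviations of its mean. A short sketch of the arithmetic, using only the numbers given in the problem (the variable names are illustrative):

```python
import math

mean, sd = 6000, 500   # values given in the problem
target = 0.75          # proportion the interval must be guaranteed to contain

# Chebyshev: at least 1 - 1/z**2 of the data is within z SDs of the mean.
# Solving 1 - 1/z**2 = target for z gives z = sqrt(1 / (1 - target)).
z = math.sqrt(1 / (1 - target))

left, right = round(mean - z * sd), round(mean + z * sd)
print(z, left, right)  # 2.0 5000 7000
```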
If you then learn that your friend’s data is normally distributed, approximately what percentage of the data is actually contained in the interval you found above? Give your answer as an integer.
Answer: 95%
The average score on this problem was 75%.
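The interval [5000, 7000] reaches 2 standard deviations to either side of the mean, and for a normal distribution the proportion within z standard deviations is erf(z / sqrt(2)). A quick check with the standard library, as a sketch rather than the intended exam method (students would be expected to recall the 95% figure):

```python
import math

z = 2  # [5000, 7000] is the mean 6000 plus or minus 2 SDs of 500

# For a normal distribution, P(|Z| <= z) = erf(z / sqrt(2)).
proportion = math.erf(z / math.sqrt(2))

print(round(proportion * 100))  # 95
```

Chebyshev's 75% is only a lower bound that holds for every distribution, so the much higher figure for normal data does not contradict it.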