This quiz was administered in-person. It was closed-book and
closed-note; students were not allowed to use the DSC
10 Reference Sheet. Students had 20 minutes to work on
the quiz.
This quiz covered Lectures 13-18 of the Spring 2026 offering
of DSC 10.
Note (groupby / pandas 2.0): Pandas 2.0+ no longer
silently drops columns that can’t be aggregated after a
groupby, so code written for older pandas may behave
differently or raise errors. In these practice materials we use
.get() to select the column(s) we want after
.groupby(...).mean() (or other aggregations) so that our
solutions run on current pandas. On real exams you will not be penalized
for omitting .get() when the old behavior would have
produced the same answer.
Regal jumping spiders (Phidippus regius) are gaining popularity as pets and are known for their ability to jump many times their body length. Sofia’s sister has a pet jumping spider named Elise, whose jumps we will study in this quiz.
The elise DataFrame contains 400 rows (i.e. recorded
jumps) and has the following columns:
"distance" (int): the horizontal distance, in millimeters, of Elise’s jump
"time" (str): the time of day during which the jump occurred, either "morning", "afternoon", or "night"
"humidity" (int): the percent humidity of Elise’s enclosure at the time of the jump
Jumping spiders prefer humidity to be between 50 and 70
percent. Suppose the "humidity" column has mean 60, and the
range [48, 72] is the smallest range
guaranteed to contain at least 75% of
the values in "humidity". What is the standard deviation of
"humidity"?
Answer: 6
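The reasoning behind this answer can be sketched with Chebyshev's inequality, which guarantees that at least 1 - 1/z^2 of the values lie within z standard deviations of the mean. Setting that guarantee equal to 75% gives z = 2, so the range [48, 72] extends 2 standard deviations on each side of the mean of 60. The variable names below are illustrative and not part of the quiz.

```python
import math

mean = 60
left, right = 48, 72
proportion = 0.75  # the guaranteed proportion stated in the problem

# Chebyshev: at least 1 - 1/z**2 of values are within z SDs of the mean.
# Solve 1 - 1/z**2 = proportion for z.
z = math.sqrt(1 / (1 - proportion))

# The range reaches z SDs above (and below) the mean.
sd = (right - mean) / z

print(z, sd)
```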
The average score on this problem was 55%.
Elise tends to be most active during the afternoon. Fill in the
blanks below to produce 10,000 bootstrapped estimates of Elise’s mean
jumping distance during the afternoon. Store these estimates in the
array aft_means.
aft = elise[__(a)__]
aft_means = np.array([])
for i in np.arange(10000):
    resample = __(b)__
    estimate = resample.get("distance").mean()
    aft_means = __(c)__
(a): elise.get("time") == "afternoon"
(b): aft.sample(aft.shape[0], replace=True)
(c): np.append(aft_means, estimate)
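To see the filled-in blanks run end to end, here is a minimal sketch. It assumes plain pandas in place of the course's DataFrame library and uses a synthetic stand-in for elise, since the real data is not given; the values are made up. It also uses 1,000 repetitions instead of 10,000 so that it finishes quickly.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real elise DataFrame (all values are made up).
rng = np.random.default_rng(10)
elise = pd.DataFrame({
    "distance": rng.integers(60, 180, size=400),
    "time": rng.choice(["morning", "afternoon", "night"], size=400),
    "humidity": rng.integers(45, 75, size=400),
})

# (a) Keep only the afternoon jumps.
aft = elise[elise.get("time") == "afternoon"]

aft_means = np.array([])
for i in np.arange(1000):  # the quiz asks for 10000; fewer here for speed
    # (b) Resample as many rows as aft has, with replacement.
    resample = aft.sample(aft.shape[0], replace=True)
    estimate = resample.get("distance").mean()
    # (c) Append this resample's mean to the running array.
    aft_means = np.append(aft_means, estimate)

print(len(aft_means), aft_means.mean(), aft.get("distance").mean())
```

The resample must be the same size as aft and drawn with replacement. Sampling without replacement would return the same rows every time, so every estimate would be identical.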
The average score on this problem was 75%.
Write code to compute the endpoints of a 70% bootstrapped confidence interval for
Elise’s mean afternoon jumping distance, using the values in
aft_means.
left = __(d)__
right = __(e)__
(d): np.percentile(aft_means, 15)
(e): np.percentile(aft_means, 85)
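The percentiles 15 and 85 come from leaving (100 - 70) / 2 = 15 percent of the bootstrapped estimates in each tail, so that the middle 70% remains. The sketch below checks this on a made-up aft_means array; the numbers are synthetic and stand in for real bootstrapped means.

```python
import numpy as np

# Made-up bootstrapped means, standing in for the real aft_means.
rng = np.random.default_rng(0)
aft_means = rng.normal(loc=120, scale=3, size=10000)

confidence = 70
tail = (100 - confidence) / 2  # percent left out on each side: 15

left = np.percentile(aft_means, tail)
right = np.percentile(aft_means, 100 - tail)

# Proportion of estimates that fall inside [left, right].
inside = np.count_nonzero((aft_means >= left) & (aft_means <= right)) / len(aft_means)
print(left, right, inside)
```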
The average score on this problem was 81%.
A jump is considered a long jump if it is at least 100 millimeters in
length. Sofia creates a new DataFrame called long,
containing only the rows in elise that correspond to long
jumps. Suppose the distribution of the "distance" column in
long is approximately normal, with mean 144 and variance
36.
Fill in the blank to add a new column to long called
"distance_su", containing the same information as the
"distance" column but converted to standard units.
standard_units = __(a)__
long = long.assign(distance_su = standard_units)
(a): (long.get("distance") - 144)/6
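The divisor is 6, not 36, because standard units divide by the standard deviation, which is the square root of the variance. A small sketch, assuming plain pandas and a made-up long DataFrame with five illustrative distances:

```python
import numpy as np
import pandas as pd

# Made-up distances for illustration; the real long DataFrame is not given.
long = pd.DataFrame({"distance": [132, 138, 144, 150, 156]})

mean = 144
sd = np.sqrt(36)  # SD is the square root of the variance

standard_units = (long.get("distance") - mean) / sd
long = long.assign(distance_su=standard_units)

print(long)
```

Here a distance of 150 is one standard deviation above the mean, so its value in standard units is 1.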
The average score on this problem was 51%.
Is "distance_su" approximately normally distributed?
Yes
No
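Answer: Yes. Converting to standard units subtracts a constant and divides by a constant, which shifts and rescales the distribution without changing its shape.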
The average score on this problem was 91%.
Assume we have run from scipy import stats. Using
stats.norm.cdf, write a Python expression that evaluates to
the approximate proportion of distances in long that are at
least 140 millimeters.
Answer: 1 - stats.norm.cdf(-2/3)
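The -2/3 is 140 converted to standard units: (140 - 144) / 6. Since stats.norm.cdf gives the proportion at or below a value, subtracting from 1 gives the proportion at or above it. The sketch below assumes scipy is installed and also shows an equivalent form using the loc and scale parameters of stats.norm.cdf.

```python
from scipy import stats

mean = 144
sd = 6  # square root of the variance, 36

# Convert 140 mm to standard units.
z = (140 - mean) / sd  # -2/3

# Proportion of a standard normal distribution at or above z.
prop = 1 - stats.norm.cdf(z)

# Equivalent: let scipy do the standardizing via loc and scale.
prop_alt = 1 - stats.norm.cdf(140, loc=mean, scale=sd)

print(prop, prop_alt)
```

By symmetry of the normal curve, stats.norm.cdf(2/3) evaluates to the same proportion.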
The average score on this problem was 36%.
The data in long can be thought of as a sample from the
population of all long jumps that Elise takes. Suppose we use the data
in long to create a 95% CLT-based confidence interval for
the mean distance of all long jumps by Elise (a population parameter).
The resulting interval is wider than we’d like, so we decide to collect
a new sample that is just large enough to guarantee that the width of
the resulting 95% confidence interval is no more than 3/2, assuming the new sample has the same
standard deviation as the original. How large of a sample should we
take? Give your answer as an integer.
Answer: 256
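One way to reach this answer, assuming the usual approximation that a 95% CLT-based interval extends 2 standard deviations of the sample mean's distribution on each side: the interval is mean ± 2 * SD / sqrt(n), so its width is 4 * SD / sqrt(n). With SD = 6 (the square root of the variance, 36), requiring the width to be at most 3/2 gives sqrt(n) >= 16.

```python
import math

sd = 6             # sample SD of "distance" in long
max_width = 3 / 2  # largest acceptable interval width

# Width of the 95% interval is 4 * sd / sqrt(n).
# Require 4 * sd / sqrt(n) <= max_width, i.e. sqrt(n) >= 4 * sd / max_width.
min_sqrt_n = 4 * sd / max_width

# Smallest integer sample size that satisfies the requirement.
n = math.ceil(min_sqrt_n ** 2)

print(min_sqrt_n, n)
```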
The average score on this problem was 45%.