Spring 2026 Quiz 4



This quiz was administered in-person. It was closed-book and closed-note; students were not allowed to use the DSC 10 Reference Sheet. Students had 20 minutes to work on the quiz.

This quiz covered Lectures 18-21 of the Spring 2026 offering of DSC 10.


Note (groupby / pandas 2.0): Pandas 2.0+ no longer silently drops columns that can’t be aggregated after a groupby, so code written for older pandas may behave differently or raise errors. In these practice materials we use .get() to select the column(s) we want after .groupby(...).mean() (or other aggregations) so that our solutions run on current pandas. On real exams you will not be penalized for omitting .get() when the old behavior would have produced the same answer.


Comic-Con is a massive pop culture and entertainment convention held each summer at the San Diego Convention Center. UC San Diego is capitalizing on its proximity to this major event by offering Comic-Con weekend accommodations in empty dorm rooms.

UCSD housing coordinators Pranav and Ray each processed a random sample of 25 Comic-Con housing reservations. The res DataFrame has 50 rows, each of which represents a reservation processed by Pranav or Ray. Below are column descriptions and a preview of the first few rows of the DataFrame.


Problem 1


Problem 1.1

The number of reservations for each of the "dates" processed by each coordinator is given in the table below.

| | "Pranav" | "Ray" |
| --- | --- | --- |
| "Jul22-26" | 20 | 18 |
| "Jul23-27" | 5 | 7 |


Compute the total variation distance (TVD) between Pranav’s distribution of "dates" and Ray’s distribution of "dates". Give your answer as an exact decimal or simplified fraction.

Answer: 0.08
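The arithmetic behind this answer can be checked with a short sketch. The counts come from the table above; the function name tvd and the variable names are illustrative choices, not part of the quiz. Each column of counts is first converted to a distribution (proportions summing to 1), and the TVD is half the sum of the absolute differences between the two distributions.

```python
import numpy as np

# Counts from the table: ["Jul22-26", "Jul23-27"] for each coordinator.
pranav_counts = np.array([20, 5])
ray_counts = np.array([18, 7])

def tvd(counts_a, counts_b):
    """Total variation distance between two categorical distributions,
    given as arrays of counts over the same categories."""
    dist_a = counts_a / counts_a.sum()   # [0.8, 0.2]
    dist_b = counts_b / counts_b.sum()   # [0.72, 0.28]
    return np.abs(dist_a - dist_b).sum() / 2

dates_tvd = tvd(pranav_counts, ray_counts)
print(round(dates_tvd, 2))  # prints 0.08
```

By hand, the same steps give (|0.8 - 0.72| + |0.2 - 0.28|) / 2 = (0.08 + 0.08) / 2 = 0.08, or 2/25 as a fraction.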


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 37%.



Problem 1.2

The table below shows how many reservations of each "package" type were processed by each coordinator.

| | "Pranav" | "Ray" |
| --- | --- | --- |
| "early_bird" | p | r |
| "standard" | 25 - p | 25 - r |


Which of the following expressions correctly computes the TVD between Pranav’s distribution of "package" type and Ray’s distribution of "package" type?

Answer: Option B
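The answer choices are not reproduced here, but the TVD itself can be worked out from the table. With only two categories, the two absolute differences are equal, so the TVD simplifies to |p - r| / 25, and a correct expression should agree with that value for every p and r. The sketch below checks this simplification numerically; the function name tvd_two_categories is an illustrative choice, not something from the quiz.

```python
def tvd_two_categories(p, r, n=25):
    """TVD between two coordinators' "package" distributions when each
    processed n reservations, p and r of which were "early_bird"."""
    pranav = [p / n, (n - p) / n]
    ray = [r / n, (n - r) / n]
    return sum(abs(a - b) for a, b in zip(pranav, ray)) / 2

# Compare the full TVD formula with the simplified form |p - r| / 25
# for every possible pair of counts.
mismatches = [
    (p, r)
    for p in range(26)
    for r in range(26)
    if abs(tvd_two_categories(p, r) - abs(p - r) / 25) > 1e-12
]
print(mismatches)  # an empty list means the two forms always agree
```

The simplification works because |p/25 - r/25| and |(25 - p)/25 - (25 - r)/25| are both equal to |p - r| / 25, so halving their sum leaves a single copy.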


Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 14%.



Problem 2

Pranav is doing a hypothesis test with the following null hypothesis: in the population of reservations, each reservation has a 30% chance of being "early_bird". His test statistic is the percentage of "early_bird" reservations, and his observed data is all of the data in res. Write one line of code to calculate one simulated value of the test statistic under the assumptions of the null. Your code should produce an integer between 0 and 100.

Answer: np.random.multinomial(50, [0.3, 0.7])[0] / 50 * 100
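To see how this one-liner fits into a full simulation, here is a minimal sketch that wraps it in a function and repeats it many times. The function name simulate_percent_early_bird, the variable names, and the choice of 10,000 repetitions are assumptions made for illustration; only the np.random.multinomial call comes from the answer above.

```python
import numpy as np

n_reservations = 50        # res has 50 rows
null_probs = [0.3, 0.7]    # ["early_bird", not "early_bird"] under the null

def simulate_percent_early_bird():
    """One simulated value of the test statistic under the null."""
    counts = np.random.multinomial(n_reservations, null_probs)
    return counts[0] / n_reservations * 100

one_value = simulate_percent_early_bird()

# Repeating the simulation approximates the distribution of the
# test statistic under the null hypothesis.
simulated = np.array([simulate_percent_early_bird() for _ in range(10_000)])
print(one_value, simulated.mean())
```

Note that the division makes the result a float rather than a Python int, and because there are 50 reservations its value is, up to floating-point rounding, an even whole number between 0 and 100. The simulated values should center near 30.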


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 45%.


Problem 3

The housing director, Ella, wants to know whether Ray’s reservations have significantly higher prices than Pranav’s. Ella has access to res but no knowledge of the population distribution of reservation prices.


Problem 3.1

Which pair of hypotheses should be used for Ella’s test? Select the best answer.

Answer: Option 3


Difficulty: ⭐️⭐️

The average score on this problem was 84%.


Problem 3.2

Which of the following approaches are appropriate for Ella’s goal? Select all that apply.

Answer: Options 1 and 4.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 54%.


Problem 3.3

Suppose Ella performs a test where the test statistic is Pranav’s mean price minus Ray’s mean price. The observed value of the test statistic is -50.7. Which simulated statistics should be counted when computing the p-value?

Answer: Option 2
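The answer choices are not reproduced here, but the direction follows from the setup: the alternative is that Ray's prices are higher, and the statistic is Pranav's mean minus Ray's mean, so values at or below the observed statistic are the ones that favor the alternative. The sketch below illustrates this with a permutation test. The prices are randomly generated stand-ins, made up purely for illustration (they are not the data in res), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prices, invented for this example only (NOT the data in res).
pranav_prices = rng.normal(loc=400, scale=80, size=25)
ray_prices = rng.normal(loc=450, scale=80, size=25)

all_prices = np.concatenate([pranav_prices, ray_prices])
is_pranav = np.arange(50) < 25   # first 25 labels are Pranav, rest are Ray

def mean_difference(prices, is_pranav):
    """Pranav's mean price minus Ray's mean price."""
    return prices[is_pranav].mean() - prices[~is_pranav].mean()

observed = mean_difference(all_prices, is_pranav)

# Shuffle the prices relative to the fixed labels to simulate the null
# hypothesis that both coordinators' prices come from the same distribution.
simulated = np.array([
    mean_difference(rng.permutation(all_prices), is_pranav)
    for _ in range(5_000)
])

# Ray higher than Pranav means a more negative statistic, so count the
# simulated statistics that are less than or equal to the observed one.
p_value = np.count_nonzero(simulated <= observed) / len(simulated)
print(observed, p_value)
```

With the quiz's observed value of -50.7, the same comparison would be simulated <= -50.7.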


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 59%.


