Winter 2025 Final Exam



Instructor(s): Janine Tiefenbruck

This exam was administered in-person. Students were allowed one page of double-sided handwritten notes. No calculators were allowed. Students had 3 hours to take this exam.


Problem 1

While browsing the library, Hermione stumbles upon an old book containing game logs for all Quidditch matches played at Hogwarts in the 18th century. Quidditch is a sport played between two houses. It features three types of balls:

  • the Quaffle, which earns a team 10 points when they score with it,

  • the Bludger, which earns no points, and

  • the Snitch, which earns 150 points.

A game log is a list of actions that occurred during a Quidditch match. Each element of a game log is a two-letter string where the first letter represents the house that performed the action ("G" for Gryffindor, "H" for Hufflepuff, "R" for Ravenclaw, "S" for Slytherin) and the second letter indicates the type of Quidditch ball used in the action ("Q" for Quaffle, "B" for Bludger, "S" for Snitch). For example, "RQ" in a game log represents Ravenclaw scoring with the Quaffle to earn 10 points.

Hermione writes a function, logwarts, to calculate the final score of a Quidditch match based on the actions in the game log. The inputs are a game log (a list, as described above) and the full names of the two houses competing. The output is a list of length 4 containing the names of the teams and their corresponding scores. Example behavior is given below.

    >>> logwarts(["RQ", "GQ", "RB", "GS"], "Gryffindor", "Ravenclaw")
    ["Gryffindor", 160, "Ravenclaw", 10]

    >>> logwarts(["HB", "HQ", "HQ", "SS"], "Hufflepuff", "Slytherin")
    ["Hufflepuff", 20, "Slytherin", 150]

Fill in the blanks in the logwarts function below. Note that some of your answers are used in more than one place in the code.

    def logwarts(game_log, team1, team2):
        score1 = __(a)__
        score2 = __(a)__

        for action in game_log:
            house = __(b)__
            ball = __(c)__
            
            if __(d)__: 
                __(e)__:
                    score1 = score1 + 10
                __(f)__:
                    score1 = score1 + 150
            else:
                __(e)__:
                    score2 = score2 + 10
                __(f)__:
                    score2 = score2 + 150
        return [team1, score1, team2, score2]


Problem 1.1

What goes in blank (a)?

Answer: 0

First inspect the function parameters. From the example logwarts(["RQ", "GQ", "RB", "GS"], "Gryffindor", "Ravenclaw"), we observe that game_log will be a list of strings, and team1 and team2 will be the full names of the competing houses. From the given structure, we can infer that our code will

  1. Initialize two scores variables for the two houses,

  2. Run a for loop through all the entries in the list, updating the scores based on the given conditions,

  3. Return the scores calculated by the loop.

To set up score1 and score2 so we can accumulate them in the for loop, we first set both equal to 0. So blank (a) will be 0.


Difficulty: ⭐️⭐️

The average score on this problem was 89%.


Problem 1.2

What goes in blank (b)?

Answer: action[0]

We observe that the for loop iterates over the list of actions, where each action is a two-letter string in the game_log list. Recall the problem statement: “Each element of a game log is a two-letter string where the first letter represents the house that performed the action”. Therefore, to get the house, we want the first letter of each action string, which is accessed by action[0].

Note: A common mistake here is using action.split()[0]. Recall what split() does: it takes a string, splits it according to the given delimiter (by default, whitespace), and returns a list of the pieces. This means that action.split()[0] actually returns the original string. For example, if action is "RQ", action.split() looks for whitespace to split on, finds none, and returns the one-element list ["RQ"]. Accessing index zero of this list with action.split()[0] gives us "RQ" back, instead of what we actually want ("R").


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 61%.


Problem 1.3

What goes in blank (c)?

Answer: action[1]

Following the same logic as blank (b), we get the type of ball used in the action by accessing the second character of the string, which is action[1].


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 62%.


Problem 1.4

What goes in blank (d)?

Answer: house == team1[0]

Now for the fun part: to figure out the correct conditions of the if-statements, we must observe the code inside our conditional blocks carefully.

Recall question statement:

  • For each entry of the game log, the house is represented by the first letter: “G” for Gryffindor, “H” for Hufflepuff, “R” for Ravenclaw, “S” for Slytherin.

  • Quaffle (“Q”) gets 10 points,

  • Snitch (“S”) gets 150 points,

  • Bludger (“B”) gets no points.

  • score1 is the score of the first team, and score2 is the score of the second.

We have two conditions to take care of: the house and the type of ball. How do we know which one is nested and which one is on the outside? Observe that in the first big if statement, we are only updating score1. This means this block takes care of the score of the first house. Therefore, blank (d) should test the condition for the first house.

Now be careful! team1 and team2 are given as full house names. We can match the house variable by taking the first letter of each team string (e.g. if team1 is "Gryffindor", team1[0] gives "G" to match against house). We want to compare house with team1[0], so our final answer is house == team1[0]. Since there are only two houses competing, the following else block calculates the score for the second house using the same scoring scheme.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 74%.


Problem 1.5

What goes in blank (e)?

Answer: if ball == "Q"

After handling the outer conditional, the rest is simple: we condition on the type of ball. Here, score1 increments by 10, so this branch accounts for a Quaffle. Recall that the ball variable holds the type of ball used in the action, so we use if ball == "Q" to check for Quaffles.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 69%.


Problem 1.6

What goes in blank (f)?

Answer: elif ball == "S" or if ball == "S"

Using the same logic as blank (e), since the score is incremented by 150 here, we know this is a Snitch, so either elif ball == "S" or if ball == "S" works. We do not need to worry about Bludgers, since those do not add to the score.
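
Putting all six answers together, the completed function reads as follows (a sketch assembled from the blanks above; the two example calls from the problem statement confirm it):

```python
def logwarts(game_log, team1, team2):
    score1 = 0                      # blank (a): both scores start at 0
    score2 = 0

    for action in game_log:
        house = action[0]           # blank (b): first letter is the house
        ball = action[1]            # blank (c): second letter is the ball

        if house == team1[0]:       # blank (d): action belongs to team1
            if ball == "Q":         # blank (e): Quaffle is worth 10
                score1 = score1 + 10
            elif ball == "S":       # blank (f): Snitch is worth 150
                score1 = score1 + 150
        else:
            if ball == "Q":
                score2 = score2 + 10
            elif ball == "S":
                score2 = score2 + 150
    return [team1, score1, team2, score2]
```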


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 64%.



Problem 2

The Death Eaters are a powerful group of dark wizards who oppose Harry Potter and his allies. Each Death Eater receives a unique identification number based on their order of initiation, ranging from 1 to N, where N represents the total number of Death Eaters.

Your task is to estimate the value of N so you can understand how many enemies you face. You have a random sample of identification numbers in a DataFrame named death_eaters containing a single column called "ID".


Problem 2.1

Which of the options below would be an appropriate estimate for the total number of Death Eaters? Select all that apply.

Answer: death_eaters.get("ID").max() and int(death_eaters.get("ID").mean() * 2)

  • Option 1: death_eaters.get("ID").max() returns the maximum ID from the sample. This is an appropriate estimate since the population size must be at least the size of the largest ID in our sample. For instance, if the maximum ID observed is 250, then the total number of Death Eaters must be at least 250.

  • Option 2: death_eaters.get("ID").sum() returns the sum of all ID numbers in the sample. The total sum of IDs has no meaningful connection to the population size, which makes this an inappropriate estimate.

  • Option 3: death_eaters.groupby("ID").count() groups the data by ID and counts occurrences. Since each ID is unique and death_eaters only includes the "ID" column, grouping simply shows that each ID appears once. This is not an appropriate estimate for N.

  • Option 4: int(death_eaters.get("ID").mean() * 2) returns twice the mean of the sample IDs as an integer. The mean of a random sample of the numbers 1 through N usually falls about halfway between 1 and N. So we can appropriately estimate N by doubling this mean.

  • Option 5: death_eaters.shape[0] returns the number of rows in death_eaters (i.e. the sample size). The sample size does not reflect the total population size, making it an inappropriate estimate.
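
As a sanity check on Options 1 and 4, here is a small sketch (the true population size of 1000 and the sample size of 50 are made-up values for illustration) comparing both statistics on a simulated sample:

```python
import numpy as np

np.random.seed(42)
N = 1000                                            # hypothetical true population size
ids = np.arange(1, N + 1)                           # IDs 1 through N
sample = np.random.choice(ids, 50, replace=False)   # a random sample of 50 IDs

max_estimate = sample.max()             # Option 1: can never exceed N
double_mean = int(sample.mean() * 2)    # Option 4: roughly N, since the mean is near N/2
```

Both statistics land near 1000, while death_eaters.shape[0] would just return 50 regardless of N.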


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 66%.


Problem 2.2

Each box that you selected in part (a) is an example of what?

Answer: a statistic

The options in part (a) calculate a numerical value from the random sample death_eaters. This fits the definition of a statistic.


Difficulty: ⭐️⭐️

The average score on this problem was 82%.


Problem 2.3

Suppose you have access to a function called estimate, which takes in a Series of Death Eater ID numbers and returns an estimate for N. Fill in the blanks below to do the following:

    boot_estimates = np.array([])
    
    for i in np.arange(10000):
        boot_estimates = np.append(boot_estimates, __(a)__)

    left_72 = __(b)__
    

What goes in blank (a)?

Answer: estimate(death_eaters.sample(death_eaters.shape[0], replace=True).get("ID"))

In the given code, we use a for loop to generate 10,000 bootstrapped estimates of N and append them to the array boot_estimates. Blank (a) specifically computes one bootstrapped estimate of N. Here’s how key parts of the solution work:

  • death_eaters.sample(death_eaters.shape[0], replace=True): To bootstrap, we need to resample the data with replacement. The .sample() method takes as arguments the sample size (death_eaters.shape[0], the size of the original sample) and replace=True to sample with replacement.

  • .get("ID"): Since estimate() takes a Series as input, we need to extract the ID column from the resample.

  • estimate(): The resampled ID column is passed into the estimate() function to generate one bootstrapped estimate of N.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 62%.


Problem 2.4

What goes in blank (b)?

Answer: np.percentile(boot_estimates, 14)

A 72% confidence interval captures the middle 72% of our distribution. This leaves 28% of the data outside the interval: 14% in the lower tail and 14% in the upper tail. Thus, the left endpoint corresponds to the 14th percentile of boot_estimates. The np.percentile() function takes as arguments the array (boot_estimates) and the desired percentile (14).
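
Assembled, the full bootstrap might look like the sketch below. Since the provided estimate function is not shown, the doubled-mean statistic from Problem 2.1 stands in for it, the sample data is made up, and the loop is cut to 1,000 repetitions for speed:

```python
import numpy as np
import pandas as pd

def estimate(ids):
    # stand-in for the provided estimate function: doubled sample mean
    return ids.mean() * 2

np.random.seed(0)
death_eaters = pd.DataFrame(
    {"ID": np.random.choice(np.arange(1, 501), 40, replace=False)})

boot_estimates = np.array([])
for i in np.arange(1000):
    resample = death_eaters.sample(death_eaters.shape[0], replace=True)  # blank (a)
    boot_estimates = np.append(boot_estimates, estimate(resample.get("ID")))

left_72 = np.percentile(boot_estimates, 14)    # blank (b): left endpoint
right_72 = np.percentile(boot_estimates, 86)   # right endpoint, for completeness
```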


Difficulty: ⭐️

The average score on this problem was 91%.



Problem 3

When new students arrive at Hogwarts, they get assigned to one of the four houses (Gryffindor, Hufflepuff, Ravenclaw, Slytherin) by a magical Sorting Hat.

Throughout this problem, we’ll assume that the Sorting Hat assigns students to houses uniformly at random, meaning that each student has an independent 25% chance of winding up in each of the four houses.

For all parts, give your answer as an unsimplified mathematical expression.


Problem 3.1

There are seven siblings in the Weasley family: Bill, Charlie, Percy, Fred, George, Ron, and Ginny. What is the probability that all seven of them are assigned to Gryffindor?



Problem 3.2

What is the probability that Fred and George Weasley are assigned to the same house?



Problem 3.3

What is the probability that none of the seven Weasley siblings are assigned to Slytherin?



Problem 3.4

Suppose you are told that none of the seven Weasley siblings is assigned to Slytherin. Based on this information, what is the probability that at least one of the siblings is assigned to Gryffindor?
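
The four probabilities in this problem follow from independence and the uniform 1/4 chance per house. The sketch below computes them exactly, with a simulation spot-check for 3.3:

```python
import numpy as np

p_all_gryffindor = (1 / 4) ** 7        # 3.1: all seven siblings land in Gryffindor
p_fred_george_same = 1 / 4             # 3.2: George matches whichever house Fred got
p_none_slytherin = (3 / 4) ** 7        # 3.3: each sibling avoids Slytherin
p_at_least_one_g = 1 - (2 / 3) ** 7    # 3.4: given no Slytherin, each sibling is
                                       #      uniform over the remaining 3 houses

# simulation spot-check for 3.3
np.random.seed(1)
houses = np.array(["G", "H", "R", "S"])
trials = np.random.choice(houses, (100_000, 7))      # 100,000 families of 7 siblings
sim = np.mean(np.sum(trials == "S", axis=1) == 0)    # fraction with no Slytherin
```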



Problem 4

Beneath Gringotts Wizarding Bank, enchanted mine carts transport wizards through a complex underground railway on the way to their bank vault.

During one section of the journey to Harry’s vault, the track follows the shape of a normal curve, with a peak at x = 50 and a standard deviation of 20.


Problem 4.1

A ferocious dragon, who lives under this section of the railway, is equally likely to be located anywhere within this region. What is the probability that the dragon is located in a position with x \leq 10 or x \geq 80? Select all that apply.

Answer: 1 - (scipy.stats.norm.cdf(1.5) - scipy.stats.norm.cdf(-2)) and scipy.stats.norm.cdf(-2) + scipy.stats.norm.cdf(-1.5)

  • Option 1: This code calculates the probability that a value lies outside the range between z = -2 and z = 1.5, which corresponds to x \leq 10 or x \geq 80. This is done by subtracting the area under the normal curve between -2 and 1.5 from 1. This is correct because it accurately captures the combined probability in the left and right tails of the distribution.

  • Option 2: This code multiplies the cumulative distribution function (CDF) at z = 1.75 by 2. This assumes symmetry around the mean and is used for intervals like |z| \geq 1.75, but that’s not what we want. The correct z-values for this problem are -2 and 1.5, so this option is incorrect.

  • Option 3: This code adds the probability of z \leq -2 and z \leq -1.5, which incorrectly includes overlapping parts of the left tail and completely ignores the right tail (z \geq 1.5). This option is incorrect.

  • Option 4: This is a static value with no basis in the z-scores of -2 and 1.5. It’s likely meant as a distractor and does not represent the correct probability for the specified conditions. This option is incorrect.

  • Option 5: This code adds the probability of z \leq -2 and z \geq 1.5, using the fact that P(z \geq 1.5) = P(z \leq -1.5) by symmetry. So, while the code appears to show both as left-tail calculations, it actually produces the correct total tail probability. This option is correct.
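
The two correct options can be verified numerically. The sketch below evaluates the standard normal CDF via math.erf, which gives the same values as scipy.stats.norm.cdf:

```python
from math import erf, sqrt

def norm_cdf(z):
    # standard normal CDF; agrees with scipy.stats.norm.cdf(z)
    return 0.5 * (1 + erf(z / sqrt(2)))

# z-scores: (10 - 50) / 20 = -2 and (80 - 50) / 20 = 1.5
option_1 = 1 - (norm_cdf(1.5) - norm_cdf(-2))   # complement of the middle area
option_5 = norm_cdf(-2) + norm_cdf(-1.5)        # left tail + mirrored right tail
```

Both expressions evaluate to about 0.09, and they agree exactly since P(Z \geq 1.5) = P(Z \leq -1.5).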


Problem 4.2

Harry wants to know where, in this section of the track, the cart’s height is changing the fastest. He knows from his earlier public school education that the height changes the fastest at the inflection points of a normal distribution. Where are the inflection points in this section of the track?

Answer: x = 30 and x = 70

Recall that the inflection points of a normal distribution are located one standard deviation away from the mean. In this problem, the mean is x = 50 and the standard deviation is 20, so the inflection points occur at x = 30 and x = 70. These are the points where the curve changes concavity and where the height is changing the fastest. Therefore, the correct answer is x = 30 and x = 70.


Problem 4.3

Next, consider a different region of the track, where the shape follows some arbitrary distribution with mean 130 and standard deviation 30. We don’t have any information about the shape of the distribution, so it is not necessarily normal.

What is the minimum proportion of area under this section of the track within the range 100 \leq x \leq 190?

Answer: 0.00

We are told that the distribution is not necessarily normal. The mean is 130 and the standard deviation is 30. We’re asked for the minimum proportion of area between x = 100 and x = 190.

Since the distribution isn’t normal and we don’t know its shape, we can’t use the empirical rule (68-95-99.7) or z-scores. We might try Chebyshev’s Inequality, but that only bounds intervals that extend equally far below and above the mean, and this interval does not (it reaches 1 standard deviation below the mean and 2 above). The most Chebyshev’s Inequality can guarantee is that the largest symmetric interval inside our range, from 1 standard deviation below the mean to 1 standard deviation above it, contains at least 1 - \frac{1}{1^2} = 0 proportion of the data. We can’t make any additional guarantees. So, the minimum possible proportion of area is 0.00.



Problem 5

Among Hogwarts students, Chocolate Frogs are a popular enchanted treat. Chocolate Frogs are individually packaged, and every Chocolate Frog comes with a collectible card of a famous wizard (e.g. "Albus Dumbledore"). There are 80 unique cards, and each package contains one card selected uniformly at random from these 80.

Neville would love to get a complete collection with all 80 cards, and he wants to know how many Chocolate Frogs he should expect to buy to make this happen.

Suppose we have access to a function called frog_experiment that takes no inputs and simulates the act of buying Chocolate Frogs until a complete collection of cards is obtained. The function returns the number of Chocolate Frogs that were purchased. Fill in the blanks below to run 10,000 simulations and set avg_frog_count to the average number of Chocolate Frogs purchased across these experiments.

    frog_counts = np.array([])  
    
    for i in np.arange(10000):
        frog_counts = np.append(__(a)__)
    
    avg_frog_count = __(b)__


Problem 5.1

What goes in blank (a)?

Answer: frog_counts, frog_experiment()

Each call to frog_experiment() simulates purchasing Chocolate Frogs until a complete set of 80 unique cards is obtained, returning the total number of frogs purchased in that simulation. The result of each simulation is then appended to the frog_counts array.
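
The function frog_experiment is provided on the exam, but a minimal implementation of the coupon-collector simulation it describes could look like this sketch:

```python
import numpy as np

def frog_experiment():
    # buy Chocolate Frogs one at a time until all 80 distinct cards appear;
    # return the total number of frogs purchased
    collected = set()
    purchases = 0
    while len(collected) < 80:
        purchases += 1
        collected.add(np.random.randint(80))   # card drawn uniformly from the 80
    return purchases
```

By the coupon-collector formula, the expected total is 80 \cdot (1 + \frac{1}{2} + \cdots + \frac{1}{80}) \approx 397, so avg_frog_count should come out near 400.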


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 65%.


Problem 5.2

What goes in blank (b)?

Answer: frog_counts.mean()

After running the loop 10,000 times, the frog_counts array holds all the simulated totals. Taking the mean of that array (frog_counts.mean()) gives the average number of frogs needed to complete the set of 80 unique cards.


Difficulty: ⭐️⭐️

The average score on this problem was 89%.


Problem 5.3

Realistically, Neville can only afford to buy 300 Chocolate Frog cards. Using the simulated data in frog_counts, write a Python expression that evaluates to an approximation of the probability that Neville will be able to complete his collection.

Answer: np.count_nonzero(frog_counts <= 300) / len(frog_counts) or equivalent, such as np.count_nonzero(frog_counts <= 300) / 10000 or (frog_counts <= 300).mean()

In the simulated data, each entry of frog_counts is the number of Chocolate Frogs purchased in one simulation before collecting all 80 unique cards. We want to estimate the probability that Neville completes his collection with at most 300 cards.

frog_counts <= 300 creates a boolean array of the same length as frog_counts, where each element is True if the number of frogs used in that simulation was 300 or fewer, and False otherwise.

np.count_nonzero(frog_counts <= 300) counts how many simulations (out of all the simulations) met the condition since True evaluates to 1 and False evaluates to 0.

Dividing by the total number of simulations, len(frog_counts), converts that count into a probability.
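
The three equivalent expressions can be checked against each other on made-up data (real frog_counts values would come from the simulation above; these are stand-ins):

```python
import numpy as np

np.random.seed(3)
# stand-in for frog_counts: 10,000 made-up simulation totals
frog_counts = np.random.randint(250, 600, 10000)

p1 = np.count_nonzero(frog_counts <= 300) / len(frog_counts)
p2 = np.count_nonzero(frog_counts <= 300) / 10000
p3 = (frog_counts <= 300).mean()   # a boolean array's mean is the proportion of Trues
```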


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 59%.


Problem 5.4

True or False: The Central Limit Theorem states that the data in frog_counts is roughly normally distributed.

Answer: False

The Central Limit Theorem (CLT) says that the probability distribution of the sum or mean of a large random sample drawn with replacement will be roughly normal, regardless of the distribution of the population from which the sample is drawn.

The Central Limit Theorem (CLT) does not claim that individual observations are normally distributed. In this problem, each entry of frog_counts is a single observation: the number of frogs purchased in one simulation to complete the collection. There is no requirement that these individual data points themselves follow a normal distribution.

However, if we repeatedly took many samples of such observations and computed the sample mean each time, the distribution of those means would tend toward a normal distribution as the sample size grows, following the CLT.


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 38%.



Problem 6

Professor Severus Snape is rumored to display favoritism toward certain students. Specifically, some believe that he awards more house points to students from wizarding families (those with at least one wizarding parent) than students from muggle families (those without wizarding parents).

To investigate this claim, you will perform a permutation test with these hypotheses:

The DataFrame snape is indexed by "Student" and contains information on each student’s family background ("Family") and the number of house points awarded by Snape ("Points"). The first few rows of snape are shown below.


Problem 6.1

Which of the following is the most appropriate test statistic for our permutation test?

Answer: The mean number of house points awarded to students from wizarding families minus the mean number of house points awarded to students from muggle families.

Let’s look at each of the options:

  • Option 1: Incorrect. This does not work as a test statistic because the size of each group is not taken into account. For example, if there are more wizarding students than muggle students, their total could be higher just because they have more students.
  • Option 2: Correct. The difference in mean number of house points between the two groups is the correct choice because it accounts for group size and tells us which group is being awarded more points.
  • Option 3: Incorrect. This will not help us find out if one of the two groups receives favoritism, since the size of the group does not have anything to do with how many house points they are awarded.
  • Option 4: Incorrect. This does not work as a test statistic because the absolute difference does not tell us which group is being awarded more points. For example, an absolute difference of 1 could mean that either wizarding families or muggle families are being awarded one more point on average.

Difficulty: ⭐️⭐️

The average score on this problem was 83%.


Problem 6.2

Fill in the blanks in the function one_stat, which calculates one value of the test statistic you chose in part (a), based on the data in df, which will have columns called "Family" and "Points".

    def one_stat(df):
        grouped = df.groupby(__(a)__).__(b)__
        return __(c)__

Answer:

  • (a): "Family"
  • (b): mean()
  • (c): grouped.get("Points").loc["Wizarding"] - grouped.get("Points").loc["Muggle"] or grouped.get("Points").iloc[1] - grouped.get("Points").iloc[0]

We first group by the "Family" column, which will create two groups, one for wizarding families and one for muggle families.

Using mean() as our aggregation function here will give us the mean of each of our two groups, allowing us to prepare for taking the difference in group means.

The grouped DataFrame will have two rows with "Wizarding" and "Muggle" as the index, and just one column "Points" which contains the mean of each group. We can either use .loc[] or .iloc[] to get each group mean, and then take the mean number of house points for wizarding families minus the mean number of house points for muggle families. Note that we cannot do this the other way around since our test statistic we chose in part (a) specifically mentions that order.
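
Filled in, with a tiny made-up DataFrame (not the real snape data) to demonstrate:

```python
import numpy as np
import pandas as pd

def one_stat(df):
    grouped = df.groupby("Family").mean()               # blanks (a) and (b)
    return (grouped.get("Points").loc["Wizarding"]      # blank (c)
            - grouped.get("Points").loc["Muggle"])

# hypothetical data for illustration only
demo = pd.DataFrame({"Family": ["Wizarding", "Muggle", "Wizarding", "Muggle"],
                     "Points": [20, 10, 30, 14]})
one_stat(demo)   # (20 + 30) / 2 - (10 + 14) / 2 = 25 - 12 = 13
```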


Difficulty: ⭐️⭐️

The average score on this problem was 81%.


Problem 6.3

Fill in the blanks in the function calculate_stats, which calculates 1000 simulated values of the test statistic you chose in part (a), under the assumptions of the null hypothesis. As before, df will have columns called "Family" and "Points".

    def calculate_stats(df):
        statistics = np.array([])

        for i in np.arange(1000):
            shuffled = df.assign(Points = __(d)__)
            stat = one_stat(__(e)__)
            statistics = __(f)__

        return statistics

Answer:

  • (d): np.random.permutation(df.get("Points"))
  • (e): shuffled
  • (f): np.append(statistics, stat)

Since we are performing a permutation test, we need to shuffle the "Points" column to simulate the null hypothesis, that Snape awards points independently of family background. Note that shuffling either "Family" or "Points" would work, but here the code specifies that the shuffled column is named Points.

Next, we pass shuffled into our one_stat function from part (b) to calculate the test statistic for shuffled.

We then store our test statistic in the statistics array, which will have 1000 simulated test statistics under the null hypothesis once the for loop finishes running.


Difficulty: ⭐️⭐️

The average score on this problem was 85%.


Problem 6.4

Fill in the blanks to calculate the p-value of the permutation test, based on the data in snape.

    observed = __(g)__
    simulated = __(h)__
    p_value = (simulated __(i)__ observed).mean()

Answer:

  • (g): one_stat(snape) or snape.groupby("Family").mean().get("Points").loc["Wizarding"] - snape.groupby("Family").mean().get("Points").loc["Muggle"] or snape.groupby("Family").mean().get("Points").iloc[1] - snape.groupby("Family").mean().get("Points").iloc[0]
  • (h): calculate_stats(snape)
  • (i): >=

Our observed value is the test statistic computed from the original observed data. We already wrote a function to calculate the test statistic, one_stat, so we just apply it to our observed data, giving one_stat(snape) in blank (g). You can also calculate the observed statistic manually by applying the formula directly, giving either snape.groupby("Family").mean().get("Points").loc["Wizarding"] - snape.groupby("Family").mean().get("Points").loc["Muggle"] or snape.groupby("Family").mean().get("Points").iloc[1] - snape.groupby("Family").mean().get("Points").iloc[0].

The simulated variable is simply the array of simulated test statistics. We already created a function to run the simulation, calculate_stats, so we just need to call it on our data, giving calculate_stats(snape) in blank (h).

Finally, the p-value is the probability of seeing a result as extreme as or more extreme than our observed statistic, under the null. The alternative hypothesis states that Snape awards more points to students from wizarding families, and the test statistic is the mean points for wizarding families minus the mean points for muggle families, so a more extreme result means a larger simulated statistic. Thus, we use the >= operator in blank (i) to mark the simulated statistics that are greater than or equal to the observed value. Taking the mean of this boolean array gives the proportion of results at least as extreme as the observed one.
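
End to end, the whole permutation test can be sketched as below, with one_stat and calculate_stats completed as in parts (b) and (c), and a small made-up DataFrame standing in for the real snape data:

```python
import numpy as np
import pandas as pd

def one_stat(df):
    grouped = df.groupby("Family").mean()
    return grouped.get("Points").loc["Wizarding"] - grouped.get("Points").loc["Muggle"]

def calculate_stats(df):
    statistics = np.array([])
    for i in np.arange(1000):
        shuffled = df.assign(Points=np.random.permutation(df.get("Points")))
        stat = one_stat(shuffled)
        statistics = np.append(statistics, stat)
    return statistics

# made-up stand-in for the real snape DataFrame
snape = pd.DataFrame({"Family": ["Wizarding", "Muggle"] * 10,
                      "Points": np.arange(20)})

observed = one_stat(snape)                  # blank (g)
simulated = calculate_stats(snape)          # blank (h)
p_value = (simulated >= observed).mean()    # blank (i)
```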


Difficulty: ⭐️⭐️

The average score on this problem was 80%.




Problem 6.5

Define mini_snape = snape.take(np.arange(3)) as shown below.

Determine the value of the following expression.

len(calculate_stats(mini_snape))

Answer: 1000

This problem asks for the length of the array returned by calculate_stats(mini_snape). Looking at calculate_stats, we know that it calculates a simulated test statistic and appends it to the output array 1000 times, as indicated by the for loop, regardless of the size of the input DataFrame. Thus, the output of the function is an array of size 1000.


Difficulty: ⭐️⭐️

The average score on this problem was 78%.


Problem 6.6

With mini_snape defined as above, there will be at most three unique values in calculate_stats(mini_snape). What are those three values? Put the smallest value on the left and the largest on the right.

Answer: -5, -2, and 7

We are trying to find the three unique values outputted by calculate_stats(mini_snape). The function shuffles the "Points" column, reassigning point values to the "Wizarding" and "Muggle" labels. Since mini_snape has two "Wizarding" rows and one "Muggle" row, each shuffle places two point values with "Wizarding" and one with "Muggle".

There are only three distinct ways to split three values into a group of two and a group of one (for data A, B, C: AB and C, AC and B, BC and A), so only three test statistic values are possible. For each split, we compute the test statistic as the mean of the two "Wizarding" values minus the single "Muggle" value (no mean is needed for "Muggle" since it has only one value):

Scenario 1:

  • Wizarding: 15, 7
  • Muggle: 13

\begin{align*} \frac{15+7}{2} &= 11 \\ 11 - 13 &= -2 \end{align*}

Scenario 2:

  • Wizarding: 13, 7
  • Muggle: 15

\begin{align*} \frac{13+7}{2} &= 10 \\ 10 - 15 &= -5 \end{align*}

Scenario 3:

  • Wizarding: 13, 15
  • Muggle: 7

\begin{align*} \frac{13+15}{2} &= 14 \\ 14 - 7 &= 7 \end{align*}

Thus, our answer from least to greatest is -5, -2, and 7.
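
This enumeration can be confirmed by brute force over every shuffle of the three point values (13, 15, and 7, as in the scenarios above), always assigning two values to Wizarding and one to Muggle:

```python
from itertools import permutations

points = [13, 15, 7]   # the three Points values in mini_snape
stats = set()
for p in permutations(points):
    wizarding, muggle = p[:2], p[2]          # two Wizarding rows, one Muggle row
    stats.add(sum(wizarding) / 2 - muggle)   # mean of Wizarding minus Muggle

sorted(stats)   # [-5.0, -2.0, 7.0]
```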


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 66%.



Problem 7

Professor Minerva McGonagall, head of Gryffindor, may also be awarding house points unfairly. For this question, we’ll assume that all four of the houses contain the same number of students, and we’ll investigate whether McGonagall awards points equally to all four houses.

Below is the distribution of points that Professor McGonagall awarded during the last academic year.

You want to test the following hypotheses:


Problem 7.1

Which of the following test statistics is appropriate for this hypothesis test? Select all that apply.

Answer: Option 2 and 5.

This problem asks for test statistics that can distinguish data generated under the null from data better supported by the alternative. Since the alternative hypothesis simply states the distribution “is not uniform across all of the houses”, we do not need to detect differences between specific houses, but rather an overall measure of how far the points awarded are from uniform.

Option 1: This option only tells us information about Gryffindor and nothing about inequalities among the other houses. For example, if Gryffindor received 25% of the points and Slytherin received the other 75%, we would not be able to tell this apart from the case where all houses received 25% of the points.

Option 2: Under the null, points are awarded uniformly, so all houses receive about the same amount and the difference between the top and bottom houses would be near 0. However, if one house gains more points than the rest, the difference between the highest and lowest houses will be greater than 0. Therefore, this statistic grows when the alternative is true, making it appropriate.

Option 3: This test statistic measures the sum of the squared differences in proportions between McGonagall’s distribution and [0.5, 0.5, 0.5, 0.5]. However, [0.5, 0.5, 0.5, 0.5] does not represent a valid probability distribution because the total sum exceeds 1. As a result, this test statistic is not meaningful in assessing whether the point distribution is uniform across houses.

Option 4: When we sum the raw (signed) differences, positive and negative deviations cancel out, hiding “how far” the data is from uniform. For example, [0.25, 0.25, 0.25, 0.25] is what we’d expect under the null, while [0.0, 0.5, 0.0, 0.5] strongly supports the alternative; yet this test statistic gives 0 for both, failing to differentiate them.

Option 5: Under the null hypothesis, [0.25, 0.25, 0.25, 0.25] is the expected distribution. Since the alternative hypothesis states that McGonagall’s point assignments are not uniform, a house supports the alternative if its proportion of points is significantly less than or greater than 0.25. Squaring the differences makes the test statistic larger in either case, so this is a valid statistic.


Difficulty: ⭐️⭐️

The average score on this problem was 78%.


For the rest of this problem, we will use the following test statistic:

The sum of the absolute differences in proportions between McGonagall’s distribution and [0.25, 0.25, 0.25, 0.25].


Problem 7.2

Choose the correct way to implement the function calculate_test_stat, which takes in two distributions as arrays and returns the value of this test statistic.

    def calculate_test_stat(dist_1, dist_2):
        return _____

Answer: sum(np.abs(dist_1 - dist_2))

A valid test statistic for this problem measures how far dist_1 differs from dist_2. Since we’re looking at “how different” the distributions are, we take the absolute value of each difference (a measure of distance) and then add the results up. Because dist_1 - dist_2 is an array, we use np.abs, which computes absolute values elementwise.
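A minimal sketch of the completed function, checked on a made-up pair of distributions:

```python
import numpy as np

def calculate_test_stat(dist_1, dist_2):
    # Sum of the absolute differences between corresponding proportions.
    return sum(np.abs(dist_1 - dist_2))

# Made-up example: each entry differs from 0.25 by 0.15, 0.05, 0.05, 0.15,
# so the statistic is approximately 0.4.
observed = np.array([0.4, 0.2, 0.3, 0.1])
uniform = np.array([0.25, 0.25, 0.25, 0.25])
stat = calculate_test_stat(observed, uniform)  # ≈ 0.4
```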


Difficulty: ⭐️⭐️

The average score on this problem was 84%.


Problem 7.3

(10 pts) Fill in the blanks in the code below so that simulated_ts is an array containing 10,000 simulated values of the test statistic under the null. Note that your answer to blank (c) is used in more than one place in the code.

    mc_gon = np.arange(__(a)__) # Careful: np.arange, not np.array!
    null = np.array([0.25, 0.25, 0.25, 0.25])
    observed_ts = calculate_test_stat(__(b)__)

    simulated_ts = np.array([])

    for i in np.arange(10000):
        sim = np.random.multinomial(__(c)__, __(d)__) / __(c)__
        one_simulated_ts = calculate_test_stat(__(e)__)
        simulated_ts = np.append(simulated_ts, one_simulated_ts)

What goes in blank (a)?

Answer: 0.1, 0.5, 0.1 (or equivalent)

Based on the distribution shown in the bar chart above, we want our resulting array to contain the proportions [0.4, 0.2, 0.3, 0.1] (40/100 for Gryffindor, 20/100 for Hufflepuff, 30/100 for Ravenclaw, and 10/100 for Slytherin). Note that the order of these proportions does not matter because: 1) we are calculating the absolute difference between each value and the null proportion (0.25), and 2) we will sum all the differences together.

Since these proportions are evenly spaced, we can use np.arange() to construct mc_gon. There are multiple correct approaches to this problem, as long as the resulting array contains all four proportions. Some alternative correct approaches include:

  • np.arange(0.4, 0.0, -0.1)
  • np.arange(0.1, 0.41, 0.1) (the middle argument can be any value greater than 0.4 and less than or equal to 0.5)


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 57%.


Problem 7.4

What goes in blank (b)?

Answer: null, mc_gon

Note that the order of mc_gon and null does not matter, as calculate_test_stat calculates the absolute difference between the two.


Difficulty: ⭐️⭐️

The average score on this problem was 78%.


Problem 7.5

What goes in blank (c)?

Answer: 100

Blank (c) is the number of trials in each simulated sample. We use 100 because each simulated sample must be the same size as the observed sample of 100 points; otherwise the simulated test statistics would not be comparable to the observed one. The same value also divides the resulting counts, converting them into proportions.


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 48%.


Problem 7.6

What goes in blank (d)?

Answer: null

Blank (d) is null because each simulated sample is generated under the null hypothesis. This means the probabilities used in np.random.multinomial should match the expected proportions from the null distribution.


Difficulty: ⭐️⭐️

The average score on this problem was 77%.


Problem 7.7

What goes in blank (e)?

Answer: sim, null

Note that the order of null and sim does not matter, as calculate_test_stat calculates the absolute difference between the two.
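Putting the answers for blanks (a) through (e) together, the completed simulation from Problem 7.3 would look like the following sketch, where mc_gon holds the observed proportions [0.4, 0.2, 0.3, 0.1] read off the bar chart:

```python
import numpy as np

def calculate_test_stat(dist_1, dist_2):
    return sum(np.abs(dist_1 - dist_2))

mc_gon = np.arange(0.1, 0.5, 0.1)  # array([0.1, 0.2, 0.3, 0.4])
null = np.array([0.25, 0.25, 0.25, 0.25])
observed_ts = calculate_test_stat(null, mc_gon)

simulated_ts = np.array([])
for i in np.arange(10000):
    # Draw 100 points under the null, then convert counts to proportions.
    sim = np.random.multinomial(100, null) / 100
    one_simulated_ts = calculate_test_stat(sim, null)
    simulated_ts = np.append(simulated_ts, one_simulated_ts)
```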


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 67%.


Problem 7.8

Fill in the blank so that reject_null evaluates to True if we reject the null hypothesis at the 0.05 significance level, and False otherwise.

    reject_null = __(f)__

Answer: (simulated_ts >= observed_ts).mean() <= 0.05

reject_null should evaluate to a Boolean, so we must test whether our p-value is less than or equal to 0.05. Taking the mean of (simulated_ts >= observed_ts) gives the proportion of simulated test statistics that are equal to the value observed in the data or even further in the direction of the alternative; this proportion is the p-value.
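The decision rule can be sketched as follows, using small toy arrays for illustration rather than real simulation results:

```python
import numpy as np

# Toy values for illustration only:
observed_ts = 0.4
simulated_ts = np.array([0.1, 0.2, 0.45, 0.3, 0.5])

# Proportion of simulated statistics at least as extreme as the observed one.
p_value = (simulated_ts >= observed_ts).mean()  # 2 out of 5, i.e. 0.4
reject_null = p_value <= 0.05                   # False for these toy values
```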


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 64%.


Problem 7.9

Your friend performs the same hypothesis test as you, but uses the total variation distance (TVD) as their test statistic instead of the one described in the problem. Which of the following statements is true?

Answer: Option 4.

TVD is calculated by taking the sum of the absolute differences between two distributions’ proportions, all divided by 2. Therefore the only difference between the TVD and our test statistic is that the TVD is divided by 2, which makes it smaller. The reason it is smaller is thus not related to magnitude or direction.
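A quick sketch of the relationship, using a made-up pair of distributions; the factor-of-2 scaling holds for any pair:

```python
import numpy as np

def calculate_test_stat(dist_1, dist_2):
    return sum(np.abs(dist_1 - dist_2))

def tvd(dist_1, dist_2):
    # Total variation distance: the same sum, divided by 2.
    return sum(np.abs(dist_1 - dist_2)) / 2

a = np.array([0.4, 0.2, 0.3, 0.1])
b = np.array([0.25, 0.25, 0.25, 0.25])
# The TVD is always exactly half of our test statistic.
our_stat = calculate_test_stat(a, b)  # ≈ 0.4
tvd_stat = tvd(a, b)                  # ≈ 0.2
```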


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 63%.



Problem 8

Professor Filius Flitwick is conducting a study whose results will be used to help new Hogwarts students select appropriately sized broomsticks for their flying lessons. Professor Flitwick measures several wizards’ heights and broomstick lengths, both in centimeters. Since broomsticks can only be purchased in specific lengths, the scatterplot of broomstick length vs. height has a pattern of horizontal stripes:

If we group the wizards in Professor Flitwick’s research study by their broomstick length, and average the heights of the wizards in each group, we get the following results.

It turns out that the regression line that predicts broomstick length (y) based on wizard height (x) passes through the four points representing the means of each group. For example, the first row of the DataFrame above means that (165, 150) is a point on the regression line, as you can see in the scatterplot.


Problem 8.1

Based only on the fact that the regression line goes through these points, which of the following could represent the relationship between the standard deviation of broomstick length (y) and wizard height (x)? Select all that apply.

Answer: Options 4 and 5.

To solve this problem, we use the relationship between the slope of the regression line, the correlation coefficient r, and the standard deviations:

\text{slope} = r \cdot \frac{\text{SD}(y)}{\text{SD}(x)}

From the mean points given, we can calculate the slope:

\frac{225 - 150}{187.5 - 165.0} = \frac{75}{22.5} = \frac{10}{3}

We set up the equation:

r \cdot \frac{\text{SD}(y)}{\text{SD}(x)} = \frac{10}{3}

Now consider each option:

  • If \text{SD}(y) = \text{SD}(x): r = \frac{10}{3} \text{(not valid, since } r > 1\text{)}

  • If \text{SD}(y) = 2 \cdot \text{SD}(x): r \cdot 2 = \frac{10}{3} \Rightarrow r = \frac{5}{3} \approx 1.67 \quad \text{(not valid, since } r > 1\text{)}

  • If \text{SD}(y) = 3 \cdot \text{SD}(x): r \cdot 3 = \frac{10}{3} \Rightarrow r = \frac{10}{9} \approx 1.11 \quad \text{(not valid, since } r > 1\text{)}

  • If \text{SD}(y) = 4 \cdot \text{SD}(x): r \cdot 4 = \frac{10}{3} \Rightarrow r = \frac{10}{12} = \frac{5}{6} \approx 0.833 \quad \text{(valid)}

  • If \text{SD}(y) = 5 \cdot \text{SD}(x): r \cdot 5 = \frac{10}{3} \Rightarrow r = \frac{10}{15} = \frac{2}{3} \approx 0.667 \quad \text{(valid)}

Therefore, \text{SD}(y) = 4 \cdot \text{SD}(x) and \text{SD}(y) = 5 \cdot \text{SD}(x) are the only valid options.
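The option-by-option arithmetic above can be checked numerically; the ratios 1 through 5 correspond to the answer choices:

```python
# Slope of the regression line from the two mean points on it:
slope = (225 - 150) / (187.5 - 165.0)  # 75 / 22.5 = 10/3

# For each candidate ratio SD(y) = k * SD(x), slope = r * k, so r = slope / k.
# An option is valid only if |r| <= 1.
for k in [1, 2, 3, 4, 5]:
    r = slope / k
    print(k, round(r, 3), abs(r) <= 1)
```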


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 64%.


Problem 8.2

Now suppose you know that SD(y) = 3.5 \cdot SD(x). What is the correlation coefficient, r, between these variables? Give your answer as a simplified fraction.

Answer: \frac{20}{21}

We use the formula for slope:

\text{slope} = r \cdot \frac{\text{SD}(y)}{\text{SD}(x)}

From the mean points given, we can calculate the slope:

\frac{225 - 150}{187.5 - 165.0} = \frac{75}{22.5} = \frac{10}{3}

Since \text{SD}(y) = 3.5 \cdot \text{SD}(x), we plug this into the slope formula:

r \cdot 3.5 = \frac{10}{3}

Solving for r:

\begin{align*} r &= \frac{10}{3} \cdot \frac{1}{3.5} \\ &= \frac{10}{3} \cdot \frac{2}{7} \\ &= \frac{20}{21} \end{align*}
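The arithmetic can be verified exactly with Python's fractions module:

```python
from fractions import Fraction

# Slope from the mean points, kept exact: 75 / 22.5 = 10/3 (22.5 = 45/2).
slope = Fraction(75) / Fraction(45, 2)
# SD(y) = 3.5 * SD(x) and slope = r * 3.5, so r = slope / 3.5 (3.5 = 7/2).
r = slope / Fraction(7, 2)
r  # Fraction(20, 21)
```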


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 56%.


Problem 8.3

Suppose we convert all wizard heights from centimeters to inches (1 inch = 2.54 cm). Which of the following will change? Select all that apply.

Answer: Options 1, 4. and 5.

  • Standard deviation of wizard heights will change because heights are rescaled by a factor of 1/2.54.
  • Proportion within 3 standard deviations does NOT change since z-scores are unitless and the transformation is linear.
  • Correlation remains unchanged since it’s a unitless measure of linear relationship.
  • Slope predicting broom length from height will change because the x-variable (height) is rescaled while y remains the same.
  • Slope predicting height from broom length will change because the output is now in different units.

Difficulty: ⭐️⭐️

The average score on this problem was 80%.


Problem 8.4

Suppose we convert all wizard heights and all broomstick lengths from centimeters to inches (1 inch = 2.54 cm). Which of the following will change, as compared to the original data when both variables were measured in centimeters? Select all that apply.

Answer: None of the above

  • Correlation remains the same because it’s unitless.
  • Both slopes are unchanged because both SD(y) and SD(x) are rescaled by the same factor (1/2.54), so the ratio SD(y)/SD(x) stays the same. When both variables are converted by the same factor, their relative relationship remains unchanged.

Difficulty: ⭐️

The average score on this problem was 95%.


Problem 8.5

Professor Flitwick calculates the root mean square error (RMSE) for his regression line to be 36 cm. What does this RMSE value suggest about the accuracy of the regression line’s broomstick length predictions?

Answer: None of the above

RMSE is the square root of the average squared differences between predicted and actual values. None of the options accurately describes what RMSE represents because:

  • It’s not the average absolute error (which would be MAE, not RMSE)
  • It doesn’t mean every residual equals 36 cm
  • It doesn’t tell us that predictions vary by wizard height
  • It does provide information about prediction accuracy, but not in the ways described

RMSE gives us the typical size of the error in the same units as the response variable. It tells us that the typical prediction error is around 36 cm, but this is not the same as any of the given options.


Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 7%.


Problem 8.6

Which of the following plots is the residual plot for Professor Flitwick’s data?

Answer: Option B

A residual plot shows the difference between actual and predicted values plotted against the predictor variable (x).

Since broomsticks come in specific sizes (150, 175, 200, 225 cm), the residuals form one band per broomstick length. Within each group, the actual value is constant while the predicted value increases with height, so the residuals decrease as height increases, producing downward-sloping diagonal bands.

  • Option A shows diagonal bands, but they don’t match the downward-sloping pattern we expect from the discrete broomstick lengths.
  • Option B shows diagonal bands of residuals that trend downward as height increases. This pattern matches what we would expect given the broomstick lengths and indicates some under/over-prediction at different height values.
  • Option C shows vertical columns of fairly random residuals, which doesn’t capture the relationship between residuals and height we would expect.

Difficulty: ⭐️⭐️⭐️

The average score on this problem was 59%.



Problem 9

The Triwizard Tournament is an international competition between three wizarding academies: Hogwarts, Durmstrang, and Beauxbatons.

In a Triwizard Tournament, wizards from each school compete in three dangerous magical challenges. If one school wins two or more challenges, that school is the tournament champion. Otherwise, there is no champion, since each school won a single challenge.

The DataFrame triwiz has a row for each challenge from the first 20 Triwizard Tournaments. With 20 tournaments each having 3 challenges, triwiz has exactly 60 rows. The first six rows are shown below.

The columns are:


Problem 9.1

(10 pts) Fill in the blanks below to create the DataFrame champions, which is indexed by "Winner" and has just one column, "Year", containing the number of years in which each school was the tournament champion. champions is shown in full below.

Note that the values in the "Year" column add up to 14, not 20. That means there were 6 years in which there was a tie (for example, 1299 was one such year).

    grouped = triwiz.groupby(__(a)__).__(b)__.__(c)__
    filtered = grouped[__(d)__] 
    champions = filtered.groupby(__(e)__).__(f)__.__(g)__

What goes in blank (a)?

Answer: ["Year", "Winner"] or ["Winner", "Year"]

Grouping by both the "Year" and "Winner" columns ensures that each school’s win in a given year is represented as a single row.


Difficulty: ⭐️⭐️

The average score on this problem was 89%.


Problem 9.2

What goes in blank (b)?

Answer: count()

Since each winner in a given year appears as a single row, we use count() to determine how many times each school won that year.


Difficulty: ⭐️

The average score on this problem was 90%.


Problem 9.3

What goes in blank (c)?

Answer: reset_index()

Grouping by multiple columns creates a multi-index. reset_index() flattens the DataFrame back to normal, where each row represents a given year for each winning school.


Difficulty: ⭐️⭐️

The average score on this problem was 80%.


Problem 9.4

What goes in blank (d)?

Answer: grouped.get("Host") != 1 or grouped.get("Host") > 1 or grouped.get("Host") >= 2

The Triwizard Tournament champion is defined as a school that wins two or more challenges in a given year. After grouping with .count(), every other column contains the same value: the number of challenges that winning school won that year. A school will not appear in this DataFrame at all if it did not win any challenges that year, so we only need to check that the value in the other columns is not 1. grouped.get("Challenge") != 1 is also valid, because all of the other columns contain the same count.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 71%.


Problem 9.5

What goes in blank (e)?

Answer: "Winner"

The resulting DataFrame should be indexed by "Winner", therefore the DataFrame is grouped by the "Winner" column.


Difficulty: ⭐️

The average score on this problem was 90%.


Problem 9.6

What goes in blank (f)?

Answer: count()

Grouping with .count() again ensures that the resulting columns represent the number of times each "Winner" (school) in the index won across all years.


Difficulty: ⭐️⭐️

The average score on this problem was 84%.


Problem 9.7

What goes in blank (g)?

Answer: get(["Year"])

The question asks for a DataFrame with "Year" as the only column, so brackets are used around "Year" to ensure the output is a DataFrame rather than a Series.
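Assembled and run on a small made-up triwiz (plain pandas is used here as a stand-in for babypandas, with [] indexing in place of .get), the pipeline might look like:

```python
import pandas as pd

# Made-up miniature triwiz: two tournaments, three challenges each.
# In 1294 Hogwarts wins two challenges (champion); 1295 is a three-way tie.
triwiz = pd.DataFrame({
    "Year": [1294, 1294, 1294, 1295, 1295, 1295],
    "Challenge": [1, 2, 3, 1, 2, 3],
    "Host": ["Hogwarts"] * 3 + ["Durmstrang"] * 3,
    "Winner": ["Hogwarts", "Hogwarts", "Durmstrang",
               "Hogwarts", "Durmstrang", "Beauxbatons"],
})

grouped = triwiz.groupby(["Year", "Winner"]).count().reset_index()
filtered = grouped[grouped["Challenge"] != 1]
champions = filtered.groupby("Winner").count()[["Year"]]
# champions: indexed by "Winner", with a single "Year" column;
# here only Hogwarts was a champion, in 1 year.
```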


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 69%.


Problem 9.8

How many rows are in the DataFrame that results from merging triwiz with itself on "Year"? Give your answer as an integer.

Answer: 180

The most important part of this question is understanding how merge works in babypandas. Start by reviewing this diagram from lecture.

When we merge two DataFrames together by "Year", we are matching every row in triwiz with every other row that has the same value in the "Year" column. This means that for each year, we’ll match all the rows from that year with each other. Since there are three challenges per year, that means that each year appears 3 times in the DataFrame. Since we are matching all rows from each year with each other, this means we will end up with 3 * 3 or 9 rows per year. Since there are 20 years in the DataFrame, we can multiply these together to get 180 total rows in the merged DataFrame.
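This count can be verified on a small made-up table with the same shape per year (pandas used as a stand-in for babypandas):

```python
import pandas as pd

# Two years, three challenges each: 6 rows total.
mini = pd.DataFrame({
    "Year": [1294, 1294, 1294, 1295, 1295, 1295],
    "Challenge": [1, 2, 3, 1, 2, 3],
})

# Each of the 3 rows for a year matches all 3 rows with that year:
# 3 * 3 = 9 rows per year, times 2 years = 18 rows.
merged = mini.merge(mini, on="Year")
# With 20 years, the same logic gives 9 * 20 = 180 rows.
```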


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 69%.


Problem 9.9

How many rows are in the DataFrame that results from merging triwiz with itself on "Challenge"? Give your answer as an integer.

Answer: 1200

Similar to the previous part, we are now matching all rows from a given challenge with each other. There are 3 challenges per tournament, so the values in the "Challenge" column are 1, 2, and 3. Each value appears 20 times, once for each year. As a result, for each of the 3 challenges there are 20 * 20 or 400 rows. Therefore, we have 400 * 3 = 1200 rows total.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 59%.


Problem 9.10

How many rows are in the DataFrame that results from merging triwiz with itself on "Host"? Select the expression that evaluates to this number.

Answer: 2 * 21^2 + 18^2

The key to this problem is determining how many times each school hosts the tournament within this dataset. It is stated that the host is determined on a rotating basis, and based on the DataFrame description, the order is Hogwarts, Durmstrang, then Beauxbatons. Since there are only 20 years in this dataset, the last school in the rotation hosts one fewer time than the other two. Thus Hogwarts hosts 7 times, Durmstrang hosts 7 times, and Beauxbatons hosts 6 times. Since each year a school hosts contributes three rows to the DataFrame, the schools appear 21 times, 21 times, and 18 times respectively. As in the previous parts, merging matches all rows with a given host to each other, so the total number of rows is 21^2 + 21^2 + 18^2 = 2 * 21^2 + 18^2. This matches the last answer choice.


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 43%.



Problem 10

Bertie Bott’s Every Flavor Beans are a popular treat in the wizarding world. They are jellybean candies sold in boxes of 100 beans, containing a variety of flavors including chocolate, peppermint, spinach, liver, grass, earwax, and paper. Luna’s favorite flavor is bacon.

Luna wants to estimate the proportion of bacon-flavored beans produced at the Bertie Bott’s bean factory. She buys a box of Bertie Bott’s Every Flavor Beans and finds that 4 of the 100 beans inside are bacon-flavored. Using this sample, she decides to construct an 86\% CLT-based confidence interval for the proportion of bacon-flavored beans produced at the factory.


Problem 10.1

Let’s begin by solving a related problem that will help us in the later parts of this question. Consider the following fact:

For a sample of size 100 consisting of 0’s and 1’s, the maximum possible width of an 86\% CLT-based confidence interval is approximately 0.15.

Use this fact to find the value of z such that scipy.stats.norm.cdf(z) evaluates to 0.07. Give your answer as a number to one decimal place.

Answer: -1.5

The 86% confidence interval for the population mean is given by:

\left[ \text{sample mean} - |z| \cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}}, \ \text{sample mean} + |z| \cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}} \right]

Since the width is equal to the difference between the right and left endpoints,

\text{width} = 2 \cdot |z| \cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}}

We solve for |z|. The maximum width of our CI is given to be 0.15, which occurs when the sample SD takes its maximum possible value; for a sample of 0’s and 1’s, the maximum possible SD is 0.5. We substitute the known values to obtain:

0.15 = 2 \cdot |z| \cdot \frac{0.5}{\sqrt{100}}

which leaves |z| = 1.5 after computation. To find the z such that scipy.stats.norm.cdf(z) evaluates to 0.07, recall that z is the point, in standard units, to the left of which lies 7\% of the area under the standard normal curve. Note that scipy.stats.norm.cdf(0) evaluates to 0.5 (half of the area lies to the left of the mean, which is zero in standard units), so z must be negative. Thus z = -1.5.
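As a sanity check, the standard normal CDF can be computed with math.erf; this mirrors what scipy.stats.norm.cdf would return:

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

low = normal_cdf(-1.5)  # ≈ 0.0668, i.e. about 0.07
mid = normal_cdf(0)     # 0.5, since half the area lies left of the mean
```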


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 55%.



Problem 10.2

Suppose that Luna’s sample has a standard deviation of 0.2. What are the endpoints of her 86\% confidence interval? Give each endpoint as a number to two decimal places.

Answer: [0.01, \ 0.07]

Recall the formula for the width of an 86\% confidence interval:

\text{width} = 2 \cdot |z| \cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}}

where we found |z| = 1.5 in part (a). Instead of using the maximum sample SD, we will now use 0.2 and compute the new width of the confidence interval. This results in

\text{width} = 2 \cdot 1.5 \cdot \frac{0.2}{\sqrt{100}} = 0.06

Since this is a CLT-based confidence interval for the population mean, the interval must be centered at the mean. We compute the interval using the structure from part (a), which leaves

\left[ 0.04 - \frac{1}{2} \cdot 0.06, \ 0.04 + \frac{1}{2} \cdot 0.06 \right] = [0.01, \ 0.07]
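The same computation as a quick numerical check:

```python
# Width of the 86% CI with sample SD 0.2 and sample size 100:
width = 2 * 1.5 * 0.2 / (100 ** 0.5)  # 0.06

# The interval is centered at the sample mean, 4/100 = 0.04:
left = 0.04 - width / 2    # 0.01
right = 0.04 + width / 2   # 0.07
```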


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 37%.



Problem 10.3

Hermione thinks she can do a better job of estimating the proportion of bacon-flavored beans, though she’ll need a bigger sample to do so. Hermione will collect a new sample and use it to construct another 86\% confidence interval for the same parameter.

Under the assumption that Hermione’s sample will have the same standard deviation as Luna’s sample, which was 0.2, how many boxes of Bertie Bott’s Every Flavor Beans must Hermione buy to guarantee that the width of her 86\% confidence interval is at most 0.012? Give your answer as an integer.

Remember: There are 100 beans in each box.

Answer: 25 boxes

Recall the formula for the width of an 86\% confidence interval:

\text{width} = 2 \cdot |z| \cdot \frac{\text{sample SD}}{\sqrt{\text{sample size}}}

where we must again use the fact that |z| = 1.5 from part (a). Here, we want a width that is no larger than 0.012, given that our sample SD remains 0.2. Plugging everything in:

0.012 \geq 2 \cdot 1.5 \cdot \frac{0.2}{\sqrt{n}}

Rearranging the expression to solve for n, we get

\begin{align*} n &\geq \left( \frac{3 \cdot 0.2}{0.012} \right)^2 \\ n &\geq \left( \frac{600}{12} \right)^2 \\ n &\geq (50)^2 \\ n &\geq 2500 \end{align*}

However, 2500 isn’t our final answer. The question asks for the number of boxes Hermione must buy, given that each box contains 100 beans. The bound we computed above for n corresponds to the minimum number of beans Hermione must observe. To get the minimum number of boxes, we simply divide the bound by 100. The final answer is 25 boxes.


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 37%.



👋 Feedback: Find an error? Still confused? Have a suggestion? Let us know here.