Spring 2025 Midterm Exam



Instructor(s): Janine Tiefenbruck

This exam was administered in-person. Students were allowed one page of double-sided handwritten notes. No calculators were allowed. Students had 50 minutes to take this exam.


One month ago today, President Donald Trump announced his plan to impose so-called “reciprocal” tariffs on goods imported into the US from other countries. Tariffs are taxes charged by a government on imported goods, at a specific percentage of the value of the goods.

At a “Liberation Day” ceremony outside the White House, Trump held up a printed chart of countries, the tariffs they charge to the US, and the planned reciprocal tariff on goods from that country into the US. We have the data from that printed chart stored in the tariffs DataFrame, whose first few rows are shown below. tariffs contains 50 rows in total.

Note that the "Tariffs Charged to USA" and "Reciprocal Tariff" columns contain int values representing percentages. For example, a \$200 product from China would be subject to a 34\% tariff, which is a \$68 tax.

Assume that we have already run import babypandas as bpd and import numpy as np.


Problem 1


Problem 1.1

Which country has the highest "Reciprocal Tariff"? Write one line of code that evaluates to the name of this country.

Answer: tariffs.sort_values(by="Reciprocal Tariff", ascending=False).get("Country").iloc[0] or tariffs.set_index("Country").sort_values(by="Reciprocal Tariff", ascending=False).index[0]

To find the country with the highest “Reciprocal Tariff”, we need to first sort tariffs by the "Reciprocal Tariff" column in descending order: tariffs.sort_values(by="Reciprocal Tariff", ascending=False). After sorting in descending order, we know that the country with the highest reciprocal tariff will be in the first row. Then, to get the name of this country, we select the "Country" column using .get("Country"). This gives us a Series of countries in order of descending reciprocal tariffs, and we can access the first value in this Series by using .iloc[0]. Note that if we sort in ascending order instead, we can use .iloc[-1] to access the last value in the Series.

Alternatively, since each country appears only once in tariffs, we can start by setting the "Country" column as the index: tariffs.set_index("Country"). Then, we sort the resulting DataFrame by the "Reciprocal Tariff" column in descending order, as explained above. Since the country names are now in the index, the first row of this sorted DataFrame will have the country with the highest reciprocal tariff as its index value, which we can access by using .index[0]. Note that if we sort in ascending order instead, we can use .index[-1] to access the last value in the index.
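Both approaches can be tried on a small stand-in table. The sketch below uses pandas rather than babypandas (an assumption made so the example is self-contained; the methods used here have the same names in both), and the three rows are hypothetical placeholders, not the actual chart data.

```python
import pandas as pd

# Hypothetical stand-in for tariffs; the values are made up for illustration.
toy = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Reciprocal Tariff": [10, 46, 25],
})

# Approach 1: sort descending, take the "Country" column, grab the first value.
first = toy.sort_values(by="Reciprocal Tariff", ascending=False).get("Country").iloc[0]

# Approach 2: move "Country" into the index, sort, then read the first index label.
second = toy.set_index("Country").sort_values(by="Reciprocal Tariff", ascending=False).index[0]

print(first, second)
```

Both expressions evaluate to the same country name, here the placeholder "B".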


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 71%.


Problem 1.2

How many countries have a "Reciprocal Tariff" above 30\%? Write one line of code that evaluates to the number of such countries.

Answer: tariffs[tariffs.get("Reciprocal Tariff")>30].shape[0]
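The expression works in two steps: the comparison produces a Series of Booleans, and indexing tariffs with that Series keeps only the rows where it is True. A minimal sketch, assuming pandas in place of babypandas and using made-up values:

```python
import pandas as pd

# Hypothetical stand-in for tariffs; the values are made up for illustration.
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Reciprocal Tariff": [10, 46, 30, 31],
})

# Boolean Series: True where the tariff is strictly above 30.
mask = toy.get("Reciprocal Tariff") > 30

# Keep only the True rows, then count them.
count = toy[mask].shape[0]
print(count)
```

Note that a tariff of exactly 30 is not counted, since the comparison is strict.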


Difficulty: ⭐️⭐️

The average score on this problem was 89%.



Problem 2

In tariffs, we use integers to represent percentages, but we could also use strings with the percent symbol %. For example, the integer 34 and the string "34%" both represent the same thing.


Problem 2.1

Fill in the functions with_percent_symbol and without_percent_symbol below. The function with_percent_symbol should take as input an integer and give as output a string with the percent symbol. The function without_percent_symbol should do the opposite. Example behavior is given below.

    >>>  with_percent_symbol(34)             
    "34%"    
    >>> without_percent_symbol("34%")
    34                       

    def with_percent_symbol(x):         
        return __(a)__                        

    def without_percent_symbol(x):
        return __(b)__  

Answer (a): str(x)+"%"


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 70%.

Answer (b): int(x.replace("%", ""))


Difficulty: ⭐️⭐️

The average score on this problem was 79%.
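With both blanks filled in, the two functions can be checked against the example behavior given in the problem. This is only a sketch of the answers above put together:

```python
def with_percent_symbol(x):
    # Convert the int to a string first; adding an int to a str raises a TypeError.
    return str(x) + "%"

def without_percent_symbol(x):
    # Remove the percent symbol, then convert the remaining digits to an int.
    return int(x.replace("%", ""))

print(with_percent_symbol(34))
print(without_percent_symbol("34%"))
```

The two functions undo each other, so without_percent_symbol(with_percent_symbol(34)) gives back 34.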


Problem 2.2

Define the variable y as follows.

    y = tariffs.get("Reciprocal Tariff").apply(with_percent_symbol)

Below, define z so that it evaluates to exactly the same Series as y. You may not use with_percent_symbol or y when defining z.

Answer: z = tariffs.get("Reciprocal Tariff").apply(str) + "%"
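Here .apply(str) converts each int to a string, and adding "%" to a Series of strings appends the symbol to every element at once. A sketch, assuming pandas in place of babypandas and a made-up column; with_percent_symbol is redefined locally only to build the Series to compare against.

```python
import pandas as pd

def with_percent_symbol(x):
    return str(x) + "%"

# Hypothetical stand-in for tariffs.get("Reciprocal Tariff").
col = pd.Series([34, 20, 10])

y = col.apply(with_percent_symbol)
z = col.apply(str) + "%"  # element-wise string concatenation

print(list(y), list(z))
```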


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 33%.


Problem 2.3

Determine the value of the following expression.

    y.iloc[3] + " tax on goods from " + tariffs.get("Country").loc[3]

Answer: "32% tax on goods from Taiwan"


Difficulty: ⭐️⭐️

The average score on this problem was 88%.



Problem 3

Trump’s administration set the reciprocal tariffs based on tariffs charged to the USA.

For each country in tariffs, the value in the "Reciprocal Tariff" column is simply half of the value in the "Tariffs Charged to USA" column, rounded up to the next integer.

In addition, if the "Tariffs Charged to USA" is less than 20 percent, then the "Reciprocal Tariff" is set to 10 percent, so that no country’s reciprocal tariff is ever less than 10 percent.


Problem 3.1

Fill in the blanks in the function reciprocate which takes as input an integer representing the tariffs charged to the USA by a country, and returns an integer representing the reciprocal tariff that the US will impose on that country.

    def reciprocate(charged):
        half = int((charged + 1) / 2)
        if __(a)__:
            return __(b)__
        else:
            return __(c)__

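Answer (a): Any one of charged < 20, charged <= 20, half < 10, or half <= 10. All four conditions give the same output, because half equals 10 exactly when charged is 19 or 20, so returning 10 and returning half agree in those cases.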

Answer (b): 10

Answer (c): half


Difficulty: ⭐️⭐️

The average score on this problem was 79%.


Problem 3.2

Fill in the return statement of the function reciprocate_2 which behaves the same as reciprocate but is implemented differently. You may not call the reciprocate function.

    def reciprocate_2(charged):
        return __(d)__

Answer (d): max(int((charged + 1)/2), 10)
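Taking the max of half and 10 enforces the 10 percent floor without an if-statement. The sketch below fills in reciprocate using charged < 20 for blank (a), one of the accepted conditions, and compares the two functions on every integer input from 0 to 200, a range chosen only for illustration.

```python
def reciprocate(charged):
    half = int((charged + 1) / 2)  # half of charged, rounded up
    if charged < 20:
        return 10
    else:
        return half

def reciprocate_2(charged):
    # half is below 10 only when charged is below 19, so max applies the floor.
    return max(int((charged + 1) / 2), 10)

# True if the two functions agree on every input in the illustrative range.
agree = all(reciprocate(c) == reciprocate_2(c) for c in range(201))
print(reciprocate(67), reciprocate_2(67), agree)
```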


Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 27%.


Problem 3.3

Define the variable ch as follows.

    ch = tariffs.get("Tariffs Charged to USA")

You want to check that reciprocate and reciprocate_2 give the same outputs on all inputs in ch. Write an expression that evaluates to True if this is the case, and False otherwise.

Answer: (ch.apply(reciprocate) == ch.apply(reciprocate_2)).sum() == 50
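Comparing the two Series with == gives a Series of Booleans, and since True counts as 1, the sum equals the number of rows exactly when every pair of outputs matches. A sketch, assuming pandas in place of babypandas and a short made-up ch, so the sum is compared to the length of that Series rather than to 50:

```python
import pandas as pd

def reciprocate(charged):
    half = int((charged + 1) / 2)
    if charged < 20:
        return 10
    else:
        return half

def reciprocate_2(charged):
    return max(int((charged + 1) / 2), 10)

# Hypothetical stand-in for tariffs.get("Tariffs Charged to USA").
ch = pd.Series([67, 39, 10, 20, 19])

matches = ch.apply(reciprocate) == ch.apply(reciprocate_2)
all_same = matches.sum() == ch.shape[0]
print(all_same)
```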


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 39%.



Problem 4

Below is a density histogram displaying the distribution of reciprocal tariffs for each of the 50 countries on Trump’s chart. It was plotted with the argument bins=[10, 11, 15, 20, 40, 50].

Note that while the European Union is actually a group of many countries, it is counted as one country here.


Problem 4.1

How many countries have a reciprocal tariff of 10%?

Answer: 21


Difficulty: ⭐️⭐️

The average score on this problem was 82%.


Problem 4.2

Suppose we plotted the same histogram, except we changed the bins argument to bins = [8, 15, 22, 30, 40, 50]. What would be the height of the leftmost bar in this histogram? Give your answer as a number to two decimal places.

Answer: 0.06


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 52%.


Problem 4.3

The European Union is not actually one country, but a group of 27 countries. Imagine we were to replace the row of tariffs corresponding to the European Union with 27 rows representing each of the member countries (all with a 20\% reciprocal tariff), then plot a histogram of the reciprocal tariffs using bins = [10, 11, 15, 20, 40, 50].

Let h_{\text{new}} be the height of the rightmost bar in this histogram, and let h_{\text{old}} be the height of the rightmost bar in the original histogram shown above. Express h_{\text{new}} in terms of h_{\text{old}}.

Answer: h_{\text{new}} = \frac{50}{76} \cdot h_{\text{old}}
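In a density histogram, the height of a bar is the proportion of rows in its bin divided by the bin's width. Neither the number of rows in the rightmost bin nor its width changes; only the total number of rows grows, from 50 to 50 - 1 + 27 = 76, which scales the height by 50/76. The sketch below checks this with np.histogram on made-up values (only the row counts 50 and 76 and the bins come from the problem).

```python
import numpy as np

bins = [10, 11, 15, 20, 40, 50]

# Hypothetical reciprocal tariffs for 50 rows; exactly one row (standing in
# for the European Union) has the value 20.
old = np.array([10] * 21 + [12] * 5 + [17] * 4 + [20] * 1 + [30] * 14 + [45] * 5)

# Replace that single row with 27 rows, each with the value 20.
new = np.array([10] * 21 + [12] * 5 + [17] * 4 + [20] * 27 + [30] * 14 + [45] * 5)

# density=True scales heights so that each bar's area is its proportion of rows.
h_old = np.histogram(old, bins=bins, density=True)[0][-1]
h_new = np.histogram(new, bins=bins, density=True)[0][-1]

print(len(old), len(new), np.isclose(h_new, 50 / 76 * h_old))
```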


Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 19%.



Problem 5

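Suppose we have another DataFrame called trade_partners that has a row for every country that the United States trades with. trade_partners is indexed by "Country" and has two columns: "Continent", the continent each country is in, and "Proportion", a float representing the proportion of US imports coming from that country.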

All countries in tariffs are included in trade_partners (including "European Union"), but not all countries in trade_partners are included in tariffs. The first three rows of trade_partners are shown below.


Problem 5.1

Write one line of code to merge tariffs with trade_partners and store the result in merged.

Answer: merged = tariffs.merge(trade_partners, left_on="Country", right_index=True)

tariffs and trade_partners are both DataFrames that describe the US's relationship with other countries. Since both contain one row per country, we merge them on the country name. In tariffs, that is the "Country" column (left_on="Country"); in trade_partners, it is the index (right_index=True).


Difficulty: ⭐️⭐️

The average score on this problem was 80%.


Problem 5.2

How many rows does merged have?

Answer: 50

Since each DataFrame has exactly one row per country, the merged result will also have one row for every country they share. And because every country in tariffs appears in trade_partners (though not vice versa), the merged DataFrame will contain exactly as many rows as there are countries in tariffs (which is 50).
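The row count can be seen on a small example. This is a sketch with pandas standing in for babypandas and made-up rows: the stand-in for trade_partners has one extra country that does not appear in the stand-in for tariffs.

```python
import pandas as pd

# Hypothetical stand-ins; all names and values are made up for illustration.
tariffs_toy = pd.DataFrame({
    "Country": ["A", "B"],
    "Reciprocal Tariff": [34, 10],
})
trade_toy = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Continent": ["X", "X", "Y"],
    "Proportion": [0.5, 0.25, 0.25],
}).set_index("Country")

# Match the "Country" column on the left with the index on the right.
merged_toy = tariffs_toy.merge(trade_toy, left_on="Country", right_index=True)
print(merged_toy.shape[0])
```

The extra country "C" has no match in the left DataFrame, so it is dropped and the result has as many rows as the left DataFrame.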


Difficulty: ⭐️⭐️

The average score on this problem was 83%.


Problem 5.3

In which of the following DataFrames does the "Proportion" column sum to 1? Select all that apply.

Answer: trade_partners and trade_partners.groupby("Continent").sum()

This problem is best solved by working through each answer choice and eliminating the incorrect ones. In the problem statement, we are told that the "Proportion" column contains floats representing the proportion of US imports coming from each country.

  • trade_partners: This DataFrame has a row for every country the US trades with, so its proportions account for all US imports and sum to 1.

  • trade_partners.groupby("Continent").mean(): Each value is now the mean proportion among the countries of one continent. These averages no longer account for every country's proportion, so we cannot guarantee that the "Proportion" column sums to 1.

  • trade_partners.groupby("Continent").sum(): Each value is the total proportion for one continent. Every country's proportion is still counted exactly once, so the "Proportion" column still sums to 1.

  • merged: As we determined in the previous part, merged contains all of the rows of tariffs but not all of the rows of trade_partners. Since the proportions in the omitted rows are missing, the "Proportion" column in merged cannot sum to 1.
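The four cases can be compared numerically on a small example. This is a sketch with pandas standing in for babypandas; the countries, continents, and proportions are made up, chosen so that the full "Proportion" column sums to exactly 1.

```python
import pandas as pd

# Hypothetical stand-ins; all names and values are made up for illustration.
trade_toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["X", "X", "Y", "Y"],
    "Proportion": [0.5, 0.25, 0.125, 0.125],
}).set_index("Country")
tariffs_toy = pd.DataFrame({"Country": ["A", "C"], "Reciprocal Tariff": [34, 10]})
merged_toy = tariffs_toy.merge(trade_toy, left_on="Country", right_index=True)

total = trade_toy.get("Proportion").sum()
by_sum = trade_toy.groupby("Continent").sum().get("Proportion").sum()
by_mean = trade_toy.groupby("Continent").mean().get("Proportion").sum()
in_merged = merged_toy.get("Proportion").sum()

print(total, by_sum, by_mean, in_merged)
```

Only the first two totals equal 1: grouping with sum keeps every proportion, while the means and the merged rows leave part of the total out.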


Difficulty: ⭐️⭐️

The average score on this problem was 88%.


Problem 5.4

Write one line of code that would produce an appropriate data visualization showing the median reciprocal tariff for each continent.

Answer: merged.groupby("Continent").median().plot(kind="barh", y="Reciprocal Tariff");

This question calls for a visualization that shows the median reciprocal tariff for each continent. The first step is identifying which DataFrame to plot from. The problem links "Reciprocal Tariff", a column in tariffs, with "Continent", a column in trade_partners, so the merged DataFrame must be used.

merged has one row for each country rather than each continent. Thus, before plotting, we group merged by "Continent" and aggregate with median() to get the median reciprocal tariff for each continent. From there, all that is left is plotting. Since there is one categorical variable, "Continent", and one numerical variable, "Reciprocal Tariff", a bar chart is appropriate. Finally, because the DataFrame is indexed by continent after the groupby, the only column that needs to be specified in the call to plot is the y variable, "Reciprocal Tariff".


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 68%.



Problem 6

Most imported goods are transported to the US in shipping containers. The table below shows the probability that a randomly selected shipping container comes from a given location (continent or country). Note that the probability for each continent is the sum of the probabilities for each country in that continent.

For all parts of this question, you can leave your answer as an unsimplified mathematical expression.


Problem 6.1

Suppose you randomly select one shipping container. You are told that it comes from Asia but not from Japan. What is the probability that it comes from China?

Answer: \frac{0.14}{0.36}=\frac{7}{18}
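Knowing that the container comes from Asia but not from Japan shrinks the sample space to an event with probability 0.36, the denominator in the answer (that is, the probability of Asia minus the probability of Japan). Every container from China lies inside that event, so the conditional probability is the probability of China, 0.14, divided by 0.36. A quick exact check of the simplification, using Python's fractions module:

```python
from fractions import Fraction

p_china = Fraction(14, 100)           # P(China), from the answer above
p_asia_not_japan = Fraction(36, 100)  # P(Asia and not Japan), from the answer above

# P(China | Asia and not Japan) = P(China) / P(Asia and not Japan)
p = p_china / p_asia_not_japan
print(p)
```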


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 53%.


Problem 6.2

True or False: Selecting a shipping container from Vietnam and selecting a shipping container from Ireland are independent events.

Answer: False

If two events A and B are independent then P(A and B) = P(A)P(B). In this question we can define A as “selecting a shipping container from Vietnam” and B as “selecting a shipping container from Ireland”. Therefore (A and B) would be “selecting a (single) shipping container from both Vietnam and Ireland”. Since that is impossible, P(A and B) = 0 while P(A)P(B) = 0.012. Therefore these two events are not independent.


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 32%.


Problem 6.3

Suppose you randomly select two shipping containers, with replacement. What is the probability that exactly one comes from Germany and the other comes from somewhere in North America?

Answer: 0.03


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 46%.


Problem 6.4

Suppose you randomly select three shipping containers, with replacement. What is the probability that none of them comes from Asia?

Answer: 0.59^{3}
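Because the containers are selected with replacement, the three selections are independent. The probability that a single container does not come from Asia is 0.59 (one minus the probability of Asia), and for none of the three to come from Asia, this must happen three times in a row. A small sketch of the arithmetic:

```python
# Probability that one container does not come from Asia, from the answer above.
p_not_asia = 0.59

# Independent selections: multiply the probability once per container.
p_none = p_not_asia ** 3
print(round(p_none, 6))
```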


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 42%.



Problem 7

The announcement of the tariffs affected many products, one of which was the Nintendo Switch 2, a new video game console. Due to the tariffs, preorders of the Nintendo Switch 2 were put on hold so pricing could be reconsidered. In this problem, we’ll imagine a scenario in which Nintendo used this delay period to drum up excitement for their new product.

Suppose Nintendo arranges a contest to give away k of their new Switch 2 consoles. The contest is open to anyone and n people participate, with n > k. Everyone has an equal chance of winning, and nobody can win more than once. Jason and Ray both enter the contest, and they want to estimate the probability that they both win.


Problem 7.1

Fill in the blanks in the function giveaway so that it returns an estimate of the probability that Jason and Ray both win a Switch 2, when there are n participants and k prizes.

    def giveaway(n, k):
        count = 0
        for i in np.arange(10000):
            winners = np.random.choice(___(a)___)
            if ___(b)___:
                count = count + 1
        return ___(c)___

Answer (a): np.arange(n), k, replace=False

This makes sure that exactly k winners are chosen randomly from n participants without replacement, since no person can win more than once.

Answer (b): 0 in winners and 1 in winners (can be any two numbers)

Assuming Jason and Ray are represented by the numbers 0 and 1, this checks whether both of them are among the winners for that trial. Because the problem never specifies which numbers represent Jason and Ray, any two distinct numbers between 0 and n - 1 work (e.g., 1 in winners and 2 in winners).

Answer (c): count/10000

This computes the estimated probability as the fraction of trials where both Jason and Ray won out of 10,000 simulations.
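Putting the three blanks together gives a complete simulation. As a sanity check that is not part of the original problem, the estimate can be compared with the exact probability k(k - 1) / (n(n - 1)): Jason is among the k winners with probability k/n, and given that, Ray takes one of the remaining k - 1 prizes among the other n - 1 people. The values n = 10 and k = 5 below are chosen only for illustration.

```python
import numpy as np

def giveaway(n, k):
    count = 0
    for i in np.arange(10000):
        # Choose k distinct winners from participants numbered 0 to n - 1.
        winners = np.random.choice(np.arange(n), k, replace=False)
        # Jason and Ray are assumed to be participants 0 and 1.
        if 0 in winners and 1 in winners:
            count = count + 1
    # Proportion of trials in which both won.
    return count / 10000

estimate = giveaway(10, 5)
exact = (5 * 4) / (10 * 9)  # k(k - 1) / (n(n - 1))
print(estimate, exact)
```

The estimate varies from run to run but should land close to the exact value of about 0.22.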


Difficulty: ⭐️⭐️⭐️⭐️

The average score on this problem was 32%.



Problem 7.2

If you implement giveaway correctly, what should giveaway(100, 100) evaluate to?

Answer: 1.0

Since k is equal to n, everyone wins by default, meaning Jason and Ray will always be among the winners.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 55%.


Problem 7.3

Suppose you modify the giveaway function as follows:

Which of the following could be used to fill in blank (c)? Select all that apply.

Answer: len(results)/10000 and (results == "WIN!").mean()

  • len(results)/10000 calculates the proportion of trials that resulted in “WIN!”, which gives the probability.

  • (results == "WIN!").mean() works too because (results == "WIN!") gives a boolean array of True/False, and .mean() calculates the proportion of True values, which again gives the estimated probability.

  • (results == "WIN!").sum(): This counts the number of “WIN!” results but does not divide by 10000, so it gives a raw count, not a probability.

  • np.count_nonzero(results): This counts all non-empty entries, but since results contains only “WIN!” strings, this is just len(results) and is equivalent to the raw count of wins, not the probability.

  • np.random.choice(results): This randomly picks an element from results. It is unrelated to calculating a probability and makes no sense in this context.


Difficulty: ⭐️⭐️⭐️

The average score on this problem was 55%.


