Winter 2026 Midterm Exam



Instructor(s): Peter Chi, Sam Lau

This exam was administered in person. Students were allowed one double-sided page of handwritten notes. No calculators were allowed. Students had 50 minutes to take this exam.


Note (groupby / pandas 2.0): Pandas 2.0+ no longer silently drops columns that can’t be aggregated after a groupby, so code written for older pandas may behave differently or raise errors. In these practice materials we use .get() to select the column(s) we want after .groupby(...).mean() (or other aggregations) so that our solutions run on current pandas. On real exams you will not be penalized for omitting .get() when the old behavior would have produced the same answer.
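
As a minimal sketch of the pattern this note describes, the snippet below uses plain pandas and a small made-up table. The name toy and all of its values are assumptions for illustration, not the exam's data. Every non-grouping column here is numeric, so the aggregation behaves the same on older and newer pandas.

```python
import pandas as pd

# Hypothetical toy data; not the exam's pizza DataFrame.
toy = pd.DataFrame({
    "store": ["A", "A", "B"],
    "rating": [4.0, 5.0, 3.0],
    "price": [3.0, 4.0, 5.0],
})

# Aggregate per store, then use .get() to keep only the column of interest.
mean_rating = toy.groupby("store").mean().get("rating")
print(mean_rating)
```

Here mean_rating is a Series indexed by store, holding one average rating per store.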


Living in San Diego, we have a plethora of great food options, especially pizza!

In this exam, we’ll work with a dataset of pizza slices sold at fictional pizza shops in San Diego. Each row in the DataFrame pizza corresponds to a type of pizza slice, specific to the store in which it is sold.

The DataFrame pizza contains the following columns:

The first 10 rows of the DataFrame pizza are shown below, but the full DataFrame is much larger.

Assume that we have already run import babypandas as bpd and import numpy as np.


Problem 1


Problem 1.1

Which column of the pizza DataFrame would be an appropriate index?


Problem 1.2

Suppose you want to see if there is a relationship between num_ingredients and price. Which visualization will best enable you to investigate this?


Problem 1.3

Fill in the blanks to create a horizontal bar chart showing both the average price_per_slice and the average rating for each store, side by side. On a DataFrame of just the 10 rows shown in the data description, it would look like the image below.

(pizza._____(i)______
      .groupby("store")._____(ii)_____
      .plot(__(iii)__));



Problem 2

Fill in the blanks below so that the expression evaluates to a float that is the highest rating of any pizza slice that has more than 10 ingredients.

pizza[_____(a)_____]._____(b)_____


Problem 2.1

What goes in blank (a)?


Problem 2.2

What can go in blank (b)? Select all that apply.



Problem 3

First, the following code takes the first 10 rows of the pizza DataFrame and stores them into pizza10 (thus pizza10 consists of exactly the 10 rows shown on the Data Description page):

pizza10 = pizza.take(np.arange(10))

Next, consider the code below.

positions = np.arange(pizza10.shape[0])
result = positions[pizza10.get("rating") > 4].sum()

Hint: while positions is an array, the behavior of the code in the last line above is analogous to a query from a DataFrame.
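
To illustrate the hint, here is a small sketch of boolean indexing on a NumPy array. The names scores, toy_positions, mask, and kept, and the values used, are made up for illustration. The mask here is a plain array rather than a Series, which is an assumption of this sketch.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate boolean indexing.
scores = np.array([3.5, 4.8, 4.1, 2.9])
toy_positions = np.arange(len(scores))   # array([0, 1, 2, 3])

# One True/False entry per element of scores.
mask = scores > 4

# Indexing with the mask keeps only the positions where it is True,
# much like a query keeps only the rows that satisfy a condition.
kept = toy_positions[mask]
print(kept)
```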


Problem 3.1

What does result represent?


Problem 3.2

What does result evaluate to?



Problem 4

The Best Pizza Neighborhood Contest is coming up! For this contest, we want to find the neighborhood with the highest average rating for a slice of pizza. Fill in the blanks below so that the expression evaluates to the name of this neighborhood (as a string).

pizza.get([___(a)___]).groupby(___(b)___)
                      .___(c)___.sort_values(__(d)__, ascending=False)
                      ._____(e)_____


Problem 4.1

(a)


Problem 4.2

(b)


Problem 4.3

(c)


Problem 4.4

(d)


Problem 4.5

(e)



Problem 5

The DSC 10 staff is planning their quarterly pizza party, and they want to make sure that there are options for everyone. They’re trying to find the total number of gluten-free pizza slice options in the San Diego area. Select all lines of code that correctly evaluate to the integer corresponding to the total number of gluten-free pizza slice options.


Problem 6

At Pizza by Peter, head chefs Bianca and Ella make each pizza, either together or alone. For any given pizza, there’s a chance that the chefs mess up and the final product isn’t suitable to serve. The probabilities are given below:


Problem 6.1

(a) Suppose that Ella makes one pizza alone, Bianca makes one pizza alone, and their suitability to serve when they each work alone is independent of each other. What is the probability that at least one of the pizzas is not suitable to serve? If it is possible to solve, express your final answer as a single fraction or decimal. Note: “Not enough information” is also a possible answer choice.


Problem 6.2

(b) Let A be the event that Ella worked on the pizza (either alone or with Bianca), and B be the event that the pizza is suitable to serve.

Suppose that for a randomly selected pizza,

Using only this information and the information from the 4 bullet points at the top of the question, what is P(A or B)? If it is possible to solve, express your final answer as a single fraction or decimal. Note: “Not enough information” is also a possible answer choice.



Problem 7

Recall from Problem 3 that the DataFrame pizza10 consists of exactly the 10 rows shown on the Data Description page. Now consider the following code:

neighborhood_counts = (pizza10.groupby('neighborhood')
                              .count().get(['rating']))


Problem 7.1

What type of object is neighborhood_counts?


Problem 7.2

We want to apply a function to the neighborhood names (which are currently in the index). Fill in the blanks to complete the code below so that neighborhood_firstchar is a Series in which each element is the first letter of the corresponding neighborhood name. Note that your answer to (ii) may need to chain multiple methods.

def first_letter(s):
    ____(i)____

neighborhood_firstchar = neighborhood_counts.________(ii)________



Problem 8

A student at UCSD runs a food Instagram account, @ucsdfoodeater. They go around San Diego trying pizza at different restaurants and keep notes on what kinds of pizza they have tried and where.

They keep their notes in a DataFrame called notes. In the notes DataFrame, the index is the pizza kind, and each entry of the restaurants_tried column is a list of the restaurants where they’ve tried that kind.

notes = bpd.DataFrame().assign(
    kind=["Cheese", "Pepperoni", "Veggie", "Margherita",
          "Hawaiian", "Supreme", "BBQ Chicken", "White"],
    restaurants_tried=[
        ["Jeffrey's Pizza", "Pizza Pandas"],
        ["Jeffrey's Pizza", "Pizza by Peter"],
        ["Pizza Pandas"],
        ["Pizza by Peter"],
        ["Jeffrey's Pizza"],
        ["Jeffrey's Pizza"],
        ["Regents Pizzeria"],
        ["Pizza on Pearl"]
    ]
).set_index("kind")

The notes DataFrame has 8 rows. Note that some pizza kinds in notes do not appear in pizza, and some kinds in pizza do not appear in notes.
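
As a rough sketch of how a merge with left_on and right_index behaves, the snippet below uses plain pandas and two made-up tables. The names left and right, and all of their values, are assumptions for illustration and have nothing to do with pizza10 or notes.

```python
import pandas as pd

# Hypothetical tables; not the exam's data.
left = pd.DataFrame({
    "fruit": ["apple", "pear", "apple", "kiwi"],
    "price": [1.0, 2.0, 1.5, 3.0],
})
right = pd.DataFrame(
    {"color": ["red", "green"]},
    index=["apple", "pear"],
)

# Match the "fruit" column of left against the index of right.
# The default is an inner join, so "kiwi" (no match in right) is dropped,
# while both "apple" rows of left are kept.
merged = left.merge(right, left_on="fruit", right_index=True)
print(merged.shape[0])
```

Rows without a match on either side do not appear in the result, and a value that appears several times on one side produces one output row per matching pair.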


Problem 8.1

Recall again that pizza10 was created in Problem 3 and contains exactly the 10 rows shown on the Data Description page. What would the following expression evaluate to?

pizza10.merge(notes, left_on="kind", right_index=True).shape[0]


Problem 8.2

Which of the following expressions evaluate to the same value as the expression in Problem 8.1? Select all that apply.



Problem 9

DoorDash is having a special promotion! This promotion allows Ray to order four random pizza slices from the pizza DataFrame, but he only wants to eat the best slices—according to the ratings. He decides to pick the first two of those random slices and compare their ratings to the other two slices. He will eat only if the sum of the ratings of his two slices is greater than the sum of the ratings of the other two slices.

Fill in the blanks in the code below so that prob_eats evaluates to an estimate of the probability that Ray gets to eat.

repetitions = 1000
count_eats = 0

for i in np.arange(repetitions):
    promo_ratings = np.random.choice(__(a)__, 4, replace=False)
    ray_sum_of_ratings = _____(b)_____
    other_sum_of_ratings = _____(c)_____
    if ray_sum_of_ratings > other_sum_of_ratings:
        count_eats = ____(d)____
prob_eats = _____(e)_____
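
For reference, the general shape of a simulation like this can be sketched on an unrelated toy scenario. Everything below is an assumption for illustration: the population values, the event being counted, and the names count and estimate are not taken from the problem.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is reproducible

values = np.arange(1, 7)   # hypothetical population: 1 through 6
repetitions = 1000
count = 0

for i in np.arange(repetitions):
    # Draw two distinct elements; replace=False prevents repeats.
    draw = np.random.choice(values, 2, replace=False)
    # Count the repetitions in which the first draw beats the second.
    if draw[0] > draw[1]:
        count = count + 1

# The proportion of repetitions in which the event occurred.
estimate = count / repetitions
print(estimate)
```

By symmetry, the true probability in this toy scenario is 0.5, so the estimate should land close to that value.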


Problem 9.1

(a)


Problem 9.2

(b)


Problem 9.3

(c)


Problem 9.4

(d)


Problem 9.5

(e)



Problem 10

Pizza by Peter awards loyalty points: starting from the 1st slice, every third slice you purchase from them earns 2^i points, where i is the slice number. For example, the 1st, 4th, 7th and 10th slices that a customer purchases earn 2^1, 2^4, 2^7 and 2^10 loyalty points, respectively.

Without using the + operator, write a one-line expression that evaluates to the total loyalty points from those four slices.
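
One way to add up evenly spaced powers without writing + is elementwise arithmetic on an array. The sketch below shows this on deliberately different numbers (base 3 and exponents 0, 2, 4, 6); the names exponents, powers, and total are made up for illustration.

```python
import numpy as np

# Hypothetical exponents 0, 2, 4, 6 (not the ones in the problem).
exponents = np.arange(0, 7, 2)

# Raising a number to an array applies the power to each element.
powers = 3 ** exponents      # array([1, 9, 81, 729])

# .sum() adds the elements without using the + operator.
total = powers.sum()
print(total)                 # 820
```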


Problem 11

The management of Pizza by Peter is rebranding. The new name is the output of the following line of code. Write the new name as a string.

(pizza.get("kind").iloc[6].replace("a", "o") + "!").upper()
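
The order of operations in a chain like this matters. The sketch below traces the same string methods on a made-up string; "Alabama" is an assumption chosen for illustration and is not a value from the pizza DataFrame.

```python
# Hypothetical string; not a value from the pizza DataFrame.
name = "Alabama"

# .replace is case-sensitive, so the capital "A" is left alone;
# .upper() is applied last, to the whole concatenated string.
step1 = name.replace("a", "o")   # "Alobomo"
step2 = step1 + "!"              # "Alobomo!"
result = step2.upper()           # "ALOBOMO!"
print(result)
```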

