These problems are taken from past quizzes and exams. Work on them
on paper, since the quizzes and exams you take in this
course will also be on paper.
We encourage you to complete these
problems during discussion section. Solutions will be made available
after all discussion sections have concluded. You don’t need to submit
your answers anywhere.
Note: We do not plan to cover all of
these problems during the discussion section; the problems we don’t
cover can be used for extra practice.
In the ikea DataFrame, the first word of each string in the 'product' column represents the product line. For example, the HEMNES line of products includes several different products, such as beds, dressers, and bedside tables.
The code below assigns a new column to the ikea DataFrame containing the product line associated with each product.
(ikea.assign(product_line = ikea.get('product').apply(extract_product_line)))
What are the input and output types of the extract_product_line function?
takes a string as input, returns a string
takes a string as input, returns a Series
takes a Series as input, returns a string
takes a Series as input, returns a Series
Complete the return statement in the extract_product_line function below.
For example, extract_product_line('HEMNES Daybed frame with 3 drawers, white, Twin') should return 'HEMNES'.
def extract_product_line(x):
    return _________
What goes in the blank?
Complete the implementation of the to_minutes function below. This function takes as input a string formatted as 'x hr, y min', where x and y represent integers, and returns the corresponding number of minutes, as an integer (type int in Python).
For example, to_minutes('3 hr, 5 min') should return 185.
def to_minutes(time):
    first_split = time.split(' hr, ')
    second_split = first_split[1].split(' min')
    return _________
What goes in the blank?
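The string questions above both rely on str.split, which returns a list of strings. The sketch below uses a made-up distance string (the format and the variable names are assumptions for illustration, not course data) to show which pieces come back and that they have to be converted with int before any arithmetic.

```python
# Hypothetical example string; its format is an assumption for illustration.
distance = '7 km, 250 m'

# Splitting on a separator removes it and returns the surrounding pieces.
parts = distance.split(' km, ')
print(parts)            # ['7', '250 m']

# Each piece is still a string, so convert before doing arithmetic.
whole_km = int(parts[0])
print(whole_km + 1)     # 8

# Splitting on a single space gives the individual words.
words = distance.split(' ')
print(words[0])         # 7
```

Note that the pieces keep any text that was not part of the separator, which is why a second split is sometimes needed.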
Consider the function tom_nook, defined below. Recall that if x is an integer, x % 2 is 0 if x is even and 1 if x is odd.
def tom_nook(crossing):
    bells = 0
    for nook in np.arange(crossing):
        if nook % 2 == 0:
            bells = bells + 1
        else:
            bells = bells - 2
    return bells
What value does tom_nook(8) evaluate to?
-6
-4
-2
0
2
4
6
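The parity rule recalled in this question can be checked with a quick sketch. It uses Python's built-in range as a stand-in for np.arange (both start at 0 and stop before the given value); the variable names are made up for illustration.

```python
# Check of the parity rule: x % 2 is 0 for even x and 1 for odd x.
for x in range(6):
    label = 'even' if x % 2 == 0 else 'odd'
    print(x, x % 2, label)

# The same remainders collected into a list.
remainders = [x % 2 for x in range(6)]
print(remainders)  # [0, 1, 0, 1, 0, 1]
```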
The DataFrame evs consists of 32 rows, each of which contains information about a different EV model.
The first few rows of evs are shown below.
We also have a DataFrame that contains the distribution of “BodyStyle” for all “Brands” in evs, other than Nissan.
Suppose we’ve run the following few lines of code.
tesla = evs[evs.get("Brand") == "Tesla"]
bmw = evs[evs.get("Brand") == "BMW"]
audi = evs[evs.get("Brand") == "Audi"]

combo = tesla.merge(bmw, on="BodyStyle").merge(audi, on="BodyStyle")
How many rows does the DataFrame combo have?
21
24
35
65
72
96
The sums function takes in an array of numbers and outputs the cumulative sum for each item in the array. The cumulative sum for an element is the current element plus the sum of all the previous elements in the array.
For example:
>>> sums(np.array([1, 2, 3, 4, 5]))
array([1, 3, 6, 10, 15])
>>> sums(np.array([100, 1, 1]))
array([100, 101, 102])
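As a way to check expected outputs by hand, cumulative sums can also be computed with the standard library's itertools.accumulate. This is a different technique from the loop-and-append approach the question asks about, and running_totals is a hypothetical helper name, not part of the course code.

```python
from itertools import accumulate

def running_totals(values):
    """Hypothetical helper: cumulative sums via itertools.accumulate."""
    return list(accumulate(values))

print(running_totals([1, 2, 3, 4, 5]))  # [1, 3, 6, 10, 15]
print(running_totals([100, 1, 1]))      # [100, 101, 102]
```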
The incomplete definition of sums is shown below.
def sums(arr):
    res = __(a)__
    res = np.append(res, arr[0])
    for i in __(b)__:
        res = np.append(res, __(c)__)
    return res
Fill in blank (a).
Fill in blank (b).
Fill in blank (c).
Teresa and Sophia are bored while waiting in line at Bistro and decide to start flipping a UCSD-themed coin, with a picture of King Triton’s face as the heads side and a picture of his mermaid-like tail as the tails side.
Teresa flips the coin 21 times and sees 13 heads and 8 tails. She stores this information in a DataFrame named teresa that has 21 rows and 2 columns, such that:
The "flips" column contains "Heads" 13 times and "Tails" 8 times.
The "Wolftown" column contains "Teresa" 21 times.
Then, Sophia flips the coin 11 times and sees 4 heads and 7 tails. She stores this information in a DataFrame named sophia that has 11 rows and 2 columns, such that:
The "flips" column contains "Heads" 4 times and "Tails" 7 times.
The "Makai" column contains "Sophia" 11 times.
How many rows are in the following DataFrame? Give your answer as an integer.
teresa.merge(sophia, on="flips")
Hint: The answer is less than 200.
Let A be your answer to the previous part. Now, suppose that:
teresa contains an additional row, whose "flips" value is "Total" and whose "Wolftown" value is 21.
sophia contains an additional row, whose "flips" value is "Total" and whose "Makai" value is 11.
Suppose we again merge teresa and sophia on the "flips" column. In terms of A, how many rows are in the new merged DataFrame?
A
A+1
A+2
A+4
A+231
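Merge questions like the ones above come down to counting matching pairs: assuming the default inner merge, each row on the left is paired with every row on the right that shares its key. The sketch below counts those pairs for made-up key columns (the helper name and the data are hypothetical and unrelated to the tables in the questions).

```python
from collections import Counter

def inner_merge_row_count(left_keys, right_keys):
    """Hypothetical helper: rows produced by an inner merge on one key column."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each left row pairs with every right row that has the same key,
    # so a shared key contributes (left count) * (right count) rows.
    return sum(left_counts[k] * right_counts[k]
               for k in left_counts if k in right_counts)

# Made-up key columns for illustration only.
left = ['red', 'red', 'blue', 'green']
right = ['red', 'blue', 'blue', 'blue', 'yellow']
print(inner_merge_row_count(left, right))  # 2*1 + 1*3 = 5
```

Keys that appear on only one side ('green' and 'yellow' here) contribute no rows.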
In recent years, there has been an explosion of board games that teach computer programming skills, including CoderMindz, Robot Turtles, and Code Monkey Island. Many such games were made possible by Kickstarter crowdfunding campaigns.
Suppose that in one such game, players must prove their understanding of functions and conditional statements by answering questions about the function wham, defined below. Like players of this game, you’ll also need to answer questions about this function.
1  def wham(a, b):
2      if a < b:
3          return a + 2
4      if a + 2 == b:
5          print(a + 3)
6          return b + 1
7      elif a - 1 > b:
8          print(a)
9          return a + 2
10     else:
11         return a + 1
What is printed when we run print(wham(6, 4))?
Give an example of a pair of integers a and b such that wham(a, b) returns a + 1.
Which of the following lines of code will never be executed, for any input?
3
6
9
11
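Two general rules are useful when tracing functions like this one: a return statement ends the function immediately, and an elif or else branch is only considered when the conditions before it in the same chain were false. The sketch below shows both on a hypothetical function, describe, that is unrelated to wham.

```python
def describe(n):
    """Hypothetical function showing how return and elif behave."""
    if n < 0:
        return 'negative'      # the function stops here for negative n
    if n == 0:
        print('checking zero')
        return 'zero'
    elif n > 100:              # only checked when n == 0 was False
        return 'large'
    else:
        return 'small'

print(describe(-5))   # negative
print(describe(0))    # prints 'checking zero', then zero
print(describe(7))    # small
```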
We’ll be looking at a DataFrame named sungod that contains information on the artists who have performed at Sun God in years past. For each year that the festival was held, we have one row for each artist that performed that year. The columns are:
'Year' (int): the year of the festival
'Artist' (str): the name of the artist
'Appearance_Order' (int): the order in which the artist appeared in that year’s festival (1 means they came onstage first)
The rows of sungod are arranged in no particular order. The first few rows of sungod are shown below (though sungod has many more rows than pictured here).
Assume:
'Year' of 2015 and an 'Appearance_Order' of 3).
import babypandas as bpd and import numpy as np have already been run.
Fill in the blank in the code below so that chronological is a DataFrame with the same rows as sungod, but ordered chronologically by appearance on stage. That is, earlier years should come before later years, and within a single year, artists should appear in the DataFrame in the order they appeared on stage at Sun God. Note that groupby automatically sorts the index in ascending order.
chronological = sungod.groupby(___________).max().reset_index()
['Year', 'Artist', 'Appearance_Order']
['Year', 'Appearance_Order']
['Appearance_Order', 'Year']
None of the above.
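The ordering described in this question (by one key first, then by a second key to break ties) can be illustrated in plain Python with sorted and a tuple key. This is not the groupby approach the question asks about, and the records below are made up rather than taken from sungod.

```python
# Made-up records of (year, artist, appearance order); not the real data.
rows = [
    (2016, 'Artist C', 2),
    (2015, 'Artist B', 2),
    (2016, 'Artist D', 1),
    (2015, 'Artist A', 1),
]

# Tuples compare element by element, so this sorts by year first
# and uses appearance order to break ties within a year.
ordered = sorted(rows, key=lambda row: (row[0], row[2]))
for row in ordered:
    print(row)
```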
Another DataFrame called music contains a row for every music artist that has ever released a song. The columns are:
'Name' (str): the name of the music artist
'Genre' (str): the primary genre of the artist
'Top_Hit' (str): the most popular song by that artist, based on sales, radio play, and streaming
'Top_Hit_Year' (int): the year in which the top hit song was released
You want to know how many musical genres have been represented at Sun God since its inception in 1983. Which of the following expressions produces a DataFrame called merged that could help determine the answer?
merged = sungod.merge(music, left_on='Year', right_on='Top_Hit_Year')
merged = music.merge(sungod, left_on='Year', right_on='Top_Hit_Year')
merged = sungod.merge(music, left_on='Artist', right_on='Name')
merged = music.merge(sungod, left_on='Artist', right_on='Name')
Consider an artist that has only appeared once at Sun God. At the time of their Sun God performance, we’ll call the artist
Complete the function below so it outputs the appropriate description for any input artist who has appeared exactly once at Sun God.
def classify_artist(artist):
    filtered = merged[merged.get('Artist') == artist]
    year = filtered.get('Year').iloc[0]
    top_hit_year = filtered.get('Top_Hit_Year').iloc[0]
    if ___(a)___ > 0:
        return 'up-and-coming'
    elif ___(b)___:
        return 'outdated'
    else:
        return 'trending'
What goes in blank (a)?
What goes in blank (b)?
King Triton, UCSD’s mascot, is quite the traveler! For this question,
we will be working with the flights DataFrame, which details several facts about each of the flights that King Triton has been on over the past few years. The first few rows of flights are shown below.
Here’s a description of the columns in flights:
'DATE': the date on which the flight occurred. Assume that there were no “redeye” flights that spanned multiple days.
'FLIGHT': the flight number. Note that this is not unique; airlines reuse flight numbers on a daily basis.
'FROM' and 'TO': the 3-letter airport code for the departure and arrival airports, respectively. Note that it’s not possible to have a flight from and to the same airport.
'DIST': the distance of the flight, in miles.
'HOURS': the length of the flight, in hours.
'SEAT': the kind of seat King Triton sat in on the flight; the only possible values are 'WINDOW', 'MIDDLE', and 'AISLE'.
Suppose we create a DataFrame called socal containing only King Triton’s flights departing from SAN, LAX, or SNA (John Wayne Airport in Orange County). socal has 10 rows; the bar chart below shows how many of these 10 flights departed from each airport.
Consider the DataFrame that results from merging socal with itself, as follows:
double_merge = socal.merge(socal, left_on='FROM', right_on='FROM')
How many rows does double_merge have?
We define a “route” to be a departure and arrival airport pair. For example, all flights from 'SFO' to 'SAN' make up the “SFO to SAN route”. This is different from the “SAN to SFO route”.
Fill in the blanks below so that most_frequent.get('FROM').iloc[0] and most_frequent.get('TO').iloc[0] correspond to the departure and destination airports of the route that King Triton has spent the most time flying on.
most_frequent = flights.groupby(__(a)__).__(b)__
most_frequent = most_frequent.reset_index().sort_values(__(c)__)
What goes in blank (a)?
What goes in blank (b)?
count()
mean()
sum()
max()
What goes in blank (c)?
by='HOURS', ascending=True
by='HOURS', ascending=False
by='HOURS', descending=True
by='DIST', ascending=False
We define the seasons as follows:
| Season | Month |
|---|---|
| Spring | March, April, May |
| Summer | June, July, August |
| Fall | September, October, November |
| Winter | December, January, February |
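The table above can also be written down directly as a lookup from month number to season, which gives a reference to compare other implementations against. This is only a sketch: SEASON_BY_MONTH and season_for_month are hypothetical names, and the helper takes a month number (1 to 12) rather than a date string.

```python
# Month number -> season, transcribed from the table above.
SEASON_BY_MONTH = {
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Fall', 10: 'Fall', 11: 'Fall',
    12: 'Winter', 1: 'Winter', 2: 'Winter',
}

def season_for_month(month_number):
    """Hypothetical helper: look up the season for a month number 1-12."""
    return SEASON_BY_MONTH[month_number]

print(season_for_month(4))   # Spring
print(season_for_month(12))  # Winter
```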
We want to create a function date_to_season that takes in a date as formatted in the 'DATE' column of flights and returns the season corresponding to that date. Which of the following implementations of date_to_season works correctly? Select all that apply.
Option 1:
def date_to_season(date):
    month_as_num = int(date.split('-')[1])
    if month_as_num >= 3 and month_as_num < 6:
        return 'Spring'
    elif month_as_num >= 6 and month_as_num < 9:
        return 'Summer'
    elif month_as_num >= 9 and month_as_num < 12:
        return 'Fall'
    else:
        return 'Winter'
Option 2:
def date_to_season(date):
    month_as_num = int(date.split('-')[1])
    if month_as_num >= 3 and month_as_num < 6:
        return 'Spring'
    if month_as_num >= 6 and month_as_num < 9:
        return 'Summer'
    if month_as_num >= 9 and month_as_num < 12:
        return 'Fall'
    else:
        return 'Winter'
Option 3:
def date_to_season(date):
    month_as_num = int(date.split('-')[1])
    if month_as_num < 3:
        return 'Winter'
    elif month_as_num < 6:
        return 'Spring'
    elif month_as_num < 9:
        return 'Summer'
    elif month_as_num < 12:
        return 'Fall'
    else:
        return 'Winter'
Option 1
Option 2
Option 3
None of these implementations of date_to_season work correctly
Assuming we’ve defined date_to_season correctly in the previous part, which of the following lines of code correctly computes the season for each flight in flights?
date_to_season(flights.get('DATE'))
date_to_season.apply(flights).get('DATE')
flights.apply(date_to_season).get('DATE')
flights.get('DATE').apply(date_to_season)