This quiz was administered in-person. It was closed-book and
closed-note; students were not allowed to use the DSC
10 Reference Sheet. Students had 20 minutes to work on
the quiz.
This quiz covered Lectures 13, 15, and 16 of the Spring 2024 offering
of DSC 10.
Which of the following statements are true in general? Select all that apply.
Parameters are fixed, but statistics can change depending on the sample.
Parameters and statistics can both fluctuate depending on the sample.
For simple random samples, statistics give better estimates of parameters when the sample size is larger.
The distribution of a statistic is the same regardless of the sample size.
None of the above.
Options 1 and 3
The average score on this problem was 90%.
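The two true statements can be checked with a small simulation. The sketch below is illustrative only: the population is made-up data, and `sample_means` is a hypothetical helper, not part of the quiz. It computes a parameter once from the full population, then shows that sample means vary from sample to sample and vary less when the sample size is larger.

```python
import numpy as np

rng = np.random.default_rng(10)

# Hypothetical population of 10,000 meal prices (made-up data).
population = rng.gamma(shape=4.0, scale=7.5, size=10_000)

# The parameter is computed from the whole population, so it is fixed.
population_mean = population.mean()

def sample_means(sample_size, repetitions=2_000):
    """Draw many simple random samples and return the mean of each one."""
    means = np.empty(repetitions)
    for i in range(repetitions):
        sample = rng.choice(population, size=sample_size, replace=False)
        means[i] = sample.mean()
    return means

means_small = sample_means(25)
means_large = sample_means(400)

# The statistic changes from sample to sample; its spread shrinks as n grows.
spread_small = means_small.std()
spread_large = means_large.std()

print("population mean (parameter):", round(population_mean, 2))
print("spread of sample means, n = 25: ", round(spread_small, 2))
print("spread of sample means, n = 400:", round(spread_large, 2))
```

The smaller spread for the larger sample size is also why the distribution of a statistic is not the same regardless of sample size.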
The DataFrame restaurants contains information about a sample of restaurants in San Diego County. We have each restaurant’s "name" (str), "rating" (int), average "meal_price" (float), and type of "cuisine" (str), such as "Thai" or "Italian".
You are interested in estimating the average "meal_price" across all Italian restaurants in San Diego County using only the data in restaurants. Fill in the following code so that italian_means evaluates to an array of 1000 bootstrapped estimates for this parameter.
def bootstrap_means(data, n_samples):
    means = np.array([])
    for i in range(n_samples):
        resample = data.sample(__(a)__, replace = __(b)__)
        means = np.append(means, __(c)__)
    return means

italian_restaurants = __(d)__
italian_means = bootstrap_means(italian_restaurants, __(e)__)
(a): data.shape[0]
(b): True
(c): resample.get("meal_price").mean()
(d): restaurants[restaurants.get("cuisine") == "Italian"]
(e): 1000
The average score on this problem was 73%.
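To see the same resampling logic run end to end, here is a minimal sketch that uses a plain NumPy array in place of the DataFrame. The prices are made-up data, and `rng.choice` stands in for the DataFrame's `.sample` call; both are assumptions for illustration. The key points carry over: each resample has the same size as the original sample and is drawn with replacement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical meal prices for 40 sampled Italian restaurants (made-up data).
italian_prices = rng.normal(loc=30, scale=8, size=40).round(2)

def bootstrap_means(prices, n_samples):
    """Return n_samples means, each computed from one bootstrap resample."""
    means = np.array([])
    for i in range(n_samples):
        # Same size as the original sample, drawn with replacement.
        resample = rng.choice(prices, size=len(prices), replace=True)
        means = np.append(means, resample.mean())
    return means

italian_means = bootstrap_means(italian_prices, 1000)

print("number of bootstrapped means:", len(italian_means))
print("original sample mean:", round(italian_prices.mean(), 2))
print("average of bootstrapped means:", round(italian_means.mean(), 2))
```

The bootstrapped means cluster around the original sample mean, since every resample is drawn from the original sample rather than from the population.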
Next, fill in the blanks below so that italian_CI evaluates to an 88% bootstrapped confidence interval for the average "meal_price" across all Italian restaurants in San Diego County.
lower_bound = np.percentile(italian_means, __(a)__)
upper_bound = np.percentile(italian_means, __(b)__)
italian_CI = [lower_bound, upper_bound]
(a): 6
(b): 94
The average score on this problem was 83%.
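The percentiles come from splitting the leftover 100% - 88% = 12% evenly between the two tails, which leaves 6% below the interval and 6% above it. A minimal sketch of that arithmetic follows; `percentile_bounds` is a hypothetical helper and the array of means is made-up data standing in for italian_means.

```python
import numpy as np

def percentile_bounds(confidence_level):
    """Return the (lower, upper) percentiles of a central interval."""
    tail = (100 - confidence_level) / 2
    return tail, 100 - tail

lower_pct, upper_pct = percentile_bounds(88)

# Made-up stand-in for an array of 1000 bootstrapped means.
rng = np.random.default_rng(1)
fake_means = rng.normal(30, 1.5, size=1000)

lower_bound = np.percentile(fake_means, lower_pct)
upper_bound = np.percentile(fake_means, upper_pct)

# Fraction of the bootstrapped means that fall inside the interval.
inside = np.mean((fake_means >= lower_bound) & (fake_means <= upper_bound))

print("percentiles:", lower_pct, upper_pct)
print("fraction of means inside the interval:", inside)
```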
Suppose italian_CI evaluates to [25, 35]. Which of the following statements are correct? Select all that apply.
If we randomly selected 1000 Italian restaurants from the population of Italian restaurants in San Diego County, about 880 of them will have an average "meal_price" between $25 and $35.
There is an 88% chance that the average "meal_price" of Italian restaurants in San Diego County falls between $25 and $35.
88% of all Italian restaurants have an average "meal_price" between $25 and $35.
None of the above.
Option 4: None of the above.
The average score on this problem was 64%.
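None of the statements is correct because the 88% describes the interval-building process, not one particular interval and not individual restaurants. A rough way to see this is to repeat the whole process many times against a population whose mean is known. The sketch below is a simulation under assumptions: the population is made-up data, and `bootstrap_ci` is a hypothetical helper that vectorizes the resampling for speed.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical population of meal prices (made-up data) with a known mean.
population = rng.gamma(shape=4.0, scale=7.5, size=5_000)
true_mean = population.mean()

def bootstrap_ci(sample, n_resamples=500):
    """88% percentile-bootstrap confidence interval for the mean."""
    # Each row is one resample of the same size, drawn with replacement.
    resamples = rng.choice(sample, size=(n_resamples, len(sample)), replace=True)
    means = resamples.mean(axis=1)
    return np.percentile(means, 6), np.percentile(means, 94)

trials = 300
hits = 0
for _ in range(trials):
    # A fresh simple random sample each time, then a fresh interval.
    sample = rng.choice(population, size=100, replace=False)
    lower, upper = bootstrap_ci(sample)
    if lower <= true_mean <= upper:
        hits += 1

coverage = hits / trials
print("fraction of intervals containing the true mean:", coverage)
```

Roughly 88% of the intervals built this way should contain the fixed population mean. Any single interval, such as [25, 35], either contains the parameter or does not.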
Which of the following can be used to generate a simple random sample of "rating"s from 10 restaurants in restaurants? Select all that apply.
Option 1:
sample = restaurants.take(np.arange(10)).get("rating")
Option 2:
sample = restaurants.sample(10, replace = False).get("rating")
Option 3:
sample = restaurants.sample(10, replace = True).get("rating")
Option 4:
positions = np.random.choice(np.arange(0, restaurants.shape[0]),
                             10, replace = False)
sample = restaurants.take(positions).get("rating")
Option 5:
positions = np.random.choice(np.arange(0, restaurants.shape[0]),
                             10, replace = True)
sample = restaurants.take(positions).get("rating")
Options 2 and 4
The average score on this problem was 65%.
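A simple random sample is drawn at random without replacement, which is why only the options with replace = False qualify. The sketch below works with row positions only and assumes a hypothetical table of 50 rows; `has_repeat` is an illustrative helper. It contrasts the three position-selection strategies that appear in the options.

```python
import numpy as np

np.random.seed(16)  # fixed seed so the sketch is repeatable

n_rows = 50  # hypothetical number of rows in restaurants

# Option 1 style: always the first 10 positions, so nothing is random.
first_ten = np.arange(10)

# Option 4 style: 10 distinct positions, every row equally likely to appear.
without_replacement = np.random.choice(np.arange(0, n_rows), 10, replace=False)

def has_repeat(positions):
    """Return True if any position appears more than once."""
    return len(np.unique(positions)) < len(positions)

# Option 5 style: positions may repeat, so a restaurant can be counted twice.
repeat_count = 0
for _ in range(1000):
    draw = np.random.choice(np.arange(0, n_rows), 10, replace=True)
    if has_repeat(draw):
        repeat_count += 1
repeat_fraction = repeat_count / 1000

print("first ten positions:", first_ten)
print("positions without replacement:", without_replacement)
print("fraction of with-replacement draws that repeat a row:", repeat_fraction)
```

With only 50 rows, a majority of the with-replacement draws repeat at least one row, so such a draw does not behave like a simple random sample of 10 distinct restaurants.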