Discussion 1: Association and Causality


The problems in this worksheet are taken from past exams. Work on them on paper, since the exams you take in this course will also be on paper.

We encourage you to complete this worksheet in a discussion section. Discussion sections are held live on Monday, September 26th. Solutions will be made available after all discussion sections have concluded. You don’t need to submit your answers anywhere.


Correlation is not causation.


Problem 1

Which of the following questions could not be answered by running a randomized controlled experiment?

Answer: Does drug abuse lead to a shorter life span?

It would be unethical to try to run a randomized controlled experiment to address the question of whether drug abuse leads to a shorter life span, as this would involve splitting participants into groups and telling one group to abuse drugs. This is problematic because we know drug abuse brings about a host of problems, so we could not ethically ask people to harm themselves.

Notice that the first proposed study, about the impacts of citrus fruits on heart disease, does not involve the same kind of ethical dilemma because we’re not forcing people to do something known to be harmful. A randomized controlled experiment would involve splitting participants into two groups and asking one group to eat citrus fruits, and measuring the heart health of both groups. Since there are no known harmful effects of eating citrus fruits, there is no ethical issue.

Similarly, we could run a randomized controlled trial by giving an exam where some students had to sign an integrity pledge and others didn’t, tracking the number of reported dishonesty cases in each group. Likewise, we could reward some students for good grades and not others, and keep track of high school graduation rates in each group. Neither of these studies would involve knowingly harming people, so both could reasonably be carried out.



Difficulty: ⭐️⭐️

The average score on this problem was 85%.


Problem 2

The following is a quote from The New York Times’ The Morning newsletter.

As Dr. Ashish Jha, the dean of the Brown University School of Public Health, told me this weekend: “I don’t actually care about infections. I care about hospitalizations and deaths and long-term complications.”

By those measures, all five of the vaccines — from Pfizer, Moderna, AstraZeneca, Novavax and Johnson & Johnson — look extremely good. Of the roughly 75,000 people who have received one of the five in a research trial, not a single person has died from Covid, and only a few people appear to have been hospitalized. None have remained hospitalized 28 days after receiving a shot.

To put that in perspective, it helps to think about what Covid has done so far to a representative group of 75,000 American adults: It has killed roughly 150 of them and sent several hundred more to the hospital. The vaccines reduce those numbers to zero and nearly zero, based on the research trials.

Zero isn’t even the most relevant benchmark. A typical U.S. flu season kills between five and 15 out of every 75,000 adults and hospitalizes more than 100 of them.

Why does the article use a representative group of 75,000 American adults?

Answer: Comparison. It allows for quick comparison against the group of people who got the vaccine in a trial.

The purpose of the article is to compare Covid outcomes between two groups of people: the 75,000 people who got the vaccine in a research trial and a representative group of 75,000 American adults. Since 75,000 people got the vaccine in a research trial, counts like the number of deaths and hospitalizations must come from a comparison group of the same size for the raw numbers to be directly comparable.

There is no convention about using 75,000 for rates. This number is used because that’s how many people got the vaccine in a research trial. If a different number of people had been vaccinated in a research trial, the article would have used that number of adults in its representative comparison group.

75,000 is quite a large number and most people probably don’t have a sense of the scale of 75,000 people. If the goal were comprehension, it would have made more sense to use a smaller number like 100 people.

The number 75,000 is not arbitrary. It was chosen as the size of the representative group specifically to equal the number of people who got the vaccine in a research trial.
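The same-size comparison can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original solution: the helper name scale_count is hypothetical, and the counts are the approximate figures quoted in the article (roughly 150 Covid deaths, and 5 to 15 flu deaths, per 75,000 adults).

```python
def scale_count(count, from_size, to_size):
    """Re-express a count observed in a group of from_size people
    as the equivalent count in a group of to_size people."""
    return count * to_size / from_size

TRIAL_SIZE = 75_000  # number of people vaccinated in the research trials

# Approximate figures quoted in the article, all per 75,000 adults.
covid_deaths = 150
flu_deaths_low, flu_deaths_high = 5, 15
vaccinated_covid_deaths = 0

# Because every figure already refers to a group of 75,000,
# the raw counts can be compared directly.
print(vaccinated_covid_deaths, "vs", covid_deaths, "vs",
      (flu_deaths_low, flu_deaths_high))

# The same Covid figure expressed per 100 adults.
covid_deaths_per_100 = scale_count(covid_deaths, TRIAL_SIZE, 100)
print(covid_deaths_per_100)
```

The helper can re-express the figures for any group size. The article fixes the size at 75,000 so that its numbers line up directly with the counts from the trials.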



Difficulty: ⭐️

The average score on this problem was 91%.


Problem 3

In response to the pandemic, some airlines chose to leave middle seats empty, while others continued seating passengers in middle seats. Let’s suppose Delta did not seat passengers in middle seats during the pandemic, and United did seat passengers in middle seats during the pandemic.

Delta wants to know whether customers were satisfied with them for making this decision not to use middle seats. Suppose they have access to a dataset of customer satisfaction surveys, taken annually for each airline. How can Delta determine whether its new seating policy caused an increase in customer satisfaction?

Answer: None of the above.

None of the options isolate the effect of the seating policy because they do not use randomized controlled trials. Even measuring the change in each airline’s average satisfaction rating as described in the third option is insufficient because we don’t know whether any differences are due to the changed seating policy or other changes. It’s possible that many things changed around the time of the pandemic, and we have no way of separating the effects of each of these changes. For example, maybe United stopped serving snacks during the pandemic and Delta continued serving snacks, at around the same time as the seating changes went into effect. If we find a difference in average customer satisfaction between the airlines, we have no way of knowing whether it’s because of the differences in seating policies or snack policies (or something else).
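The snack example can be made concrete with a small sketch. Everything here is hypothetical: the baseline rating, the effect sizes, and the function names are invented for illustration, and the simple additive model is an assumption, not something the survey data provides. The point is only that two very different explanations produce the same observed gap.

```python
def average_satisfaction(baseline, seating_effect, snack_effect,
                         empty_middle_seats, serves_snacks):
    """Hypothetical additive model of an airline's average rating."""
    return (baseline
            + seating_effect * empty_middle_seats
            + snack_effect * serves_snacks)

def observed_gap(seating_effect, snack_effect, baseline=7.0):
    """Delta's average minus United's average under the given effects.

    Delta leaves middle seats empty AND serves snacks; United does neither,
    so the two policies always change together.
    """
    delta = average_satisfaction(baseline, seating_effect, snack_effect,
                                 empty_middle_seats=1, serves_snacks=1)
    united = average_satisfaction(baseline, seating_effect, snack_effect,
                                  empty_middle_seats=0, serves_snacks=0)
    return delta - united

# World A: only the seating policy matters.
gap_if_seating_matters = observed_gap(seating_effect=1.0, snack_effect=0.0)

# World B: only the snack policy matters.
gap_if_snacks_matter = observed_gap(seating_effect=0.0, snack_effect=1.0)

print(gap_if_seating_matters, gap_if_snacks_matter)
```

Both worlds yield the same gap in average satisfaction, so the survey data alone cannot tell them apart.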



Difficulty: ⭐️⭐️⭐️⭐️⭐️

The average score on this problem was 13%.


Problem 4

The WNBA is interested in helping boost their players’ social media presence, and considers various ways of making that happen.

Which of the following claims can be tested using a randomized controlled trial? Select all that apply.

Answers:

  • Drinking Gatorade causes a player to gain Instagram followers.
  • Deleting Twitter causes a player to gain Instagram followers.

The key to this problem is understanding the nature of randomized controlled trials (RCTs). To run an RCT, we must be able to randomly assign our test subjects to either a treatment or control group, and apply some treatment only to the treatment group. With that in mind, let’s look at the four options.

  • Option 1: Here, the treatment would be winning two games in a row. However, this is not something we could apply to a treatment group, since we can’t control whether or not basketball players win games (let alone two games in a row). As such, this is not an RCT we could run.
  • Option 2: Here, the treatment would be drinking Gatorade. We indeed could have a treatment group drink Gatorade; this is no different than having a treatment group take some form of medicine. This is an RCT we could run.
  • Option 3: Here, the treatment would be playing for the Las Vegas Aces. Similar to Option 1, this is not an RCT we could run, since we can’t control which teams players play for.
  • Option 4: Here, the treatment would be deleting Twitter. While they may not like it, we could have players participating in our study delete their Twitter accounts, so this is an RCT we could run.

Note that our assessment above did not look at the outcome, gaining Instagram followers, at all. While to us it may seem unlikely that drinking Gatorade causes a player to gain Instagram followers, that doesn’t mean we can’t run an RCT to check.
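The random assignment step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name randomly_assign, the player labels, and the group size of 20 are all hypothetical, and a real study would involve more than an even split.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups.

    Returns (treatment, control). Chance alone decides who lands in
    which group, which is what makes the trial randomized.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy, so the input is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants.
players = [f"Player {i}" for i in range(1, 21)]

treatment, control = randomly_assign(players, seed=10)

# Only the treatment group would be asked to, say, drink Gatorade.
print("Treatment:", treatment)
print("Control:  ", control)
```

This only works when the researcher can actually apply the treatment to whoever lands in the treatment group. That is possible for drinking Gatorade or deleting Twitter, but not for winning two games in a row or playing for a particular team.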



Difficulty: ⭐️⭐️

The average score on this problem was 77%.


Problem 5

Every spring quarter since 1990, UCSD has held the Sun God Festival. This daytime festival is considered to be one of the best college music festivals since it has featured major artists like Kendrick Lamar, Drake, Macklemore, Snoop Dogg, and the Black Eyed Peas.



In 2014, the UCSD administration made some important changes to Sun God policies, including:

  1. eliminating guest tickets for non-students,
  2. increasing security, and
  3. introducing on-site medical care.

These changes were implemented because of incidents related to drug and alcohol abuse at the festival. At the 2013 Sun God festival, 48 students were hospitalized, and at the 2014 festival, only 8 students were hospitalized. Assuming there was no change in the total number of attendees from 2013 to 2014, which of the following statements is correct?

Answer: There is an association between administrative changes and hospitalizations. We can’t be sure if any of the administrative changes are responsible for the reduction in hospitalizations.

We know there is an association between administrative changes and hospitalizations because the number of hospitalized students dropped after the changes went into effect.

However, since no randomized controlled trial was done, we can’t be sure of the reason for the reduction in hospitalizations. For example, maybe there were fewer hospitalizations because a new flavor of sparkling water came out in 2014, and people drank that instead of alcohol. We just don’t know enough to conclude any causal explanation for the reduction in hospitalizations.
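The size of the association can be worked out directly from the two counts in the problem. In the sketch below, the attendance figure is a hypothetical placeholder (the problem only says attendance did not change), used to show that with equal attendance, comparing counts is the same as comparing rates.

```python
hospitalized_2013 = 48  # from the problem statement
hospitalized_2014 = 8   # from the problem statement

drop = hospitalized_2013 - hospitalized_2014
relative_drop = drop / hospitalized_2013
print(drop, round(relative_drop, 2))

# Hypothetical attendance, assumed identical in both years.
attendees = 20_000
rate_2013 = hospitalized_2013 / attendees
rate_2014 = hospitalized_2014 / attendees

# The ratio of rates matches the ratio of counts when attendance is equal.
rate_ratio = rate_2014 / rate_2013
count_ratio = hospitalized_2014 / hospitalized_2013
print(round(rate_ratio, 3), round(count_ratio, 3))
```

These numbers describe how strongly hospitalizations and the policy changes are associated. They say nothing about which change, if any, caused the drop.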



Difficulty: ⭐️⭐️⭐️

The average score on this problem was 54%.


Problem 6


Problem 6.1

Propose a scenario where a randomized controlled trial (RCT) would be unethical.

Possible answer: An experiment studying whether smoking during pregnancy negatively affects brain development in children.

Although an RCT could in theory establish causation here, running one would be highly unethical: randomly assigning mothers to a smoking group would harm both the mother and her child. An observational study would be more ethical in this scenario.


Problem 6.2

Propose a scenario where a randomized controlled trial (RCT) would be impractical (impossible to run, but not for ethical reasons).

Possible answer: Conducting an experiment to determine if winning the lottery increases happiness in middle-aged adults.

It is not possible to provide the treatment (winning the lottery), as this is something that happens by chance and is not something we can control. When we cannot control how participants receive a treatment, an RCT becomes impractical.


