The plural of “study” is not “conclusion”
Good Morning! Let’s continue talking about critical thinking! This is part of my Critical Thinking series of essays – you can find the link to 1 – 15 in the comments! Let’s go!
Essay 17: Why “Studies Show” Means Almost Nothing
Has anyone ever told you that “studies show coffee prevents cancer,” only for someone else to say “studies show coffee causes cancer”?
They’re both citing actual studies. So which is it?
The answer is: you can’t know from “studies show.” You need to know which studies, what they actually measured, how big they were, and what the full body of research says.
One study with 50 people that found a correlation between coffee and lower cancer rates tells you almost nothing. Maybe it’s real. Maybe it’s random chance. Maybe there’s a confounding variable they didn’t account for. Fifty people is not enough to know.
A meta-analysis of twenty studies involving hundreds of thousands of people that controls for confounding variables and finds a consistent pattern? That tells you something.
The plural of “study” is not “conclusion.” It’s “more studies needed.”
Sample size matters enormously. If I flip a coin three times and get three heads, I haven’t proven the coin is biased. A fair coin does that one time in eight. I’ve demonstrated that small samples produce weird results. But if I flip it a thousand times and get 750 heads, now we’re talking about something potentially real.
Same with studies. Small sample sizes produce unreliable results. They’re more likely to be flukes. They’re more susceptible to outliers. They can’t detect small but real effects.
When someone cites a study, first question: how many people were in it? If the answer is under 100, be very skeptical of strong claims.
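To put rough numbers on the coin-flip example, here is a minimal Python sketch. It assumes a perfectly fair coin, and the function name prob_at_least is just an illustrative label. It computes how often a fair coin would give at least a certain number of heads.

```python
from math import comb

def prob_at_least(heads: int, flips: int) -> float:
    """Chance that a fair coin shows `heads` or more heads in `flips` tosses."""
    # Count the outcomes with k heads for every k >= heads,
    # then divide by the total number of equally likely outcomes.
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favourable / 2**flips

three_of_three = prob_at_least(3, 3)        # 1 in 8: happens all the time
many_of_thousand = prob_at_least(750, 1000) # vanishingly small for a fair coin

print(f"3 heads in 3 flips:       {three_of_three:.3f}")
print(f"750+ heads in 1000 flips: {many_of_thousand:.3e}")
```

The first result is unremarkable. The second is so small that “the coin is fair and I just got lucky” stops being a reasonable explanation.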
Correlation is not causation. This gets repeated so much it’s become a cliché, but people still mess it up constantly.
Ice cream sales and murder rates both increase in summer. That’s correlation. Ice cream does not cause murder. Heat and more people being outside create opportunities for both ice cream purchases and violent crime. That’s a confounding variable.
When a study shows two things correlate, you need to know: is there a plausible mechanism for causation? Did they control for confounding variables? Is there any evidence of directionality – does A cause B, or does B cause A, or does C cause both?
Correlation can suggest causation. It’s a reason to investigate. It’s not proof.
“Average” can be wildly misleading depending on what kind of average you’re using and what the distribution looks like.
If nine people make $30,000 a year and one person makes $1 million, the average (mean) income is $127,000. But that doesn’t represent anyone’s actual experience. The median income – $30,000 – is much more useful here.
When someone cites an average, ask: mean or median? What’s the distribution? Are there outliers skewing the results?
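The income example is easy to check yourself. This short Python sketch uses the same ten made-up incomes from above and the standard library’s statistics module.

```python
from statistics import mean, median

# Nine people earning $30,000 and one earning $1,000,000 (the example above).
incomes = [30_000] * 9 + [1_000_000]

mean_income = mean(incomes)      # pulled far upward by the single outlier
median_income = median(incomes)  # the middle of the group, unaffected by it

print(f"Mean:   ${mean_income:,.0f}")
print(f"Median: ${median_income:,.0f}")
```

The mean comes out at $127,000 and the median at $30,000. Drop the millionaire from the list and the mean falls to $30,000 while the median does not move at all.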
Statistical significance doesn’t mean practical significance. A study can find that a drug improves outcomes by 1% and that result can be statistically significant – meaning an improvement that size would be unlikely to show up if the drug did nothing – while still being practically meaningless if the side effects outweigh a 1% improvement.
“Statistically significant” means the result would be unlikely if there were no real effect. It doesn’t mean the effect is large or important.
Absolute versus relative risk matters hugely. “This drug reduces your risk by 50%” sounds impressive. But if your baseline risk was 2 in 100,000, reducing it by 50% means your new risk is 1 in 100,000. The relative reduction is 50%. The absolute reduction is 1 in 100,000, or 0.001 percentage points.
Drug companies and supplement sellers love citing relative risk reductions because they sound more impressive than absolute risk reductions. Both can be true. One is more useful for decision-making.
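Here is the same arithmetic as a small Python sketch, using the 2-in-100,000 example from above. The function name risk_reductions is an illustrative choice, not a standard one.

```python
def risk_reductions(baseline_risk: float, treated_risk: float) -> tuple[float, float]:
    """Return (absolute reduction, relative reduction) between two risks."""
    absolute = baseline_risk - treated_risk  # difference in actual probability
    relative = absolute / baseline_risk      # that difference as a share of the baseline
    return absolute, relative

# Example from the text: risk drops from 2 in 100,000 to 1 in 100,000.
absolute, relative = risk_reductions(2 / 100_000, 1 / 100_000)

print(f"Relative reduction: {relative:.0%}")
print(f"Absolute reduction: {absolute:.3%} (percentage points)")
```

Both numbers describe the same drug. The headline will almost always quote the first one.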
P-values get misunderstood constantly. A p-value under 0.05 means that, if there were no real effect, a result at least this extreme would show up less than 5% of the time. It doesn’t mean there’s only a 5% chance the result is a fluke. It doesn’t mean there’s a 95% chance the result is true. It doesn’t mean the result is important. It’s just one measure of statistical reliability.
Publication bias is also a massive problem. Studies with positive results get published. Studies with negative results often don’t. So the literature is skewed toward whatever shows an effect, even if most studies found nothing.
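This is one reason good meta-analyses matter – they pool the available studies and can check for signs of publication bias, which gives you a more complete picture than any single paper. They can’t fully make up for studies that were never published.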
Replication also matters. One study finding something is interesting. Three independent studies finding the same thing is more convincing. Twenty studies with mixed results means we don’t actually know yet.
Science advances through replication and converging evidence, not through single dramatic findings.
When someone cites one study, ask: has this been replicated? What does the broader research show?
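“X percent of people” can mean very different things depending on the base rate. If 1% of people have a disease, and a test is 99% accurate (it gives the right answer for 99% of sick people and 99% of healthy people), and you test positive, you only have about a 50% chance of actually having the disease. Test 10,000 people: about 99 of the 100 sick people test positive, but so do about 99 of the 9,900 healthy people. That’s counterintuitive but it’s how the math works with rare conditions and imperfect tests.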
Understanding base rates changes how you interpret probabilities.
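The disease-test example can be worked through in a few lines of Python. This sketch assumes “99% accurate” means the test is right for 99% of sick people (sensitivity) and 99% of healthy people (specificity). Real tests usually have two different numbers, and the function and parameter names here are illustrative.

```python
def prob_sick_given_positive(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Chance that someone who tests positive actually has the disease (Bayes' rule)."""
    true_positives = base_rate * sensitivity               # sick and correctly flagged
    false_positives = (1 - base_rate) * (1 - specificity)  # healthy but wrongly flagged
    return true_positives / (true_positives + false_positives)

# Example from the text: 1% of people have the disease, test is 99% accurate.
rare = prob_sick_given_positive(base_rate=0.01, sensitivity=0.99, specificity=0.99)

# Same test, but the disease is common (an assumed 30% base rate, for contrast).
common = prob_sick_given_positive(base_rate=0.30, sensitivity=0.99, specificity=0.99)

print(f"Rare disease (1%):    {rare:.0%} chance a positive is real")
print(f"Common disease (30%): {common:.0%} chance a positive is real")
```

The test is identical in both cases. Only the base rate changed, and it moves the answer from a coin flip to near certainty.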
The questions to ask when you see statistics: What’s the sample size? Was it randomized? What are they actually measuring? How big is the effect? Has it been replicated? What’s the broader evidence?
“Studies show” is not an argument. It’s the beginning of a question: which studies, showing what, under what conditions, and do other studies agree?
Most people using statistics to make a point are cherry-picking the studies that support their conclusion. Your job is to ask about the studies they didn’t cite.