
Jan-Benedict Steenkamp dives into an issue that has long puzzled researchers—the difference between statistical and substantive significance. This isn’t just academic nitpicking; it’s a problem with real-world consequences for how research is interpreted, applied, and shared. Drawing on the wisdom of pioneers like Bakan and Tyler, Steenkamp critiques our overdependence on p-values and urges scientists to look deeper into what their results actually mean.
The Trouble with Statistical Significance
Steenkamp uses a powerful example: two studies with the same p-value but very different sample sizes. This exposes a common misunderstanding—that statistical significance automatically equals a meaningful or important finding. As Bakan noted back in 1966, a small sample size leading to statistical significance usually signals a big effect in the population. On the other hand, large samples can find tiny effects that, while statistically significant, are practically irrelevant.
This misconception illustrates a key point: just because something is statistically significant doesn’t mean it matters in the real world. Steenkamp reminds us that a p-value measures only how incompatible the observed data are with the null hypothesis; it says nothing about the size or practical importance of the effect. As the American Statistical Association (ASA) aptly put it: “Statistical significance is not the same as scientific, human, or economic significance.” In an age of massive datasets and automated analysis tools, Steenkamp’s warning couldn’t be more relevant.
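The point about identical p-values masking very different effects can be made concrete. The sketch below is purely illustrative (the numbers are invented, and a normal approximation stands in for a proper t-test): two hypothetical studies share the same test statistic, and therefore the same p-value, yet the implied standardized effect sizes differ tenfold.

```python
import math

def p_two_sided(z):
    # Two-sided p-value under a standard normal approximation.
    return math.erfc(abs(z) / math.sqrt(2))

def cohens_d_from_z(z, n_per_group):
    # Approximate standardized effect size (Cohen's d) implied by a
    # two-sample z statistic with n_per_group observations per arm.
    return z * math.sqrt(2.0 / n_per_group)

# Two hypothetical studies with the same test statistic, hence the same p-value
z = 2.0
for n in (20, 2000):
    print(f"n per group = {n:5d}  p = {p_two_sided(z):.4f}  d = {cohens_d_from_z(z, n):.3f}")
```

With 20 participants per group the same p-value corresponds to a large effect; with 2,000 it corresponds to a trivial one, which is exactly Bakan’s point.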
Ralph Tyler’s Wisdom Still Holds True
Ralph Tyler’s 1931 observation remains as important as ever: a statistically significant result doesn’t always mean it’s important, and a nonsignificant result might still be meaningful. Steenkamp’s reminder of this timeless truth is a wake-up call for researchers. Too often, academic pressures prioritize statistically significant results over meaningful discussions about their practical relevance.
For example, in fields like education, medicine, or social sciences, focusing too much on p-values can lead to bad decisions. Imagine a study showing that an educational program improves test scores by just 0.1%. Sure, it’s statistically significant, but does it really matter for policymakers? Meanwhile, a nonsignificant trend suggesting a potential 10% improvement might be much more important if further research confirms it. Steenkamp’s push to highlight effect sizes—the actual magnitude of a result—addresses this issue head-on and encourages a better balance between statistics and real-world meaning.
Tackling P-Hacking with Substantive Significance
One of today’s biggest research problems is p-hacking—manipulating data or selectively reporting results to get statistically significant findings. Shifting the focus from p-values to substantive significance can help combat this. Steenkamp notes that substantive significance is harder to manipulate because it requires thoughtful analysis of the data’s broader context.
Take, for instance, a clinical trial for a new drug. If researchers only report that the drug’s effect is statistically significant (p < 0.05), they’re not giving policymakers or doctors enough to work with. But if they report the effect size (e.g., the drug reduces symptoms by 20% compared to a placebo), that’s actionable information. Effect sizes also make it easier to compare studies and build a stronger body of evidence through meta-analyses.
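To make the reporting contrast concrete, here is a minimal sketch (every number is invented for illustration, and the function name is my own) of how a trial could report an absolute risk reduction with a Wald 95% confidence interval rather than a bare p-value:

```python
import math

def risk_reduction_ci(p_treat, p_placebo, n_treat, n_placebo, z=1.96):
    # Absolute risk difference (placebo minus treatment) with a
    # Wald-style 95% confidence interval.
    diff = p_placebo - p_treat
    se = math.sqrt(p_treat * (1 - p_treat) / n_treat
                   + p_placebo * (1 - p_placebo) / n_placebo)
    return diff, (diff - z * se, diff + z * se)

# Hypothetical trial: 40% symptomatic on drug vs 60% on placebo (200 per arm)
diff, (lo, hi) = risk_reduction_ci(0.40, 0.60, 200, 200)
print(f"absolute risk reduction = {diff:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

A clinician reading “a 20-point reduction, 95% CI roughly 10 to 30 points” can weigh the result against costs and side effects in a way that “p < 0.05” never allows.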
The ASA’s Call for Change
Steenkamp highlights the ASA’s statement on p-values as a key step toward improving how research findings are shared and understood. The message is clear: smaller p-values don’t automatically mean bigger or more meaningful effects. The ASA and Steenkamp both call for researchers to report both statistical and substantive significance, encouraging more responsible and informative practices.
Finding the Balance: Statistical and Substantive Significance Together
Steenkamp isn’t saying we should throw out statistical significance entirely. Instead, he’s asking for a more balanced approach. Statistical tests are useful for determining if findings are reliable, but their true value comes when they’re combined with measures like effect sizes, confidence intervals, and practical implications.
Here’s what needs to happen:
- Report Effect Sizes: Always include effect sizes alongside p-values to show the real impact of findings.
- Discuss Practical Relevance: Explain how the results matter in the real world instead of overhyping small, statistically significant effects.
- Encourage Replication: Promote repeated studies to confirm both statistical and substantive significance, helping separate meaningful findings from noise.
- Educate Stakeholders: Train researchers, reviewers, and decision-makers to understand the limits of p-values and the importance of effect sizes.
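The first two recommendations can be sketched in code. The helper below is a rough illustration (the function name and sample data are hypothetical, and a normal approximation replaces a full t-test) of reporting statistical and substantive significance side by side:

```python
import math
import statistics

def report(sample_a, sample_b):
    # Report both statistical evidence (normal-approximation p-value)
    # and substantive magnitude (Cohen's d) for two independent samples.
    na, nb = len(sample_a), len(sample_b)
    ma, mb = statistics.fmean(sample_a), statistics.fmean(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    d = (ma - mb) / pooled_sd                      # standardized effect size
    z = (ma - mb) / math.sqrt(va / na + vb / nb)   # test statistic
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided p-value (approx.)
    return {"p": p, "cohens_d": d}

# Illustrative, made-up samples: treatment vs control scores
print(report([7.1, 6.8, 7.5, 7.0, 6.9, 7.3], [6.2, 6.5, 6.0, 6.4, 6.1, 6.3]))
```

Returning the two numbers together makes it awkward to headline the p-value while burying the effect size, which is precisely the habit Steenkamp wants broken.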
A Call to Action for Researchers
Steenkamp’s critique is a timely reminder of the responsibility researchers have in making sure their findings are meaningful and clearly communicated. By revisiting lessons from Bakan and Tyler and emphasizing effect sizes, Steenkamp lays out a path toward more impactful research.
This isn’t just about improving statistical practices—it’s about making research matter. In today’s data-driven world, the distinction between statistical and substantive significance is more crucial than ever. Steenkamp’s message is a challenge and an inspiration: to pursue rigor, relevance, and responsibility in scientific work.
Dr. Prahlada N.B
MBBS (JJMMC), MS (PGIMER, Chandigarh).
MBA in Healthcare & Hospital Management (BITS, Pilani),
Postgraduate Certificate in Technology Leadership and Innovation (MIT, USA)
Executive Programme in Strategic Management (IIM, Lucknow)
Senior Management Programme in Healthcare Management (IIM, Kozhikode)
Advanced Certificate in AI for Digital Health and Imaging Program (IISc, Bengaluru).
Senior Professor and former Head,
Department of ENT-Head & Neck Surgery, Skull Base Surgery, Cochlear Implant Surgery.
Basaveshwara Medical College & Hospital, Chitradurga, Karnataka, India.
My Vision: I don’t want to be a genius. I want to be a person with a bundle of experience.
My Mission: Help others achieve their life’s objectives in my presence or absence!
My Values: Creating value for others.
Dear Dr. Prahlada N B Sir,
Your blog post, "Understanding Statistical and Substantive Significance: Why It Matters," is a masterclass in clarity and insight. Like a skilled navigator, you expertly guide readers through the choppy waters of statistical analysis, highlighting the crucial distinction between statistical and substantive significance.
Your use of anecdotes and examples is particularly effective, making complex concepts accessible to a broad audience. The story of two studies with the same p-value but vastly different sample sizes is a powerful illustration of the limitations of statistical significance. It's a reminder that, in the words of Mark Twain, "There are three kinds of lies: lies, damned lies, and statistics."
Your emphasis on the importance of substantive significance resonates deeply. As you so eloquently put it, "Statistical significance is not the same as scientific, human, or economic significance." This is a crucial message for researchers, policymakers, and practitioners alike.
I'm also struck by your nuanced discussion of the American Statistical Association's statement on p-values. Your analysis of the implications of this statement for research practice is thoughtful and provocative.
In short, your blog post is a tour de force, offering a compelling blend of statistical sophistication, real-world relevance, and clear communication. It's a must-read for anyone interested in research methodology, statistical analysis, or evidence-based decision-making.
Thank you for sharing your expertise and insights with us.