A Critique of Academic Processes

If a master player of a game is asked to change the game for the better, that person may question whether anything better exists. This skepticism is not based on a genuine consideration of alternative solutions to the game's problems, but on success validated by personal experience. Yet a person only has her or his own experience, plus observations of others. Now, here is the catch. If a person believes in her or his superiority with some justification, e.g., race-based, gender-based, publications-based, or course-evaluations-based supremacy, then that person is quite unlikely to respond to the demand for change.

The push for publications, the process that involves scientists as authors, editors, reviewers, faculty administrators, etc., and the culture around it are incentive-incompatible with efficiency in producing high-quality research. Some candidates for great research are screened out by poorly designed processes, which are justified as the best alternatives available. Yet those processes are often artifacts of tradition, or consequences of practical constraints from eras, e.g., the 1950s, when technological capabilities were far below those of the 2020s. I would like to elaborate through the words of a great scholar who is far more experienced than I am.

In three commentaries, Jan-Benedict Steenkamp, a distinguished professor of marketing, discusses the illusion of perfection in research (excerpt #1 below) and how it generates reliability issues. The quest for perfection (excerpt #2 below), which probably dominates academic processes, feeds into a culture that assumes errors in the process are negligible, even though systematic errors are the norm. The figure below displays the problem: compare the commonly used 5% type-1 error probability with the 89% retest failure rate in the critical-case sample tested by Open MKT, drawn from research published at top-tier marketing venues, e.g., the Journal of Consumer Research (JCR). Imagine the scale of the problem once other venues and out-of-sample publications are considered. The third commentary, on the costs of endless review processes (full post in #3 below), describes the resulting loss in efficiency in theory and provides examples.

Courtesy note accompanying the commentaries below: “If you enjoyed this, share it with others and follow me, Jan-Benedict Steenkamp, for more writing.”

Figure 1: Reliability of Research Output in Top-Tier Marketing Venues
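
The arithmetic behind Figure 1 is simple. Below is a minimal Python sketch, assuming matplotlib is available; the bar-chart layout is my reconstruction of the figure, not taken from it. It contrasts the nominal 5% type-1 error rate with the 89% retest failure rate implied by the Open MKT sample (40 of 45 direct replications failed):

# A minimal sketch (assuming matplotlib) of the comparison described in Figure 1:
# the nominal 5% type-1 error rate that published tests claim, versus the observed
# replication-failure rate in the Open MKT sample of marketing studies.
import matplotlib.pyplot as plt

total_replications = 45          # Open MKT high-powered direct replications
successful = 5                   # studies that replicated (11%)
failure_rate = (total_replications - successful) / total_replications  # ~0.89
nominal_alpha = 0.05             # conventional type-1 error probability

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Nominal type-1 error (5%)", "Open MKT retest failure rate"],
       [nominal_alpha, failure_rate], color=["tab:blue", "tab:red"])
ax.set_ylabel("Rate")
ax.set_ylim(0, 1)
ax.set_title("Claimed error rate vs. observed replication failure")
plt.tight_layout()
plt.show()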

1. REPLICATION CRISIS

“The thirst for perfection leads to p-hacking, as I wrote in a previous post. Moreover, the thirst for perfection means that papers that report statistically insignificant results will not even be submitted, let alone published, leading to the infamous file-drawer problem. The result of these two forces is that statistical significance in published research is inflated, and results may not be replicated.

I am not the first to note this. In 1966, Bakan already wrote in Psych Bull.: “that not only do journal editors reject papers in which the results are not significant, but papers in which significance has not been obtained are not submitted, that investigators select out their significant findings for inclusion in their reports, and that theory-oriented research workers tend to discard data which do not work to confirm their theories. The result of all of this is that published results are more likely to involve false rejection of null hypotheses than indicated by the stated levels of significance, that is, published results which are significant may well have Type I errors in them far in excess of, say, the 5% which we may allow ourselves.”

Sawyer and Peter (Journal of Marketing 1983) make the same observation.

How serious is the problem?
⛔ Open MKT has carried out 45 high-powered (i.e., with a larger sample), direct replications of marketing studies. Only 5 studies (11%) could be replicated.
⛔ Open Science Collaboration (Science 2015) undertook a multi-year study that replicated 100 scientific studies selected from three prominent psychology journals: Psych Sc., JPSP, and J. of Exp. Psych. The investigators found that while 97% (!) of the original studies had statistically significant results (p < .05), only 36% of replications did.
⛔ A follow-up study (Johnson et al. JASA 2017) reanalyzed these data based on a formal statistical model and concluded: “The resulting model suggests that more than 90% of tests performed in eligible psychology experiments tested negligible effects and that publication biases based on p-values caused the observed rates of nonreproducibility” (p. 1).

If these studies are not just outliers, the implications are serious. P-hacking and publication bias work together in an unholy alliance to undermine the ability to replicate published findings, which is THE hallmark of science. If we cannot replicate findings and theories, are we building a house on sand? How often have we not tried to twist and force our findings to be consistent with previous work?”
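
To see how the two forces described above, p-hacking aside, can inflate false positives all by themselves, here is a minimal simulation sketch of the file-drawer problem. All parameters (the share of true-null studies, the effect size, the sample size) are illustrative assumptions of mine, not estimates from the cited papers; the point is only that, once only significant results get published, the share of false positives among published findings can far exceed the nominal 5%:

# A minimal simulation sketch of the file-drawer problem (illustrative parameters only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_studies = 10_000
share_true_null = 0.90     # assumed share of studies testing a negligible effect
n_per_group = 30           # assumed sample size per condition
effect_size = 0.5          # assumed standardized effect for the real effects
alpha = 0.05

false_positives = 0
published_significant = 0

for _ in range(n_studies):
    is_null = rng.random() < share_true_null
    delta = 0.0 if is_null else effect_size
    control = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(delta, 1.0, n_per_group)
    _, p_value = stats.ttest_ind(treatment, control)
    if p_value < alpha:                    # only "significant" results get written up
        published_significant += 1
        if is_null:
            false_positives += 1

# With these assumed parameters, roughly half of the published significant
# results are false positives, far above the stated 5% level.
print(f"False positives among published significant results: "
      f"{false_positives / published_significant:.0%}")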

2. THE ELUSIVE SEARCH FOR PERFECTION IN ACADEMIC ARTICLES

“Try the following experiment. Take any article accepted for publication at any journal. Now submit it to another journal. What are the odds it will be accepted as is? Zero. There is even a pretty good chance it will be rejected. Our profession seemingly believes that its published articles are in fact not good enough to publish! Perhaps it is because each new reviewer can offer suggestions that will improve the article and, having been asked for their advice, feels compelled to give it. Even if true, that just returns us to the initial problem—no published article is good enough to publish. After all, every article can be improved upon. Perfection cannot be the standard. The problem seemingly lies in our inability to say an article is ‘good enough.’” This was written in an editorial by Matthew Spiegel, EIC of the top finance journal Review of Financial Studies. 

… We want our work to move the field forward in important ways and to change the behavior of managers or policy makers. We want our theory to be tight and our data to be the best possible, our methodology to be robust and rigorous, our a-priori (!) hypotheses to be statistically supported at p < .05. We want our findings to be substantively meaningful and not open to rival explanations. Causality is strictly supported and endogeneity is not there…

Now, here is the reality. If we would only publish work that checks all these boxes, any issue of a top marketing journal would perhaps contain one article, if we are lucky. 

Don’t get me wrong. I want perfection, CETERIS PARIBUS. The problem is in the ceteris paribus assumption. It never holds! The demand for perfection that characterizes our discipline (and almost every other discipline) has led to interrelated and overlapping dysfunctionalities:

⛔ p-hacking;

⛔ focus on statistical perfection (significance) rather than on substantive significance;

⛔ replication crisis;

⛔ study of narrow topics rather than big ideas;

⛔ review processes that go on and on and on.”

3. THE COSTS OF ENDLESS REVIEW PROCESSES

“The peer review process is crucial in the academic process of knowledge generation, testing, and dissemination. So, the review process is something to be cherished and nurtured. Unfortunately, all too often, the review process goes on and on. I have one paper that was published after 8 rounds. Another after 5 rounds. Another was rejected after 4 rounds. The list goes on.

It is my experience that after the first, and certainly after the second revision, the review process seldomly leads to dramatic (or even appreciable) improvement. The individual and societal welfare costs are considerable:

⛔ Especially for untenured faculty, endless review processes pose a real psychological and financial hardship. I suspect that they are also a key factor in explaining why so many faculty publish little after tenure.

⛔ Less acknowledged are the social welfare costs. All the time our field collectively spends on polishing papers without making a fundamental change in contribution is time not spent working on other ideas that can change society. Ellison (2002, p. 1025) writes in the J. of Political Economy: “Many young economists report spending as much time revising old papers as working on new ones.” This is no different for young marketing scholars.

In sum, the review process is crucial to ascertain quality. Lengthy review processes that go on and on? Much less so. Is it not time to do something about this? Dependent on the criticality of the issues (q vs. r quality in Ellison’s model), by pulling the plug (insufficient q) or accepting some imperfection (less than perfect r)?”


Conclusion


References:

Steenkamp, Jan-Benedict (January 2025), “Replication Crisis”, LinkedIn, https://www.linkedin.com/posts/jbsteenkamp_replication-crisis-the-thirst-for-perfection-activity-7284932308858466305-ELV7, Retrieved on February 2, 2025.

Steenkamp, Jan-Benedict (January 2025), “The Elusive Search for Perfection in Academic Articles”, LinkedIn, https://www.linkedin.com/posts/jbsteenkamp_the-elusive-search-for-perfection-in-academic-activity-7274801857418428416-zW0N, Retrieved on February 2, 2025.

Steenkamp, Jan-Benedict (February 2025), “The Costs of Endless Review Processes”, LinkedIn, https://www.linkedin.com/posts/jbsteenkamp_the-costs-of-endless-review-processes-the-activity-7292187579771174913-o-ax, Retrieved on February 3, 2025.

Open MKT, “Replications of Marketing Studies”, https://openmkt.org/research/replications-of-marketing-studies, Retrieved on February 2, 2025.


February 2, 2025 (Edited, February 5, 2025)