Last Monday the Center for Open Science (COS) formally launched TOP Factor, an alternative to the Journal Impact Factor (JIF) based primarily on the Transparency and Openness Promotion (TOP) Guidelines. For full and early disclosure, I’m Vice Chair of the COS Board, but my interest here is in situating this new initiative in the longer history of attempts to shift research assessment – in other words, to fundamentally change research culture.
The need for fundamental culture change
A recently released Wellcome Trust report, What Researchers Think About the Culture They Work In, makes sobering reading. Collectively, it paints a “shocking portrait of the research environment” according to Wellcome’s Director. A system that puts more value on research metrics and quantity than quality has led to unhealthy competition, bullying, harassment, and mental health issues.
“These results paint a shocking portrait of the research environment – and one we must all help change. The pressures of working in research must be recognised and acted upon by all, from funders, to leaders of research and to heads of universities and institutions. As a funder, we understand that our own approach has played a role. We’re committed to changing this, to foster a creative, supportive, and inclusive research environment.” Jeremy Farrar, Director of Wellcome
Rapid and wide-ranging improvement is required not only for researchers themselves but also because of the ways in which this hyper-competitive environment reduces the time available for thinking creatively, the likelihood that scientists will take risks to pursue their most imaginative ideas, and, in some cases, the reliability of results.
At the core of the challenges facing the research enterprise is a broken incentive system rewarding novelty and publication in a small number of highly selective journals. 29% of Wellcome survey respondents felt that publishers have a “high responsibility” for changing this culture, with another 42% assigning publishers a “medium responsibility”. But, as noted in the publication of the original TOP Guidelines, the situation is a classic collective action problem. Individual researchers lack strong incentives to be more transparent (and in fact, are rewarded for quite the opposite). Yet there is no centralized means of aligning individual and community incentives. Universities, funders, and publishers each unwittingly align to maintain the status quo because all of their incentive systems rely on journal name and/or quantity of publication to evaluate, fund, and promote researchers.
The case against JIF has been clear for many years: primarily that a factor derived from citations to all articles in a journal cannot tell us anything about the quality of any specific article, nor of the quality of work of any specific author. These points become even more evident when we understand the ways in which JIF can be manipulated (for example, by publishing more review articles). Despite all of these evident limitations (and the fact that JIF is widely acknowledged as flawed across many different types of publisher), JIF remains highly influential.
There have been a number of bold attempts to reduce reliance on JIF, or at least to provide alternatives for consideration alongside JIF. Most notably, the launch of the Declaration on Research Assessment (DORA) in 2013 aimed to promote robust and efficient ways of evaluating research and researchers that do not rely on JIF. While the DORA declaration has garnered an impressive number of signatories (1,870 organizations and 15,562 individuals), its progress in delivering meaningful change has been frustratingly slow (no doubt influenced by the powerful allure of the status quo).
The development of alternative metrics – altmetrics – has also sought to provide a viable alternative or complement to JIF. The most important advance with altmetrics is that they are article, rather than journal, based. There is clear evidence of the growing use of altmetrics – for example, within the UK Research Excellence Framework as demonstration of wider public reach. Altmetric themselves have interesting cases of how social media activity can support promotion and tenure, while other research suggests that the online activity measured by altmetrics may drive citations.
Along comes TOP Factor
Development of TOP Factor followed the work of the original TOP Guidelines committee published in 2015. At the outset, the committee established the eight standards and levels of stringency that correspond to the scoring (and which are maintained by a standing committee). The development of TOP Factor scoring has occurred over the last year internally at COS through testing and maturing a rubric for evaluating journal policies for adherence to the TOP Guidelines. The TOP Guidelines themselves now have over 5,000 signatories – organizations and journals that are expressing support for the principles — along with about 1,140 journals that have TOP-compliant policies (although the vast majority minimally so).
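To make the rubric concrete, here is a purely illustrative sketch of how a TOP-style score could be computed. It assumes each of the eight TOP standards is rated on a 0–3 stringency scale and that the ratings are simply summed; the standard names come from the TOP Guidelines, but the scoring function itself is a hypothetical simplification, not COS’s actual rubric (which is documented on the TOP Factor website).

```python
# Hypothetical sketch of TOP-style scoring: each of the eight TOP standards
# is rated from 0 (no policy) to 3 (most stringent), and the journal's
# score is the sum. The real TOP Factor rubric, maintained by COS, may
# weight or combine components differently.

TOP_STANDARDS = [
    "Data citation",
    "Data transparency",
    "Analysis code transparency",
    "Materials transparency",
    "Design and analysis reporting",
    "Study preregistration",
    "Analysis plan preregistration",
    "Replication",
]

def top_style_score(policy_levels: dict) -> int:
    """Sum per-standard stringency levels (0-3); unlisted standards score 0."""
    total = 0
    for standard in TOP_STANDARDS:
        level = policy_levels.get(standard, 0)
        if not 0 <= level <= 3:
            raise ValueError(f"{standard}: level must be 0-3, got {level}")
        total += level
    return total

# Example: a journal mandating open data (level 2) and encouraging
# study preregistration (level 1), with no other TOP-aligned policies.
example = {"Data transparency": 2, "Study preregistration": 1}
print(top_style_score(example))  # 3
```

Even in this simplified form, the design choice is visible: the score rewards the stringency of stated policy per standard, not the novelty or citation performance of published articles.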
TOP Factor assesses journal policies for the degree to which they promote core scholarly norms of transparency and reproducibility. The hope is that this new alternative to JIF may reduce the dysfunctional incentives for journals to publish exciting results with little regard for their reliability. Authors can use TOP Factor to identify journals that have policies aligned with their values and credit their effort to be more rigorous and transparent. Funders can use TOP Factor to assess which journals are most likely to support their policy mandates for grantees. And publishers can use TOP Factor to identify journals with progressive policies for inspiration, and monitor trends in policies by discipline. There are some clear benefits to this approach:
- Perhaps most importantly, this is the first rating that focuses on something other than the novelty or newsworthiness of results. (New discovery is of course central to the scientific process but focus on novelty alone has too often led to bold claims of groundbreaking advances that cannot be fully scrutinized because so much – including the peer review itself – remains closed.)
- It is a rating, not a ranking. (And in fact, seeing how their competitors are doing may encourage journals to improve their own practices.)
- At the TOP Factor website, users can filter scores on a number of different dimensions, which is important because practices are evolving differently in different domains (for example, economics is leading the charge in requiring transparency of data and code, while psychology has been more assertive in promoting preregistration).
- The scoring process and data behind TOP Factor are freely available and verifiable on the website (although the scoring itself still retains some level of subjective judgment).
COS is clear that TOP Factor is not a magic bullet and that its role is to complement other efforts to improve research culture and practice. But there are clearly limitations:
- Most significantly, it only evaluates stated policies – text on a journal website — but not enforcement of those policies, leaving it open to abuse by those who may simply state policies strongly but do little to enforce them effectively. To say nothing of the differential way in which journals implement the “same” policy. (This is one reason why the Research Data Alliance has been pushing for more standardized approaches to policy features and wording, and use of simple “action item” lists in those policies making clear when things are mandatory and enforced and when they are not.)
- It has the same limitation as JIF in that it is a journal-level metric and therefore does not have a direct implication about the transparency or rigor of any single article.
- A journal’s TOP Factor score doesn’t solve the problem of measuring the quality of research published in that journal. As Brian Nosek, COS’s Executive Director, notes, “It is important to remember that research can be completely transparent and terrible at the same time. Policies promoting transparency and reproducibility make it easier to evaluate research quality.”
Can TOP Factor really shift the culture?
The real question about TOP Factor – and any other attempt to reform research culture – is whether anyone who matters (promotion and tenure committees, funding agencies, etc.) will care about or use it. The lesson of past attempts is that this is an incredibly tough nut to crack. While DORA has been hugely successful in gaining signatories, it has been hard to get those signatories to actually change behavior – and that’s because behavior change is really hard.
Brian Nosek’s strategy for culture change acknowledges that “people are embedded in social and cultural systems” and that “those systems shape behavior by communicating norms…providing incentives…and imposing policies”. As such, any successful change strategy must focus on comprehensive, systemic reform and not simply individual behavior. One of the most powerful aspects of this model is the focus on disciplinary communities, which are, after all, where research culture is established and maintained.
Changes in research practice and publication in psychology provide a compelling example of the community-driven approach. Psychology’s replication challenges have been garnering headlines for a number of years. As a result, both leading scientists and professional societies have raised their expectations for transparency and rigor. These efforts have led to measurable changes in behavior, evidenced by:
- The proportion of articles in Psychological Science, the leading APS journal, carrying one or more open practice badges since their introduction in 2013. In 2019, 65% of articles were badged for open data, 50% for open materials, and 28% for preregistration. (Data from D.S. Lindsay, EIC of Psychological Science, 2015–2019.)
- The level of market penetration of the Open Science Framework in psychology. In a recent crowdsourced survey of 69 psychology departments, 35% of research faculty were found to have an OSF account.
- The rapid growth in adoption of registered reports.
Chris Chambers has documented the lessons he learned from the frontlines of the registered reports revolution. One of his key takeaways is that “should” arguments fail because they offer only judgment, not solutions. Registered reports break this stalemate by “turning the pursuit of high quality science into a virtuous transaction”. Earlier feedback in the research process helps to increase the robustness and reproducibility of a study, and guarantees an outcome-neutral assessment and publication of the final research. But perhaps Chris’s most powerful observation is that the need for culture change before any reforms can successfully be adopted is a red herring. Rather:
“In academia, culture is the shadow created by the machine of rules, norms, mandates and incentives that drive everyday decisions. If we want to fix the machine, it makes no sense to direct our efforts at the shadow. We must instead replace the parts, one by one, and eventually – if necessary – the entire machine. If we succeed, the culture will have changed, but only because we changed everything else.”
Perhaps one of the primary reasons why JIF retains its primacy is that it’s such an easy cognitive shortcut – I can make assumptions about papers and researchers without a lot of work. And this is also why it’s so hard to replace – in fact, impossible to replace with anything meaningful, for why should we expect any single metric to provide a reliable summary of an inherently subjective judgment? TOP Factor does contribute a new way to inform choices by researchers and, along with other tools such as the TOP Guidelines, expands the models and tools we have to work with. By themselves, none of these initiatives is wholesale reform, but they should be welcomed as pieces of multi-pronged attempts to nudge incentives in the publication process in the right direction, creating a publication process and system that is aligned with the true values of research and researchers.