How Do We Make Research Assessment More Responsible? - A Multi-stakeholder Discussion

On January 27th, a diverse group of nearly 30 senior researchers, university administrators, funders, publishers, and representatives from other organizations with an interest in research assessment met via a virtual meeting organized by the Society for Scholarly Publishing (SSP) Publisher-Funder Task Force to talk about Responsible Research Assessment for the 21st Century. I was honored to be asked to moderate this, the first of two small group discussions on the theme of From Stuck Places to Solutions. The second one, which will take place later in February, will focus on the role of funders in the transition to open research.

While there is growing recognition of the need to improve and broaden the way research is assessed and evaluated, the question of how to go about transforming such a complex, multi-stakeholder system is a matter of some contention. With that in mind, our ambition was to create a safe space for frank discussion of the issues. The discussion was therefore conducted under the Chatham House Rule. In keeping with that, please be aware that the post below is simply a report on the discussion. There is no attribution, and the opinions expressed below do not necessarily represent my own or those of any particular attendee.

Is there even a problem?

The first thing I asked our attendees, after a brief round of introductions, was whether they thought that research assessment was currently being done in a responsible way. What we found was very telling. In our anonymous poll, most people answered ‘no’ or ‘I’m not sure’. Only one person felt confident that research is currently being assessed in a responsible way.

Why should we care about the quality of research assessment?

With such a diverse group of stakeholders from across the research ecosystem, it was perhaps unsurprising that they had a wide range of viewpoints on why research assessment is so important. During our discussion, the ideas centered around three main areas of motivation: 1) That responsible research assessment is critical for public trust in science; 2) That there’s a need to identify the best research so that the available funding does the most good; and 3) That it’s important to equitably evaluate the work of individual researchers and remove traditional barriers to participation.

Issues around choosing the right research to fund are somewhat related to earning and keeping public trust. It’s important that errors, poor practice, and downright fraud are identified. By the same token, better selection of good ideas, and rigorous, robust approaches for funding should lead to better outcomes, more knowledge and greater impact of research.

On the other hand, our third idea, that of fairness to individuals, might seem to be an orthogonal motivation. That is, not exactly contradictory, but a separate question entirely. So, is it a separate reason? Do we have a dichotomy at the heart of the drive towards more responsible research assessment between the needs of society to fund better research and the needs of researchers to be treated fairly? In the end, we decided that while these ideas are often treated as separate, they are, in fact, closely related.

We concluded that judging research, and by extension researchers, more fairly will inevitably lead to better, and sometimes more complete, research. The obvious example of this connection is the way that metascientists, like Malcolm Macleod and Emily Sena at the University of Edinburgh, identify links between the success of clinical trials and the quality of the basic science on which the experimental approach is based. During our meeting, we identified other benefits of fairness in assessment. For example, fairer research assessment will likely lead to a more diverse range of researchers getting jobs and tenure at well-funded universities and, in turn, to better research. It would also likely lead to more global diversity in terms of research interests, and give more attention to the needs of the Global South. In short, there was a real sense that the way we assess research has embedded inequities that can at times even be colonial at their core. This leads to the loss of good ideas, a loss of diversity of perspectives, and a neglect of the needs of certain communities.

Research is changing, but assessment isn’t keeping up

Mission-oriented research, the impact agenda, the UN Sustainable Development Goals, even the original moonshot, are all examples of a global policy trend towards targeting research and its funding at solving problems. The thing is, big, thorny, wicked problems (or whatever colorful adjective takes your fancy) don’t lend themselves to single individuals with a background in one, two, or even a handful of disciplines coming up with the masterstroke that puts an end to the issue. These challenges are more complex and require people from multiple disciplines, spanning the natural, physical, and social sciences and the humanities, to come together. Tackling climate change, for example, will require technological advances, social change, perhaps even new ways of conducting politics. Then there’s the COVID-19 pandemic and the broader need to improve our approach to infectious diseases so that we’re ready for the next threat. For that we’ll need to draw on immunology, virology, public health, computer science, and communications studies to understand how to limit the damage done by those working against any response.

Our group believed that current evaluation systems still emphasize individual impact to the point of discouraging collaboration. Some said it was baked into systems like tenure and grant awards processes. Anecdotally, and rather perversely, it seems like some of the most prestigious institutions discourage teamwork the most by overemphasizing singular achievement.

So what’s holding us back? Given the desire at the policy level to solve problems that rely on teams, why do we still cling to the notion of the singular research genius who can solve the problem on their own? Perhaps this is a question as complex and multi-faceted as the UN Sustainable Development Goals. A singular diagnosis is impossible. Some pointed to an overreliance on metrics, or on the wrong sorts of metrics, by funders and institutions alike. Others pointed to the conservatism of senior researchers and risk aversion in academia. Some even pointed out that those who have benefited from the old system resist change because new rules might not favor them personally.

It’s easy to regress into finger-pointing, with each stakeholder group blaming the other, but perhaps we’re all both part of and subjected to a system that no one stakeholder controls: we’re all to blame and none of us are. Researchers are incentivized to behave a certain way because that’s how they get published, how they get funded, and how they get promoted. Institutions hire and promote those who are good at achieving on those terms, and funders look to publishing and employment track records to assess the quality of researchers. Meanwhile, publishers compete for prestigious articles in the context of the current system and respond to what researchers say are their immediate needs. The situation was likened to something like a generalized prisoner’s dilemma or tragedy of the commons. There’s a sort of dark network effect at work here that perpetuates toxic properties of the system by punishing those who stray from the norm.

Can collective action overcome toxic network effects?

Collective action and strong, inclusive governance approaches were put forward as means to create structural solutions, that is, to try to change the rules of the game. There are already movements in that direction. Perhaps the best known is the San Francisco Declaration on Research Assessment, usually abbreviated to DORA, which has been signed by nearly 21,000 individuals in 153 countries at the time of writing. DORA is a set of principles and recommendations that focus on the need to eliminate the use of journal-based metrics, assess research on its own merits, and develop better measures of quality enabled by digital technology.

Other examples include the OPERAS research infrastructure and Knowledge Exchange’s work on the openness profile. There is also the Belmont Forum, a group of funders who support trans-disciplinary research that brings people together not only from different disciplines but also from the communities that the research aims to serve, a voice which was said to be missing too often from the conversation, especially from the Global South.

Despite these efforts, there was a strong feeling that more needs to be done not just within stakeholder groups but across them as well to put principles into action.

The need for leadership

Strong leadership by example was also recommended. One reason suggested for a lack of experimentation in research assessment is fear of being first to break from the norm, and the risk of getting it wrong, especially among institutions and funders that have very prestigious reputations and potentially a lot to lose. This is where courage is important. For change to begin, somebody must lead and for change to stick, others must follow and build on what has gone before.

Good examples of this kind of leadership include work done by UKRI and NWO on narrative CVs. Some institutions are being bold as well: the Technical University of Denmark is working on Open Science Research Profiles, TU Delft is pioneering a cooperative self-assessment approach to research quality assurance, and of course there is the pioneering work at Leiden University. Complementing these efforts are innovations in infrastructure like the CRediT taxonomy and RAiD, the research activity identifier from ARDC. There are more leadership examples listed on this page compiled by Rachel Miles at Virginia Tech.

So what’s next?

The round table was a refreshingly honest and engaging discussion on the issues surrounding responsible research assessment, but it was very much a starting point. Our goal for the event was to create both a safe space for discussion and a level of shared understanding of the fundamental issues. I believe we achieved both of those goals. We also reached a conclusion that there is a need for even more and stronger cross-stakeholder collective action to accelerate progress on this issue and strong leadership that builds on what has come before.

The challenge will be operationalizing those ambitions. From this starting point, there’s a need to generate a broader understanding of these problems and opportunities across all of our communities. We need an appropriate sense of urgency, given the scale of the sustainability and existential challenges that we face as a civilization. Perhaps we now have the beginnings of a guiding coalition to act as a catalyst for change. If we can find a way to grow that coalition, encourage and support bold leadership, new standards, strong governance and collective action we might finally be able to accelerate progress towards a more responsible way to assess research, a fairer way to treat researchers, and a better way to decide how to spend our limited resources for the maximum benefit to society and its people.

Acknowledgements

The Roundtable on Responsible Research Assessment for the 21st Century took a lot of planning, organization, and tireless preparation from all the members of the Publisher-Funder Task Force. Particular thanks go to event organizers Jennifer Griffiths of Springer Nature and Shannon Cobb of the Canadian Institutes of Health Research (CIHR). Thanks also to the official scribes for the event, Tushar Shakya of Canadian Science Publishing and Lori Carlin of Delta Think.

Special thanks also to SSP program director Mary Beth Barilla for keeping us on track so that the event actually happened, and to Sarah Viehbeck of CIHR for co-hosting alongside Jennifer. Thanks also to Adrian Stanley, who, as co-chair of the Publisher-Funder Task Force, voluntold me to join the group, and to SSP president Alice Meadows for her support and for introducing the event.

Adrian, Jennifer, Mary Beth, Tushar, and Nick Campbell of Springer Nature all reviewed and provided input to this post. The opinions and ideas presented above are a synthesis of those attending the meeting and should not be attributed to any individual.

Phill Jones

@phillbjones

Phill Jones is a co-founder of MoreBrains Consulting Cooperative. MoreBrains works in open science, research infrastructure and publishing. As part of the MoreBrains team, Phill supports a diverse range of clients from funders to communities of practice, on a broad range of strategic and operational challenges. He's worked in a variety of senior and governance roles in editorial, outreach, scientometrics, product and technology at such places as JoVE, Digital Science, and Emerald. In a former life, he was a cross-disciplinary research scientist at the UK Atomic Energy Authority and Harvard Medical School.

Discussion

1 Thought on "How Do We Make Research Assessment More Responsible? – A Multi-stakeholder Discussion"

Thanks for this great write-up Phill. I feel incredibly lucky to have been able to join this discussion – publishers and funders have a lot of shared interests, but it’s not that often that they sit down together (albeit only virtually) to discuss them so meaningfully. This group was thoughtful, engaged, and focused on potential ways of working together to effect change. I hope this will be the first of many discussions – and my thanks to all who organized and participated.

By Alice Meadows
Feb 9, 2022, 10:11 AM

The Scholarly Kitchen
