Could Francis Bacon have used a better author credit attribution system? From the Chicago Tribune, 1916.

As science becomes more collaborative and more complex, there is a growing need for more granular assignment of credit and recognition to scholars. When researchers publish the results of their work, contribution is normally acknowledged by attaching their names to a scholarly article as “authors,” with the first author usually being the primary contributor. This system of credit worked well enough when papers had only a few authors. Increasingly, however, we see papers authored by teams numbering in the dozens or even hundreds. What does it mean to be one of a hundred authors? How can anyone outside the team know what any person on that list did to contribute to the effort of producing the result?

When author lists are that long, citations and references frequently shorten them to just the first author, with the rest consigned to the “et al.” In fields where authors are listed alphabetically rather than by level of contribution, the visible credit therefore goes to whoever happens to come first in the alphabet. If, as I expect, science continues to grow more collaborative, these problems will only expand.

Last week, an editorial in Nature highlighted the problem of the proliferating number of authors on papers. This is an issue for the authors themselves, who can receive short shrift for their contributions (or, alternatively, appear to have contributed more than they actually did). It is also a concern for grant funders reviewing application proposals and for university administrators who need to assess scholars for promotion and tenure.

In 2012, a symposium held at Harvard University, supported by the Wellcome Trust, brought together a diverse group of science editors, publishers, scholars, and others to begin a discussion of alternative contributorship and attribution models. The meeting identified a number of challenges related to the current state of attribution, including variances in attribution across disciplines, the growth in co-authorship, a general lack of clarity regarding “authorship,” the ambiguity of roles in co-authorship, and the difficulty of measuring output based on the current system.

The report from that meeting included a variety of recommendations. At the time, both ORCID and FundRef were just getting off the ground, and both were identified as critical components of an improved attribution system. Subsequent growth of the two systems has shown that they will greatly improve our ability to track and assess scholarly output in an automated fashion. The meeting report further suggested generating a taxonomy of creator roles and proposed a next phase that would seek to develop and test that taxonomy.

The follow-up work on developing a pilot taxonomy was undertaken by a small team led by Liz Allen (Wellcome Trust), Amy Brand (Digital Science), Jo Scott (Wellcome Trust), Micah Altman (MIT), and Marjorie Hlava (Access Innovations). After establishing a test taxonomy, the group ran an online survey to test the author role definitions they had developed. The results of the survey, described in the Nature Commentary, noted that of the 230 corresponding authors who responded, “82% of respondents reported that using the more-structured taxonomy of contributor roles presented to them was at least ‘the same’ as (37%) or ‘better’ (45%) in terms of accuracy than how the author contributions to their recently published paper had actually been recorded.”
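To make the idea of a “more-structured taxonomy of contributor roles” concrete, here is a minimal sketch in Python of how such roles might be recorded against a controlled vocabulary. The role names, the `Contribution` structure, and the validation step are illustrative assumptions for this sketch only; they are not the taxonomy the pilot group actually tested.

```python
from dataclasses import dataclass, field

# Illustrative controlled vocabulary of contributor roles. These names are
# assumptions made for this sketch; they are not the pilot taxonomy itself.
ROLE_VOCABULARY = {
    "conceptualization",
    "data_collection",
    "analysis",
    "software",
    "writing_original_draft",
    "writing_review_and_editing",
    "supervision",
    "funding_acquisition",
}


@dataclass
class Contribution:
    """One author's structured credit on a single paper."""
    name: str
    orcid: str  # persistent researcher identifier, as discussed above
    roles: set = field(default_factory=set)

    def add_role(self, role: str) -> None:
        # Restricting roles to a controlled vocabulary keeps contributions
        # comparable across papers, publishers, and disciplines.
        if role not in ROLE_VOCABULARY:
            raise ValueError(f"Unknown contributor role: {role!r}")
        self.roles.add(role)


# Example: recording who did what on a hypothetical two-author paper.
author_a = Contribution("A. Researcher", "0000-0000-0000-0001")
author_a.add_role("conceptualization")
author_a.add_role("writing_original_draft")

author_b = Contribution("B. Analyst", "0000-0000-0000-0002")
author_b.add_role("data_collection")
author_b.add_role("analysis")

for contribution in (author_a, author_b):
    roles = ", ".join(sorted(contribution.roles))
    print(f"{contribution.name} ({contribution.orcid}): {roles}")
```

Even a toy version like this surfaces the practical questions discussed below: the vocabulary has to be agreed on across disciplines, and someone has to assign the roles at submission time.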

Additional work on the authorship roles taxonomy is certainly needed, since a relatively small survey cannot capture the diversity and needs of all of scholarly communications. A system could be developed that improves attribution enough to satisfy a larger percentage of the research community. Consideration will also need to be given to how publishers and editors will incorporate the new taxonomy into their existing workflows. Finally, the practical question of who will assign contributor roles during the submission process will need to be worked out.

Further development work on the proposed taxonomy and the business processes that surround it is being considered as a joint effort between the Consortia Advancing Standards in Research Administration Information (CASRAI) and NISO. The effort would include greater outreach and engagement with publishers, administrators, funding bodies, and more researchers. A second workshop on contributor roles, sponsored by the Wellcome Trust, is planned for the third quarter of 2014, but full details are not yet available.

In an increasingly complex research world, clearly identifying the role each contributor played is important. Each contributor should be appropriately credited and receive due recognition for the level and type of their contribution. The community would benefit from carefully considering and applying more robust definitions of what being an “author” means in a scholarly context. The challenge will be to create a system that is not overly complex and whose benefits outweigh the additional effort of implementation.

Todd A Carpenter

Todd Carpenter is Executive Director of the National Information Standards Organization (NISO). He additionally serves in a number of leadership roles at a variety of organizations, including as Chair of the ISO Technical Subcommittee on Identification & Description (ISO TC46/SC9), founding partner of the Coalition for Seamless Access, Past President of FORCE11, Treasurer of the Book Industry Study Group (BISG), and a Director of the Foundation of the Baltimore County Public Library. He also previously served as Treasurer of SSP.

Discussion

18 Thoughts on "When a Scholar is One Among 500, What Does it Mean to be 'An Author'?"

This sounds like a complex taxonomic solution to an accountability challenge that may not be all that important. The first problem with this sort of solution is the burden it places on authors. Imagine the effort required, for the hundred-author paper you refer to, for all of those authors to agree on how each will be classified (if they can). It might be harder than writing the paper.

The second problem is that this solution only works if it is extensively adopted, by millions of authors, which may never happen. If adoption is relatively small then all the effort may be wasted. (FundRef use is growing very slowly.) Or are we talking about more mandates?

There is no doubt that implementing this would be challenging, which is why an approach that aims toward simplicity makes the most sense.

There are a variety of community practices where, if you thought about the challenge of getting millions of researchers to do something, you would be frozen in a state of inactivity. Adoption might be important only in those fields where collaborative authorship is expanding rapidly, such as the biosciences and physics. It might be less pressing in the humanities.

Characterizing FundRef adoption as “slow” at this stage seems to me an unrealistic expectation. The pilot report for FundRef was published just a year ago, and building these services into publishing systems requires development time (which is always scarce) at nearly any publisher. It appears that FundRef is picking up speed, judging from the FundRef deposit information: http://www.crossref.org/06members/fundrefdeposits.html

There are only about eight publishers making significant contributions to FundRef after about a year, while CrossRef has over 4,000 members. Some contributions have diminished to almost nothing; Elsevier, in particular, put in a test group last November and practically nothing since then. This is not what I would call picking up speed.

There is nothing wrong with proposing new regulatory schemes, but they have to be seen as such, and there are standard ways of evaluating them, especially simplicity, burden, cost, and value. I do not know which existing practices you are referring to, but I imagine they are obviously valuable. Can you give an example? Would citation per se be one?

I think you’re being a bit hasty in condemning FundRef. Journal submission and publication systems are complex and often need to accommodate a large number of publications and publishers, each with different needs and standards. The list of “participating publishers” is perhaps misleading, as major hosting platforms (such as HighWire, home to over 1,700 journals and over 130 publishers) are in the process of implementing FundRef, which should be completed in the near term. These things take time, and there is great interest in and support for FundRef from the publishing community.

I am not condemning FundRef, simply making an empirical observation. For that matter, 1,700 journals and 130 publishers is far from the scale needed to make FundRef data truly useful, except at the journal or publisher level. I am especially thinking about the US Public Access program and CHORUS, where incomplete data probably creates more problems than it solves. I call this the CHORUS conundrum. If an agency wants to find all the articles that flow from its funding, then partial data from FundRef may not be all that helpful. Universality is a tough case.

Given the timeline for most agencies to implement public access plans (not expected in some cases for years), FundRef still has a good amount of time to continue to build support and data. I suspect that the near term will see a great deal of growth.

CHORUS should not think that it has all the time in the world to gear up (and I have told them so). As researchers know, Federal programs can shift gears suddenly. The Energy Department, which funds more physical science than any other agency, is chomping at the bit to launch a system. Once that happens, the pressure can suddenly mount on the slower agencies, via a Congressional inquiry, for example. Time is of the essence.

I discuss some of this in a recent article in my newsletter Inside Public Access (behind a small pay-wall) titled “DOE Faces CHORUS Conundrum.” See http://insidepublicaccess.com/issues.html for a free synopsis.

I’m not sure this is an accurate assessment of the current situation, though I’m not at liberty to divulge further details. It should be noted though, that the White House has publicly stated that agency plans must include flexibility to avoid lock-in to one particular solution and to allow them to evolve over time as technologies change and different solutions prove more or less effective. From what I understand, there is little expectation that CHORUS (or any other potential solution) will provide 100% coverage of the entire literature at launch, and many agencies will likely adopt multi-pronged approaches, at least initially.

It is precisely the economics of a multi-pronged approach that I am concerned about. For example, if an agency first has to collect all of its accepted manuscripts in order to get completeness, then replace some of these using CHORUS in order to get some VORs, the latter becomes an extra expense. Nor is OSTP necessarily in charge in this regard, as Congress has the final word on how money is spent. More generally, once an agency develops the infrastructure to do things one way it becomes expensive to scrap that and go another way later. The situation with Public Access is very fluid at this point, not something that can be planned for.

I think the CHORUS FAQ says it will provide a complete solution for the agencies. They might want to change that.

It’s at the journal or publisher level that CHORUS is designed to function.

No Joe, what I meant was that a journal or publisher could use it to see who was funding its articles. But at the agency level it may have to be relatively complete across all journals and publishers in order to be really useful, say 90% complete. That is a tall order.

I’m late to the party here, but to set the record straight, CrossRef has 1,940 voting members that represent about 4,600 publishers and 35,000 journal titles from 76 countries. CrossRef membership has a very long tail. Six members (0.31%) have revenues of over $500 million, while 1,670, a whopping 86%, have annual revenues under $1 million, often zero. The tail is equally long when we look at geographic distribution: 30% of our members are from the US, 30% are from 6 countries (the UK, Brazil, Turkey, Spain, Canada, and India), and the remaining 40% are from 64 additional countries.

I point this out to illustrate that even if only 8 publishers ever participate in FundRef, there will be a heck of a lot of funding data available about scholarly publications, and much more than is easily available now.

But there is no reason to think that FundRef has stopped growing. About 40 publishers from 8 countries have signed up so far (meaning they’ve agreed to the rules of participating in FundRef), and almost 200 publishers have deposited some FundRef data, indicating an interest high enough to warrant technical experimentation. The participants show a fairly even distribution by publisher size. Many of the long-tail publishers represent individual journals in international academic departments that may not see a need for FundRef and may not receive papers with substantial external funding.

Participation in CHORUS has been an incentive for some (primarily US) publishers to move quickly to submit FundRef data to CrossRef, but it isn’t the only one. The FundRef Registry contains 6100 funders from private and public funding institutions throughout the world, including, but not limited to, US government agencies. For these publishers, implementing FundRef may be important and desirable, but it may not have the same immediacy and may be scheduled behind other development priorities.

Another issue worth pointing out is that unless CrossRef Member publishers add FundRef data to their backfile content in bulk (and several have done so), it will take time for the manuscripts that authors are submitting with FundRef data now to make their way through the peer review, production, and publication pipelines, even if all of the implementation issues were solved now and even if all publishers were on board. The funding agencies that are part of the FundRef working group understand this process.

I wonder if they considered something like the attribution system for movies. With that, we could quickly and easily know who directed and who starred but still not neglect the “best boy” and “dolly grip.”

Authorship is one of those topics that looks so simple when first approached, yet so complex when one delves into the history, norms, and technical reasons why we have such a system. With regard to taxonomies, Liz Davenport described a classification and XML tagging schema in 2001. The technology was available to implement such a schema at that time; perhaps it would be interesting to find out what happened to this project. (See Davenport, E., and Cronin, B. 2001. Who dunnit? Metatags and hyperauthorship. JASIST 52:770-773. http://dx.doi.org/10.1002/asi.1123)
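As a rough illustration of what metatag-style contribution markup could look like today, a contributor block might be generated with nothing more than Python’s standard library. The element and attribute names below are invented for this sketch; they are not the schema Davenport and Cronin describe.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical <contributors> metadata block. The element and
# attribute names are invented for this sketch, not taken from any
# published tagging schema.
contributors = ET.Element("contributors")

authors = [
    ("A. Researcher", "0000-0000-0000-0001", ["conceptualization", "writing"]),
    ("B. Analyst", "0000-0000-0000-0002", ["data-collection", "analysis"]),
]

for name, orcid, roles in authors:
    contrib = ET.SubElement(contributors, "contributor", attrib={"orcid": orcid})
    ET.SubElement(contrib, "name").text = name
    for role in roles:
        ET.SubElement(contrib, "role").text = role

# Pretty-print the fragment so it can be inspected or embedded in a
# larger article-metadata record (ET.indent requires Python 3.9+).
ET.indent(contributors)
print(ET.tostring(contributors, encoding="unicode"))
```

The hard part, as this thread suggests, is not the markup but agreeing on the vocabulary and getting authors to supply it.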

For anyone interested in reading a book on the history and legal issues surrounding authorship, Mario Biagioli (who will be speaking at the Spring STM conference next week) and Peter Galison wrote a great treatise on the topic (see Scientific Authorship: Credit and Intellectual Property in Science. (2003) M. Biagioli and P. Galison, eds. NY: Routledge). On a practical level, it is not surprising that authorship has different meanings across disciplines, and these differences lead to various conventions in authorship. Coming up with a unified taxonomy that will fit “science” may be conceptually very difficult. Concluding an elaborate argument about authorship, ownership, rewards, and responsibilities, Biagioli writes:

Scientific authorship is a misnomer, a historical vestige. It is not about legal rights, but about rewards. Similarly, scientific responsibility is not a legal category, but a set of relations among colleagues. As such, they cannot be conceptually unified under legal axioms. It makes sense, therefore, that scientific authorship, whatever shapes it might take in the future, will remain tied to specific disciplinary ecologies. (p.274)

Indeed, taxonomically it is a fascinating challenge, one I would be very interested in for that matter. My taxonomy of confusions might be useful in designing a taxonomy of contributions, which is basically what we are talking about. But from a standards or regulatory point of view the biggest problem is the learning effort that millions of authors would have to go through in order to implement it correctly throughout the community. Then too, making promotion or funding dependent on how (and by whom) one’s efforts are classified raises a host of problems.
