The State of the Version of Record

The “version of record” is an organizing concept in scholarly publishing. It is by reference to that version that others are understood and it is the object of financial models, policies, and recognition and reward systems. At the same time, many of the core functions of academic publishing – in particular, registration and dissemination – are decoupling from the version of record. Scholarly publishers are also expanding their remit to encompass other article versions, as well as other research outputs, and efforts to systematically link together and track these into a “record of versions” are growing. Today, I provide a landscape scan of the state of the “version of record.”

Version of Record Defined

The version of record is defined both formally, by industry organizations, and colloquially. In everyday conversation with scholars, librarians, and publishers, we often hear the version of record identified as “the publisher PDF” or even just “the PDF.” While a researcher may be willing to accept a different version, such as a preprint, when they ask for the PDF, no librarian asks which version they mean. We immediately look to locate the publisher PDF. We can see this conceptualization operating nascently as well when someone tweets about an article and then tags it “[paywalled]” even though there are other versions of the article that are open.

Crossref’s document on Version Control, Corrections, and Retractions defines the version of record as the “typeset, copyedited, and published version” and observes that version control is important for traceability, identifiability, clarity, reduced duplication, and reduced errors. While there are many potential versions prior to the version of record, there is only one version that follows sequentially.

NISO’s definition of the version of record further underscores the role of the publisher in the creation of the version of record. The NISO Recommended Practice on Journal Article Versions states that the version of record is “a fixed version of a journal article that has been made available by any organization that acts as a publisher by formally and exclusively declaring the article ‘published.’” Most notably, then, publishers have the agency in identifying the version of record. We know which is the version of record through a kind of “speech act” by the publisher – by declaring it the version of record it becomes the version of record.

We should also take a moment to address some potential confusions about the version of record. In the early days of electronic journals, when there were discrepancies between the printed version and the electronic version, the print version was typically considered the version of record in various style guides and the like. Today, the printed version of a journal issue – when one even exists at all – is no longer given this primacy.

Additionally, the version of record is not always the most recent version of an article. NISO identifies two follow-on possibilities (corrected and enhanced). Crossref presents a single category for all follow-on possibilities (updated), including retraction.

Finally, in the case of publicly available or open access articles, there is some additional nuance. For open access, the version of record may be gold and/or green – as the “color” of open access is not determined by version but by whether the copy is served by the publisher platform or elsewhere.

The Centrality of the Version of Record

What we can see then is what I would term the teleological and central nature of the version of record in organizing the scholarly record. As the version of record comes into being, it has the effect of fixing the earlier versions as predecessors to it. In addition, anything that comes after the version of record is defined relative to it. This centrality is reflected throughout the scholarly communications system in a number of ways.

Editorially Privileged

The version of record is editorially privileged in many citation and style guides. For example, the APA Style Guide to Electronic References stated that authors should “Update your references frequently prior to publication of your work; refer to the final published version of sources when possible.” More recently, the APA Style Blog directed that authors should “Ideally, use and cite the final, published version of a work. However, if you used the preprint version of a work, cite that version.”

In addition, the instructions to authors for manuscript preparation may include admonitions to not cite other versions instead of the version of record. For example, even in physics, a discipline that led on the development of preprints, the American Physical Society states in its Information for Authors that “citations to e-print archives should not be used in place of primary references.”

Prioritized by Funders

Funders have focused on the version of record in their open access policies. Even when an author accepted manuscript is accepted for compliance instead of the version of record, the requirement to provide an open copy is tied to the existence of the version of record and, when embargoes are allowed, to the timing of the publication of the version of record.

For example, the NSF Public Access Policy “requires that either the version of record or the final accepted manuscript in peer-reviewed scholarly journals and papers in juried conference proceedings or transactions (also known as “juried conference papers”) be deposited in a public access compliant repository designated by NSF; be available for download, reading and analysis free of charge no later than 12 months after initial publication.”

Some funders go further and express a preference for the open copy to be the version of record. For example, cOAlition S states that: “we are mindful that the AAM [author accepted manuscript] version differs from the VoR [version of record]. Not only does the latter contain all the changes from the copyediting process, journal formatting/branding etc., but it is also the version maintained and curated by the publisher, who has the responsibility to ensure that any corrections or retractions are applied in a timely and consistent way. For this reason, our preferred option is to ensure that the VoR is made open access.”

In addition to open access compliance, the version of record is also the locus of compliance for mandates related to ethics declarations, data availability, and funder recognition as well as documentation of contributor roles. Prior to the version of record, these elements are often fluid and/or not required.

Focus of Recognition and Reward Structures

As a faculty member myself, I am very aware of the focus on publications in promotion and tenure processes. For example, at my institution, our departmental annual review directions state “do not include works submitted but not yet accepted” and the dossier outline for campus says to list “in print or accepted” works. Other research institutions have more expansive parameters for listing preprints or in progress works; however, I know of none that would judge positively a tenure case that included only such works. And, when I am asked by other universities to write external evaluator letters for candidates being considered for tenure or promotion, the focus requested by the university is on published work. In my more than two decades of writing such letters, sometimes as many as seven in a given year, I have never been sent a preprint to evaluate.

It is also notable that the research assessment reform effort DORA, while recognizing the expansion of research outputs, organizes its efforts around the version of record. Specifically, the San Francisco Declaration on Research Assessment states that “Outputs other than research articles will grow in importance in assessing research effectiveness in the future, but the peer-reviewed research paper will remain a central research output that informs research assessment. Our recommendations therefore focus primarily on practices relating to research articles published in peer-reviewed journals but can and should be extended by recognizing additional products, such as datasets, as important research outputs.”

And, of course, institutional rankings and related processes are also heavily dependent on publications and citations to those publications. For example, the THE World University Rankings take significant account of research productivity and citations as assessed through Elsevier’s indexing and bibliometric tools, which take as their object a corpus of published articles, books, and conference papers.

Researcher Perspectives

Unfortunately, there is relatively little systematic investigation into researcher perspectives on the version of record; nonetheless, the two significant reports we do have find a similar pattern – researchers value the version of record but also find value in other versions for particular use cases. There are also instances where they are unwilling, or at least quite hesitant, to substitute other versions. This may be, at least in part, due to the editorial, funder, and other privileging discussed above; researcher preferences and actions are shaped by environmental factors, not only personal beliefs.

Exploring Researcher Preference for the Version of Record, a white paper from Springer Nature by Mithu Lucraft, Katie Allin, and Imogen Batt, presents a study conducted in the context of the Springer Nature syndication partnership with ResearchGate. The study reported that researchers prefer to read and cite the article version of record, which they believe is easier to read and more reliable as well as the most authoritative and credible source. Specifically, the survey found that 83% of respondents preferred working with the version of record for citing content in their own work. In contrast, 9% preferred author manuscripts and 2% preferred preprints. Open text comments evidence that researchers value the “stamp of credibility” that they believe is signaled by the version of record. Nonetheless, researchers do see value in other versions, which are particularly valuable for skimming, reading, and keeping up on the literature. Speed and ease of access are seen as a significant benefit of versions that are publicly available, particularly of preprints.

The study also reported that researchers are more likely to look for ways to find the published article than an author manuscript or preprint, including by attempting to contact the author directly. As I have noted previously, the open data set that accompanies the report also includes multiple mentions of Sci-Hub as a mechanism for getting copies of published articles, which underscores the drive to access the published copy, though this was not analyzed in the formal report.

What’s the Big Deal? – an inquiry into the emergence, growth, and decline of library subscriptions to large scale journal collections by Danielle Cooper and Oya Y. Rieger of Ithaka S+R – uncovered some important details of researcher perspectives on the version of record and other versions through examination of researchers’ experiences in the context of significant journal subscription cancellations.

Most significantly, respondents were clear that they prefer not to use preprints for most teaching purposes and discourage students from using preprints in their papers. These instructors comment on the difficulty students have with respect to understanding the context of a preprint. Specifically, the report shares that researchers find that “preprints add an extra level of complexity to the already challenging process of introducing students to the scholarly communications ecosystem. With the need to assign a limited number of articles that correspond to specific pedagogical aims, instructors therefore generally prefer to opt for those that they perceive as the easiest to navigate as a novice reader, which are peer reviewed and published in respectable journals.” The report continues, “Given that students are at an earlier point of experience with evaluating scholarly literature, the version of record was seen by our interviewees as the easier format for them to navigate.”

Researchers themselves also sometimes struggled to conceptualize the role of preprints, though this appeared at least somewhat influenced by disciplinary context. Cooper and Rieger observe that “Some scholars who did not come from fields with an established preprint culture also expressed confusion as to how to engage with preprints meaningfully … In fields where preprint practices are less established, interviewees frequently questioned their quality and expressed their ongoing commitment to the peer review process. There was also confusion among this group as to how a preprint relates to or can be linked to the version of record.”

In fields where preprints are more firmly established, researchers seem to have developed more nuanced perspectives, reflecting different use contexts even within the single use case of citation. This quote from a researcher included in the Ithaka report is illustrative: “I use preprints … primarily to stay up to date with the latest and greatest in my field. And that’s primarily directed towards when I’m working on my grants. Less so when I’m writing papers.”

Additional studies on this topic of researcher perceptions and practices are greatly needed. These two offer just a glimpse into scholars’ mindsets and approaches, which one suspects are evolving relatively rapidly, at least in some fields. Understanding how scholars conceptualize the role of the version of record and its relative value would be of great use to all players in the scholarly communications industry.

Financial Sustainability

This review of the state of the version of record would be greatly remiss if it overlooked the centrality of the version of record to the financial sustainability of the scholarly publishing industry. Subscriptions are payments for access to the version of record. Subscribe-to-open is a payment for assured access to the version of record. APCs and other open access funding frameworks (e.g., transformative agreements such as read-and-publish or pure publish agreements) operate by paying for the production of the version of record.

Of course, this financial model is threatened as components of the value proposition of the version of record decouple from it. Preprints are carving out the value of the role of the version of record in registration; public posting of preprints and author accepted manuscripts is carving out the value of the role of the version of record in dissemination. Certification and preservation remain tied relatively strongly to the version of record; however, in some fields being posted on a specific preprint platform shows glimmers of threatening the role of certification. Perhaps these value challenges are why we see an increasing turn to payment-to-publish rather than payment-to-read revenue models as well as publishers pivoting to workflow support and hosting preprints, data, etc. directly. Publishers are innovating in order to attempt to shore up their centrality.

The Future of the Version of Record

The version of record’s role as an organizing principle in scholarly communications is so deeply inherent in the system that dislodging it from that centrality would be a significant challenge and very destabilizing to the system as a whole. One can imagine a future in which research projects generate a document (or a set of documents) that is continuously revised and updated, perhaps over the course of multiple researchers’ careers, with evolving authorship and contribution markers. In such a future, perhaps there is no identified exclusive version of record, only a set of the most recent versions, and perhaps they are selected into overlay journals, with a given text possibly selected into more than one journal. It seems more likely that the version of record will be the locus of future innovations as the fixed nature of a version of record has proven utility – beyond its financial value – as an infrastructure of knowledge building, information, and recognition and reward systems.

Note: This post is adapted and updated from my plenary talk at the APE 2022 Conference: The Future of the Permanent Record.

Lisa Janicke Hinchliffe

@lisalibrarian

Lisa Janicke Hinchliffe is Professor/Coordinator for Research Professional Development in the University Library and affiliate faculty in the School of Information Sciences, European Union Center, and Center for Global Studies at the University of Illinois at Urbana-Champaign. lisahinchliffe.com

Discussion

35 Thoughts on "The State of the Version of Record"

Congratulations, Lisa, on a wonderful piece of work. What you write fits in well with what early career researchers are telling us, though in this grant we have not specifically asked them about the VOR [ciber-research.com/harbingers-2].
Anthony

By Anthony Watkinson
Feb 14, 2022, 5:44 AM

Fantastic post, Lisa! This chimes well with what we hear more and more from Crossref members and metadata users too. In what we’re dubbing the “Research Nexus”, we’re going to evolve from the rather strict ‘content type’ approach we’ve had historically to something much more organic and less container-led. People like ResearchEquals and Octopus are really pursuing this by publishing objects such as “idea”, “narrative”, “script”, and “outcome”. We’re also going to need to focus less on the objects themselves (text or otherwise) and more on identifying the relationships between the objects (such as “version”, “correction”, “component”, “translation” and even in future capturing “reproduction” or “refutation”). We’re actively developing a new technical infrastructure and metadata schema (item tree) that will support a less container(journal)-centric system of metadata exchange for our community. Your post here will definitely help us all think this through even more thoroughly, so thank you!

BTW the versions page on our site you link to needs revising and updating – thanks also for the prompt to look into that.

By Ginny Hendricks, Crossref
Feb 14, 2022, 7:52 AM

It will be interesting to track how this evolves. Will we apply the “version of record” concept to these micro-content pieces individually? Or, will we see them as upstream from the formally published journal article, similar to how we currently conceptualize preprints?

By Lisa Janicke Hinchliffe
Feb 14, 2022, 8:41 AM

An excellent post, thanks! A couple of situations in which the VoR is ambiguous:
1. a publisher participating in LOCKSS and Portico for preservation decides to abandon a journal, thus causing it to “light up” in both of those archives. Which is the VoR?
2. JSTOR and BioOne both provide full text journals that in some cases are also on the original publisher’s website (or whatever publisher owns them now). They have even registered their versions with CrossRef, resulting in those articles having not one but two DOIs for the identical citation. Someone using CrossRef to determine where the VoR is will not be able to distinguish unless they have “insider baseball” knowledge of what JSTOR and BioOne are.

In both of these cases, it theoretically won’t matter as long as the multiple copies are identical, but things can go wrong with servers over time (librarians think about preservation in decades or longer) and if the day comes that the two copies don’t match, what then?

By Melissa Belvadi
Feb 14, 2022, 8:06 AM

I think the NISO RP addresses this rather well — distinguishing between copies and versions: “The VoR may exist in more than one location (e.g., a publisher’s website, an aggregator site, and one or more repositories). That is, there may be more than one copy of a VoR but there is only one version of a VoR.”

By Lisa Janicke Hinchliffe
Feb 14, 2022, 8:32 AM

I do not quibble with “Most notably, then, publishers have the agency in identifying the version of record. We know which is the version of record through a kind of “speech act” by the publisher – by declaring it the version of record it becomes the version of record.” That is indeed the publisher’s prerogative. But, this does not permit a publisher to “formally and exclusively” declare an article as “published.” Something that is made public has been “published” whatever form it takes. Thus, the term “preprint” is merely the shortened form of “preprint publication,” and a comment on a preprint publication constitutes “post-publication review.”

By Donald Forsdyke
Feb 14, 2022, 8:20 AM

Interestingly only in that sentence in the NISO RP is “and exclusively” used. Everywhere else it just says “formally.” NISO has an active group charged with a revision of JAV (https://www.niso.org/standards-committees/jav-revision) — it will be interesting to see what revisions are made.

By Lisa Janicke Hinchliffe
Feb 14, 2022, 8:38 AM

Thank you for noting that the NISO Journal Article Versions Recommended Practice is being revised. A number of people for years had suggested that the JAV be updated, but it wasn’t until very recently that motivation by a team of volunteers coalesced enough to address the significant changes in the distribution of content.

By Todd Carpenter (NISO)
Feb 14, 2022, 10:06 AM

Thank you – really useful

By Monica Morrison
Feb 14, 2022, 10:58 AM

It would be amusing to rewrite this piece with a focus on scriptoria and manuscripts. What would then appear clearly is the desire to maintain the status quo. The first sentence of the “future of the version of record” demonstrates this desire very strongly.

When print came along, the certification system of scriptoria went out the window. We are now really entering the digital age, and the certification customs of print are going to be just as fundamentally challenged.

The first ones to understand this shift were computer programmers, and they instituted a record of versions. F1000 Research is doing the same. The “version of record”, if it has any sense at all, should limit itself to offering a temporary, relatively stable, foundation for an on-going conversation that never stops. I have called that “crystals of knowledge” in one of my papers.

By Jean-Claude Guédon
Feb 14, 2022, 11:09 AM

Interesting to think about. I notice though that even F1000 has the concept of “new article.” And, while an atypical implementation of the notion of the version of record in that it is pre-peer review, that “new article” is the published version from which all others are referenced.

By Lisa Janicke Hinchliffe
Feb 14, 2022, 11:32 AM

If I am not mistaken, the “new article” is a new problem thread, so to speak, and quite different from a version of record. About the versioning system, the thread about peer review, led by the regretted Jon Tennant (https://f1000research.com/articles/6-1151), provides a lot of discussion and performative examples of a versioning system propelled forward by a commenting system.

The “new article” differs from a version of record by the simple fact that it is an anchoring point for a line of discussion, comments, emendations, corrections. Not only does the “new article” evolve with time, but versions are referenced as well. The very notion of a “fixed” article is subverted here, and in a most interesting way. Of course, it echoes the versioning practises of computer programmers.
If F1000 Research were non-commercial, and without APCs, it would be the closest example I could imagine of an ideal platform for scholarly publishing. The publishing process would espouse the communication needs of scholars.

By Jean-Claude Guédon
Feb 14, 2022, 12:03 PM

A rose by any other name is still a rose.

A new article on F1000 meets all the criteria in the NISO definition of VoR — a new article is a fixed version, made available by an organization that acts as a publisher (F1000), which formally declares the article published. F1000 makes this very clear, e.g., “Are all articles in F1000Research ‘published’, even if they have not been peer reviewed?” “Yes, all articles are published irrespective of the peer review status. “Peer review” and “publication” are two independent concepts. Most journals peer review and then publish; we publish and then peer review.” (https://f1000research.com/faqs).

Also, any follow on versions are labeled “revised” or “updated” — i.e., they take their status from the relationship to the new article, which also signals that a “new” article is the version of record qua the NISO definition.

An additional observation that I had in an early draft of this SK post: Having a “version of record” is not incompatible with having a “record of versions.” These are not opposites of each other.

By Lisa Janicke Hinchliffe
Feb 14, 2022, 1:21 PM

A rose may have several names indeed, but a single name does not guarantee the unique and fixed character of the entity to which it refers. The “air” of pre-Lavoisier chemistry differs considerably from our notion of air. Journals in 1665 were not like journals in 1850, in 1935, in 1990 and in 2022. The same is true of the noun “article”.

The Tennant reference I sent in my previous comment shows that three versions correspond to this particular “new article”. Nothing indicates that the third version is the ultimate one. Versions are the symptoms of what a “new article” (in F1000 Research) really is: not a static object, but rather a process.

If versions referring to a “new article” multiply, how can their existence be reconciled with the requirement for article fixity? Publishing refers to making something public, not fixed. So, my point is not about publishing’s relationship to peer review; it is about publishing and “fixed version”. If a new version is labelled as an “updated” or a “revised” article, it looks to me as if the fixity of the “new article” is under deep stress (and I welcome this state of affairs).

To me the issue of fixity (and of version of record) is a consequence of the batch-production principle behind print and printed materials (even though print is not nearly as fixed as it implicitly claims). But, in 2022, we should probably look a bit more closely at what digital publishing entails.

I completely agree with your last paragraph. In fact, with two colleagues, we are working on a paper which, among other things, makes this very point. This paper may even take the form of a version of record… 🙂

In conclusion, let me reiterate that my point is about questioning fixity. Full stop. F1000 Research articles as they are published are open to constant revisions, corrections, and even refutations. They are not designed to be fixed. And I suspect that Vitek Tracz, the designer of F1000 Research, would heartily agree with me.

By Jean-Claude Guédon
Feb 14, 2022, 4:11 PM

The idea that a “fixed” VoR is not an “anchoring point for a line of discussion, comments, emendations, corrections” is not accurate. Indeed, science has evolved ways of encouraging all of these activities for hundreds of years–conferences, letters, commentary, review papers, retractions, special issues to discuss a particular controversy–all of these methods already exist and have for decades or centuries. In fact, the VoR itself acts as an anchoring point, and giving up the concept entirely would mean many boats drifting aimlessly on the temporal winds. I have done enough coding to know that records of versions contain numerous errors, forks, bugs, recursive loops, and redundancies. Treating biomedical research, for example, like computer coding could lead to very perverse and dangerous outcomes, particularly when patient treatments are based on the medical literature. To me, the 18% differential between COVID preprints and published papers is an enormously scary number, and one that virtually any physician would find unacceptable. That doesn’t even account for preprints never published and the large number of preprints that have proven to be either wrong or severely flawed because of bad data and/or significant sampling errors. This is even a problem in the published COVID literature, but it’s far worse among preprints. Different fields have evolved patterns and practices over time that work well for the field, as the comments here on SSH point out. Published case studies are used and valued in medicine, whereas preprints in physics have worked for decades. The idea that all fields should function in the same fashion and emulate computer science seems to be utopian and unworkable. Physicians have huge challenges in keeping up with the literature as it is–if every paper becomes a perpetual motion machine with no fixed conclusions, how will they be able to handle that flow?
I think the case studies are a valuable format for working physicians, one reason why they are used so extensively in medical schools to teach students and residents. Before leaping headfirst into new waters, it’s important to know how deep they are and how treacherous the currents may be. It doesn’t make sense to me to discard practices that have evolved in communities of practice over decades simply because a new idea might work for some fields of research. If indeed the “living article” concept is useful, I would expect to see a variety of fields adopt this method and prove its worth over time.

By Daniel Calto
Feb 15, 2022, 6:11 PM

On the contrary, anything acting like a VoR is an anchoring point for further discussions. In a published discussion with a Danish colleague, I recast this semi-fixed VoR as a “crystal of knowledge” to underscore the hard stability of a platform that can help launch a discussion thread, while reminding readers that crystals themselves can grow.
The designing of a good record of versions is indeed an important problem, and it is also a thorny one, but that is not a very good argument against it. Going to the moon was also difficult.
Records of versions should probably be tweaked for each large knowledge sector. No one is speaking about a one-size-fits-all solution here, and, in passing, the monotonous solution of journals with articles looks pretty monotonous to me.
In conclusion, I fully agree with Daniel Calto’s last sentence.

By Jean-Claude Guédon
Feb 16, 2022, 8:52 AM

The Economist reported (5th February) a UK study concluding: “The case for publishing in expensive, restrictive scientific journals continues to weaken.” It went on: “A team of researchers led by Jonathon Coates, a biologist at Queen Mary University in London, decided to analyse how reliable preprints were early in the covid-19 pandemic. They compiled a set of 184 research papers in the life sciences that had initially been posted as preprints on bioRxiv and medRxiv—two large preprint servers—and later published in one of 23 major scientific journals in the first four months of the pandemic. They compared each preprint with its more polished version that had later appeared in a journal. They looked for content that had been added or removed from the body of the manuscript, tables or figures that had been rearranged, and when key wording had been changed.

Dr Coates’s analysis found that 82.8% of coronavirus-related preprints and 92.8% of non-coronavirus-related preprints saw no material change to their conclusions upon journal publication.
https://www.economist.com/science-and-technology/preprints-on-the-coronavirus-have-been-impressively-reliable/21807492

By Ken Chad
Feb 14, 2022, 1:02 PM

Worth noting that it’s since been pointed out by Kent Anderson and also by myself that some of the manuscripts in this analysis, it turns out, were submitted to and accepted at journals *before* they were posted at preprint servers. (In these cases, naturally, there would be no change between journal publication and preprint). The extent to which this affects the analysis isn’t clear, as the authors haven’t yet thoroughly addressed the critique. Ironically, this point doesn’t seem to have been picked up in peer-review.

Also, this research may not tell us much about the intrinsic reliability of preprints, despite the Economist’s characterization of the study. That is because it selects for examination only those preprints that are published in journals. Many other preprints don’t make it into journals, and a contributing factor may be that they are less reliable than those that do.

By Richard Van Noorden
Feb 15, 2022, 9:38 AM

Indeed, there is no requirement that the version of record be substantially different in content from prior versions in order to be the version of record. In fact, we would really expect there to be only minor differences — and certainly not a change in conclusions! — between the author accepted manuscript and the version of record.

Side note: If I’m honest, I’m concerned that this study found so many pieces with material change in their conclusions. These data mean that 17.2% of coronavirus-related preprints and 7.2% of non-coronavirus-related preprints present to the public, in uncorrected form, information that is known to have been corrected. That means there is an almost 1 in 5 chance that a coronavirus-related preprint is presenting information that was later corrected with that correction not reflected in the preprint.

By Lisa Janicke Hinchliffe
Feb 14, 2022, 1:32 PM

> More recently, the APA Style Blog directed that authors should “Ideally, use and cite the final, published version of a work. However, if you used the preprint version of a work, cite that version.”

the way the concept of VoR is woven into the concept of citation is worth exploring further. resolving the tension between those two concepts seems to be a prerequisite for any significant VoR-related innovation.

as you note in the conclusion of the post, and as Jean-Claude Guédon suggests in comments above, a constantly evolving VoR is an interesting and sensible proposition in digitally mediated communication. a “record of versions” as you say, with the VoR simply being the latest version — akin to how wikipedia exposes a history of changes to VoR (e.g. https://en.wikipedia.org/w/index.php?title=Version_of_record&action=history) with a permalink to each of the previous versions. this sort of functionality would undoubtedly be useful in scholarly communication as well.

but this is where the concept of citation bares its teeth and we hit a wall — how would citations to all the different versions preceding the current VoR be counted? simply bundling the citation counts for each version within the record of versions seems unsuitable — each version is a unique document after all, even if only slightly different than its subsequent or preceding version. and yet if each version is treated as a unique and separate citable item, and citations remain to be separately tallied for each version, then that’s pretty much how it works now — we’d just end up with an order of magnitude more VoRs and a much harder time differentiating between them. not to mention the mess this would cause in bibliometrics, researchers’ bibliographies, etc.

so yeah, ultimately i’d agree that we’re unlikely to see the concept of VoR dislodged or modified in any significant manner — at least not within the currently prevailing system of scholarly communication.

By saša marcan
Feb 15, 2022, 4:25 AM

Many thanks for these remarks. At the same time, they reflect the fact that the print – syncopated – pace of publication is still very much in everybody’s mind. Once again, think about computer programmers: they do not worry about citing; they worry about working, programming, adding to what has been done before. Obviously, they will work from the latest available version, and, in so doing, will help create the following version. What is important here is to find an orderly way of evolving versions and keeping track of them. Incidentally, in programming, the phenomenon of “forking” happens. The same could be true of scholarship, but that is another question.

By Jean-Claude Guédon
Feb 15, 2022, 10:04 AM

Programmers don’t always work from the latest available version. They are even known to roll back to a previous version if things get really messed up! But, let’s leave that aside … yes, there are different practices in different communities and contexts. Journalists copy/paste from press releases without citation. Lawyers copy/paste from legal briefs without citation. I write letters, texts, etc. that I give to my Dean and other administrators for their use to draft reports to campus, nominate colleagues for awards, etc. without attribution to me. So, could scholarly practices change? Yes! Does it seem likely from the current state of the version of record and its centrality to so many infrastructures, expectations, systems, etc.? My assessment is no. But, let’s see what comes. Jean-Claude, I’m sure we’d love to have an SK guest piece from you on what you think should happen and what it would take to make it so! (Hint, hint, nudge, nudge!)

By Lisa Janicke Hinchliffe
Feb 15, 2022, 10:20 AM

Indeed, programmers do that, and scholars too. And they leave traces of their contributions in the readable code, if only because of accountability and responsibility issues.
When looking at the future of scholarly communication and publishing, one can adopt either of two attitudes: a) a short-range one that nudges the present and thinks within the present’s box; b) a longer-range one that tries to think about the affordances of the technologies, the social forces at work, the various inequities and other scandalous situations that presently prevail, etc. And then, the work is to begin the tracing of a route that aims at correcting all that.
As some of you may have noticed, I tend to bend in the direction of b), if only because I am angry and scandalized by what I observe in the present publishing system. But also because, with our new technologies, we could do so much better.

As for the invitation to contribute to SK, many thanks. This is the second time that I am so invited. However, I believe I contribute enough by reacting to some of the theses that seem to circulate all too easily in SK circles. I simply do not navigate in those circles except to mark my disagreements.

By Jean-Claude Guédon
Feb 15, 2022, 11:36 AM

i believe we’re on the same page here. i definitely see pull requests and forks supplanting citations in the long term, especially considering how ubiquitous coding is becoming in all disciplines.

it’s just that i don’t see it happening in my lifetime. the concept of citation that emerged in 1665 has been going strong for 300+ years — print itself for 500+ years — and its heritage and influence won’t be easily superseded no matter how reasonable and beneficial it may be for science itself. it’s simply too deeply ingrained in human culture to change in a generation or two, likely even more than that considering how profitable the current state of affairs and maintaining that status quo is for incumbent VoR peddlers.

By saša marcan
Feb 15, 2022, 5:14 PM

I fail to see the connection between citations and forking. I suppose if a new paper had a single reference to an original work whose research it intends to evolve, then yes, I think you can draw that parallel. However, papers tend to draw from multiple references.

By David
Feb 18, 2022, 9:48 AM

This is an excellent synopsis and exploration of a vital issue. From a Humanities & Social Sciences (History) perspective, 3 points are especially worth considering: 1) For the academic researcher/author, the VoR is often integral to research integrity and reproducibility: its “fixed” wording, references and pagination allow authors X, Y and Z to engage with author A’s article or book with confidence that they are all reading and responding to precisely the same text. Historians (for example) often cite the work of other researchers (as well as primary texts that underpin their analyses) word-for-word to build their arguments; their references/footnotes (of which there may be over 1,000 for a major monograph) refer to specific pages in author A’s text, allowing arguments to be checked for accuracy. The AAM is conspicuously less useful in this academic context than the VoR: its wording may be different due to copy-editing, its references may have been slimmed down, consolidated etc., and its pagination is fundamentally different. 2) We–all of us in H&SS–need to think more carefully about what the consequences of differential access to the VoR across the H&SS scholarly landscape are for equality, diversity and inclusion. Many academic presses have participated in charitable programmes that deliver free journal subscriptions (and thus VoRs) to, for example, universities in the Global South. If these are supplanted by ‘green’ AAMs, what will the impact on scholars in these institutions be? 3) It was great to see teaching addressed in this article! Helping students navigate the thicket of different referencing styles is already a challenge; multiple versionings add further levels of complexity. Many programmes rely on Turnitin to help detect plagiarism in student work: what do we know about such tools’ ability to manage multiple versions? Students are the researchers of tomorrow, and pedagogy needs to expand to include attention to these new frontiers of academic publishing.

By Margot Finn (UCL)
Feb 15, 2022, 9:13 AM

As a history-trained scholar myself, I fully subscribe to Margot Finn’s description of the work of scholars in the HSS fields. This said, when you quote from the 3rd edition of a particular book, rather than the 2nd or the 4th, you simply refer to various, semi-stabilised, VORs that have been set up within a clear Record of Versions. This is where I perfectly agree with Lisa Hinchliffe’s remark that VORs and ROVs are not incompatible. In fact, a Record of Versions calls for well-defined versions that are well recorded, and which, temporarily, may even play the role of VORs. In the print world, this plays out in a very staccato manner, but in the digital world, the outcome may be a lot more fluid.
With regard to the equity issue in poor countries or poorer institutions, one must evaluate whether having access to an AAM is worse than no access at all. Charitable programmes, so-called, do not cover the needs.

By Jean-Claude Guédon
Feb 15, 2022, 10:17 AM

Such a great example. Book publishing is really lagging behind in public sharing of versions. I personally would argue that each edition is a version of record (tho there are problems with extending a framework that was intentionally scoped to journal articles to other publication types). We have a version of record of edition 1, a version of record of edition 2, etc. Previous to each version of record edition are draft, copy pre-peer review, copy post peer review, an AAM, etc. I myself have published a chapter in a book that had an open, public review process so in that case there was a preprint (though it was taken down after the review period) — so I know it does happen that these versions are public but not typically. Will we see the kind of making public with monographs that we do with articles? It’s an interesting question that I’ve been watching for some time and will continue to do so.

FWIW, in general I think access to preprints and AAMs is typically better than no access at all but that in a just and equitable world everyone has the VoR (we just need to also figure out just and equitable means to provide such). But, if a preprint or AAM is erroneous, uncorrected, etc. – in substantial ways (as in the 17% of COVID preprints a study cited in another comment found), then in such cases that isn’t better than no access at all. Tricky to get a system set up that works!

By Lisa Janicke Hinchliffe
Feb 15, 2022, 11:00 AM

Thanks for this article, Lisa. I would like to highlight that ‘version of record’ matters in terms of the ‘record’ – that is, for citation purposes. There is an argument that we wish to keep the academic discourse ‘clean’ and not confused by different versions (while acknowledging the argument for a Record of Versions). But the ‘record’ represents a very small proportion of the actual readership of a work. Analysis I have done over five years of readership versus citation of works of a single publisher showed the citations represented only 0.5% of the readership. (This was work undertaken internally for a project that was not published – unfortunately in retrospect.) The pattern of downloads in that instance clearly showed the readership was by students. I suggest that in the case of students, the Author Accepted Manuscript – which is peer reviewed and corrected but may not have been copy edited – is adequate. So perhaps we need to put the ‘Version of Record’ into perspective? It is nomenclature which is being weaponised against green Open Access.

By Dr Danny Kingsley
Feb 15, 2022, 5:26 PM

Interesting observation, Danny. I think this might be true re students if AAMs were well labeled, but given they are often not and are on platforms with preprints (not yet reviewed), I am not surprised that the instructors in the Ithaka study are adamant that they want students to use the version of record.

By Lisa Janicke Hinchliffe
Feb 15, 2022, 5:39 PM

Excellent article, clarifying some of the issues around the version of record. Just one comment, which I guess is obvious, but worth thinking about. A PDF is “dumb” in the sense that it has no concept of anything that happens subsequent to its creation. A DOI, by contrast, can link to an updated version of an article that includes corrections or retractions.

If I ask a librarian for “the PDF”, how can I be sure that PDF will include any retractions or corrections to the article post-publication? Once a PDF is created and distributed, for example to an institutional repository, an outdated version may continue to be available, and there is no way of being sure it really is the “version of record”. We have accepted PDFs for so long that we start to ignore their (many) limitations. In a real digital universe, we should be sure every time we consult the VOR it really is the latest available version.

By Michael Upshall
Feb 16, 2022, 7:40 AM

Nb: in practice, publishers assign different DOIs to a retraction or correction statement, rather than using the same DOI and explicitly adding the retraction note on top. And (amazingly) in some cases one can reach the original article and not even see a note that the article has been retracted or corrected.

Also, the Crossmark system was supposed to solve this issue of ‘am I looking at the latest updated version’. But in my experience it often does not show updates that I know exist.

By Richard Van Noorden
Feb 16, 2022, 8:23 AM

Well, in reality, you won’t even get a PDF in all cases … not all publishers produce them. I don’t think DOIs are really “smart” either. But, a DOI could be used by some “smart” technology to link one to other versions (upstream or downstream, when there are such). That is the promise of linked data — not yet realized as others have pointed out — but significant progress has been made in recent years and there is a lot of interest in having that happen.

By Lisa Janicke Hinchliffe
Feb 16, 2022, 9:28 AM

What Lisa Hinchliffe is pointing out here (and I fully agree with it) is that we do not yet know what form digital publishing will eventually stabilise around. For the moment, it is largely concentrated on emulating the print world, which qualifies this phase as that of the “digital incunables” (thank you G. Crane). Most of the objections to a fluid, living-document, perspective stem from the difficulty of thinking outside the print box. But work is indeed being done. Think about Herbert Van de Sompel’s concepts, for example.

By Jean-Claude Guédon
Feb 16, 2022, 9:42 AM

I meant “incunabula”! … 🙁 The French came through despite my best efforts… Sorry about that.

By Jean-Claude Guédon
Feb 16, 2022, 9:43 AM

The Scholarly Kitchen
