Some myths never die. They are propagated by those who see their rhetorical utility. Rarely does anyone stop to check their validity, since checking involves real work.
In a presentation at the 2009 Council of Science Editors Annual Meeting, Tim Ingoldsby from the American Institute of Physics took on a myth that has been repeated over and over in the case for self-archiving: that everything published in physics can be found in the arXiv.
This myth is important, because it allows one to make the often-recited argument that self-archiving doesn’t hurt publishers.
In his presentation, entitled “Physics Journals and the arXiv: What is Myth and What is Reality?”, Ingoldsby presented the first large, systematic study of article deposits in the arXiv.
Employing a summer intern, Ingoldsby conducted an arXiv search for nearly 5,000 journal articles published by the American Institute of Physics and the American Physical Society. Their methodology was painstaking: they searched for title variations and had a trained physicist repeat every unsuccessful search.
The percentage of articles found for each journal in their study varied greatly. While fields such as elementary particle physics and astrophysics showed nearly 100% overlap, this finding did not generalize to other sub-disciplines in physics. Many fields showed much less coverage, some under 5%. As Ingoldsby writes:
Only for a narrow range of sub-fields, representing at most 15-20% of physics, can it be said that the arXiv provides comprehensive coverage.
Moreover, Ingoldsby’s study questions the accuracy of article metadata in the arXiv. Fewer than one in three authors updated the metadata of their arXiv record with a full citation when the article was published. Since the title of a manuscript may change between submission and final publication in a journal, this makes it difficult for a reader to locate the version of record.
It would be naïve to believe that self-archiving has no effect on scholarly publishing. For some narrow sub-disciplines of physics, it has become part of the normal process of disseminating research findings. But generalizing the experience of publishing in high-energy physics to the entire domain of physics is a tall order, and generalizing further to all of scholarly publishing is an even greater stretch.
The myth that “everything published in physics is in the arXiv” needs to be replaced by more careful and limited statements of fact. Only then can predictions about the effect of self-archiving be made with confidence.