The most important technology in the world of online dissemination of scholarly materials today is the Hewlett-Packard printer.
For users of Canon, Lexmark, or printers of any other origin, worry not — the HP printer is here merely symbolic.
What it symbolizes is that, for the most part, online scholarly communication is not digital and, paradoxically, not even online.
Welcome to the world of Web 0.5, where the Web meets the printer. For all the talk of Web 2.0 and beyond, Web 0.5 is where the academy is for the most part today.
These thoughts are prompted by the recent announcement that Google is now making about one million public domain books available in the ePub format (in addition to PDF). ePub is an evolving standard with a lot of publisher support behind it, and it speaks to the goal of a publisher to create an electronic document in a “neutral” format, which can then run on any viewing device. Publishers could save a great deal of money if ePub were to prevail, and users would have the benefit of not being locked in to a particular format or device. For example, at this time the Amazon Kindle uses a proprietary format, which gives Amazon enormous influence on how digital books are bought and sold. Not surprisingly, publishers would like to check Amazon’s enormous and growing power.
Let’s think about how we use online material. We begin a search for something in any number of ways: a link or citation in an article we are reading, a comment from a colleague, a Google search, a reference in Twitter, and so on. We find the article online and skim the abstract. If it looks interesting, we download the article and print it out. It’s awkward to read online for multiple reasons; hence, the importance of printing.
With new e-reader technologies, however, the printer is mostly bypassed, as online text is now disseminated to be viewed digitally.
Google’s announcement is yet another step in the migration of digital reading from PCs, whether desktops or, more likely, laptops, to other devices: netbooks, iPhones, iPods, mobile phones, and a growing number of dedicated ebook readers. (As someone remarked on a Twitter feed I saw recently: “Another day, another ebook reader.”) This creates some problems for publishers. First, which of these many devices should a publisher support? Second, what about all the money I’ve sunk into creating a database of books and articles in PDF format? Will I be able to use them?
Different users will have different requirements, but my view is that PDFs are not suitable for sustained online reading. They are good for finding things (they can be indexed by Google, for example) and skimming abstracts, and they can conveniently be mounted on a Web server. But as a reading format, PDFs leave much to be desired — they are hard to fit to a screen; if their size is altered, the text looks grainy; and some people find them sluggish to use (slow to load, etc.).
All these problems with PDFs on PCs are multiplied when a user switches to a handheld device. Even netbooks, the fastest-growing category of computing device, with screens much larger than those of mobile phones, display PDFs poorly. Oddly, it is easier to watch a feature film on an iPhone than to read a scholarly article in PDF.
The challenge for publishers of scholarly materials is that all that work, all that investment, in creating forests of PDFs, is now going to have to be supplemented with new investments in retrofitting the scholarly archive. Publishers who are already working in XML can breathe easier, but the amount of material that simply won’t comfortably and inexpensively migrate to the new generation of reading devices is very large. And it is growing, to judge from the plans many people have of going even deeper into PDF production.
The point here is not the limitations of PDF, an outstanding technology for certain uses. It has a long life ahead of it. The point is that academic publishers may not be forecasting the need for ongoing investment in their materials. I call this phenomenon “once and for all computing” — the belief that once you get something into a digital format, your work is done.
To use a baseball analogy, there are nine innings to the game, and this is simply the second. You have a lot of pitching before you.