This is not a post written with any confidence that you’ll enjoy reading it. The topic is fuzzy, the trend unclear, and the implications difficult to grasp. It basically boils down to the increasingly apparent fact that publishing in the digital environment is an expensive, time-consuming new way to publish, despite early desires for it to be cheap, fast, and easy.
And that’s a disappointing reality.
While once thought to consist merely of “digitizing content” and derived from concepts firmly planted in the library/desktop use-case, digital media in scholarly and scientific publishing has evolved significantly as new devices have enabled new use-cases. This has forced publishers to address the use-case of the laptop, the e-reader, and the mobile device. It has also forced publishers to acknowledge more directly the concept of workflows. And, as consumer media has ratcheted up expectations with each Apple iOS or Google Android upgrade, scholarly publishers have found themselves forced to follow suit to some degree. A world of maintenance, upgrades, and software revisions has followed, with expensive expertise, significant costs, and new roles at publishing houses.
The fungibility of print is something we have yet to recreate, for print has an odd chameleon quality when it comes to use-cases. Desktop use-case? Print works. Mobile use-case? Print works. Laptop use-case? Print works. Reader use-case? Print works. No amount of X (XML, XSLT) seems to match its flexibility for reading, which is still a major activity in the research workflow, as the continued dominance of the PDF attests. At a recent meeting with scientists, one forward-thinking digital native reached out tentatively for a print journal lying on the table, picked it up, opened it, and marveled at the high-yield reading experience he found, using terms like “high throughput” and “information density” to describe the results of careful layout and typography of a print page spread. The iPad and laptop nearby couldn’t match the experience. There is a great deal of value still delivered by traditional page layout.
To achieve similar portability as users, we invest hundreds of dollars in devices. Yet we don’t acquire similar information density, or as some users term it, “high-yield” information experiences. Online information experiences are still frustratingly low-yield in many ways.
For publishers, one of the most expensive and time-consuming tasks is designing and platforming content to work across this broad set of devices, each with its own putative use-case and attempt at yield. Making each environment work in sync with the others is an even more complex process.
Mobile is currently the biggest area of expanding user preferences, so it’s no surprise that it is an expensive way to deliver content. It is also rapidly changing and full of churn. A recent article on the “Talking New Media” blog delved into the issue, quoting one unnamed publishing executive saying:
We waste more time fixing something that wasn’t broken in the first place. Worse, readers blame us when suddenly the app doesn’t work. We didn’t do anything, Apple did!
Complexity is the new normal, as all aspects of modern publishing, from authorship to editing to publishing to reading, have become more complex and involved. Reviewers have to remember scads of passwords across the multiple journals they help, and online review systems are themselves complex and intricate to learn and use. Editorial markup and workflows are more complex now. Metadata, article identifiers, feeds, downstream deliveries, and APIs all add complexity to each publishing event.
On the commercial front, monitoring subscriptions, customer data, financial data, creative materials, and marketing campaigns are activities that are orders of magnitude more complicated, time-consuming, and involved than in the past. There are some benefits to the complexity — more measurable results and some mysteries removed — but the overall system is overloaded with caution and complexity. To contemplate an advertising or marketing campaign involves not only creative work but many technical and data integrations. The scale of effort leads to caution and robs businesses of speed. There are simply not enough months in a budget year to accomplish all the marketing and advertising planned given how much time and effort each campaign takes to conceive and execute.
This complexity leads to a slower and lumpier commercial environment, one that favors bigger publishers and bundled deals. Making a purchase of any kind — for institutions or libraries, for advertisers, for aggregators, and even for individual subscribers — is now so much more complex that it’s better to get a lot for each transaction. Smaller, more frequent transactions are no longer as easy to make or even to contemplate, and justifying them is more and more difficult. Digital seems to be tilting the tables even further toward consolidation and bundled businesses.
The reality of digital publishing is proving to be quite different from the early promise. I say this as a member of the cohort that embraced it headlong in the mid-1990s and onward. The levels of complexity, the endless revision cycles, the uncertain commercial environment, the bilateral purchaser-seller costs which make transactions less frequent and more difficult, and the lingering misperception that all this can be made cheaper, faster, and easier with more technology — this is where we seem to be.
It’s a confounding position.
Where do we go from here? I’m not certain. I think an important starting point is to acknowledge that we were wrong in thinking that online or digital publishing could be cheaper and simpler than print publishing. Paper and ink were actually easier to buy and distribute than polished and well-placed pixels. Print advertising was more scalable. Print archives were easier to manage. The postal services were more consultative with publishers than Apple or Google. The ecosystem was less costly to operate within.
Once the expense and complexity are acknowledged and not pushed aside by the continuing hope that it will someday get easier, then we might be able to bring ourselves to say “No” to some trends, to find our proper technology solutions, and to stop chasing the upgrade monkey.
I think that’s all I have for this release of thoughts. At some future date uncertain, I’ll return with an upgrade. After all, that’s how the world works now.
36 Thoughts on "Confounded Complexity — Pondering the Endless Upgrade Paths of Digital Publishing"
Great post, Kent! I enjoyed reading it and it gives us lots of space to have a discussion!!
This post reminds me of a conversation I had YEARS ago (10?) with a book publishing executive. We were working on the transformation of our print textbooks to more appropriate forms for a digital environment and we could not see eye to eye on anything! When I reflected on the conversation I realized it was because he was thinking in a linear classic workflow process and I was thinking from an iterative technology perspective. For him “release” required perfection because, once released (printed), there was no fixing anything. I didn’t think that way. To me, there was no such thing as perfection because the target was in motion – to aspire toward it was delusional and a waste of time and money. You needed to get that 80-90% product out and then iterate on it based on customer feedback and usage.
Yes the process is iterative, but iteration isn’t always cheap or easy (especially at larger scales). It’s also even harder if you’ve constructed the environment in which you are iterating based on legacy constraints and thought processes. (One simple example of this would be applying the same rigor to a blog post as you would a peer-reviewed journal article.)
Even software development has gotten more complex in the past decade, also having to deal with different devices, browsers, versions of browsers, performance issues, etc. Managing constant change has been an issue for many industries.
No one ever thought software development was easy or cheap (and that is the model we basically follow now) – but the complexities and costs are simply in different places.
In a textbook, a medical textbook, or for that matter many STM publications, 80-90% accuracy is a disaster and unacceptable.
I am out of the game now but was on campus the other day, talking with a prof about publishing. He wanted to know what happened to the number 10 envelope and flyers advertising new books and journals. He said those things really helped him keep abreast of what was new. I asked why he didn’t look at social media or web advertising. He said those take time and effort on his part, and those are things he just does not have!
I think we should listen to our audience, and that audience seems to want us to keep print alive and to deliver on paper. Just as the forward-thinking digital native observed, it really serves a purpose and it works!
I am not a Luddite, and I think digital and print can and do coexist, but I am not sure that just because we can gather data or quickly revise or what-have-you, it is necessary to do so.
As a mentor one time said to me: In the end, at least in business, sales cures all ills.
Thanks, Harvey. This highlights another interesting point: when discussing most topics, clarity is very difficult, since we all interpret statements based on our own unique frame of reference. Let me be more clear.
The 80-90% I referred to above did not mean 80-90% accurate from a content perspective. I have spent most of my career working with some of the best medical publishers on the planet and accuracy in content contents (say that 10X fast!) is of the highest priority and must be ensured as much as humanly possible. The 80-90% refers to a broader measure of quality which also includes the features and functions that support content. Additionally, accuracy is only one dimension of quality – accuracy is not the same as quality. Completeness, for example, is another measure of quality.
Quality is an ax that has been used to kill many a good idea prematurely (mind you some of those ideas deserved to die, but in the right time, with the right information, and in the right context). My comment about the blog post referred to the fact that there are different forms of communication and that different measures of quality apply to them (NOT accuracy which is less negotiable).
The only way for any organization to truly manage quality is to define what it means and determine how those elements are measured with regard to various communication methods. If that effort doesn’t take place you run the risk of applying the wrong measure to the wrong method (e.g., you get an inaccurate or incomplete print or ebook or maybe a Tweet that took a week to get peer-reviewed).
For the record – I love print, but it is only one of many outputs and it too needs the proper quality measures applied to it. I also completely agree with you that we need to listen to the customer, and for many, print (whether in marketing or publication) is what they prefer. The issue is that because print was first, all of the definitions and measures of quality emerged from that one (yes, very important) content development stream.
Let print be print – but let other distribution methods also be shaped appropriately!
One bright side of digital publishing is multimedia, which enables publishers to deliver content in voice and moving pictures. This was not possible in the print era. Scholarly publishing has yet to take full advantage of this bright side, but that will change, and then the strengths of digital will become more obvious. The financial industry is a good example where B2B publishing strongly benefits from digital.
But I agree that the need to do frequent technical upgrades takes significant resources and is distracting from focusing on content.
I’m with Moshe on the potential for digital but would add that this potential can be frightening for some. I remember the transition my grad students and I went through when we shifted from using a text-based BBS to Gopher and then to HTTP for an online K-12 teacher outreach program we were running. At first, all we had to work with was 80-column monochrome text. No font choices whatsoever, no images, and certainly no media. Getting the words right was plenty challenging.
Then came images, a wide variety of font options and formatting with tables visible and not. Some were delighted at the prospect of a richer palette while others shrank from it. All who stuck with it had to adapt and learn how to resist the temptations that obfuscate rather than illuminate.
We are still in the throes of that transition.
Great post. No surprise. Appreciate your honesty on this topic. I never saw digital as easier or less expensive or easier to flip back and forth with. I do believe it has a viable and important place but not as the de facto replacement for all literature for all readers in all instances. Software may not always have to go offshore if the software folks had editorial salaries. Imagine how costly it could be if parity went the other way. Economics and usability can’t be overlooked. Data overload is also not useful in administration or marketing. We will see a true viable integration of digital and print in the next 3-5 years.
A lot of the complexity on the editorial side stems from publishers’ attempts to recreate digital replica editions of their print products for the plethora of form factors that exist across all of the phones and tablets being produced. Fortunately the market is solving that problem: there is virtually no market for digital replica editions. Publishers could abandon their efforts in this area with virtually no repercussions.
It is interesting that both the New York Times and Wall Street Journal have created very good (and successful) digital alternatives to their print editions. Personally, I prefer to read both on my tablet rather than on paper. Yes, they still face economic challenges because their model is built on large contributions from advertising revenue and online does not tend to support this model. However, that should not be as much of a challenge for scholarly publishers given the relatively high subscription pricing and exponentially smaller amount of content being handled.
Digital can be cheaper and easier once the infrastructure is in place … but to Ms. Michael’s point above, it requires publishers to let go of their linear thinking and be open to new approaches.
A few expansions here:
The “complexity on the editorial side” is an interesting concept to contemplate in light of other comments (and known factors) around multimedia usage and data integrations. The complexity that stems from these initiatives is substantial, including new forms of peer review, new upload and file management capabilities, new players and display interfaces, and new metrics to capture and absorb. None of this complexity is about “publishers’ attempts to recreate digital replica editions of their print products.” Quite the opposite, in fact.
As to the assertion that “there is virtually no market for digital replica editions,” this is demonstrably false. The organization I just joined has a thriving digital replica edition, with a circulation larger than most journal circulations just by itself. And it is growing. Often, a technology is dismissed on its first time around, but gradually builds a legitimate user base and then takes off unexpectedly. Podcasts have seen a major revival in the past 2-3 years, after being written off by the tech press as dying.
The New York Times and Wall Street Journal stories are complex, and reflect everything said here — what they’re doing is expensive, difficult, complicated, and less profitable. The New York Times’ “Innovation Report” captures many of these threads in prose that ripples with concern, tension, and fatigue.
As for publishers being open to new approaches, they are. It’s our readers who dictate the pace of change, and they still like their PDFs, their formal citations, their useful reference lists, and the speed and silence of reading.
I see a near constant stream of complaints from authors when we publish their articles online ahead of print, citable with a DOI. So many don’t consider it really “published” until they can have an old-style reference complete with volume and page numbers.
Are the complaints about citability?
I remember hearing concerns from authors back when I worked at a digital only journal that relied on a URL and DOI for citation with no volume, issue, or page numbers (we actually gave the journal issue and ‘page’ numbers in response to these concerns). The problem was that most journal instructions for authors didn’t allow for URLs or DOIs, and editors would strip them out, making articles unfindable. Authors were rightly concerned that their citations wouldn’t count.
Michael Clarke noted in a presentation at Charleston this month that platform and innovation costs were only around 2%-3% of a publisher’s overall costs. I was somewhat surprised to hear this, and perhaps Mike can add more context here. I agree that the online publishing world is getting more complex. However, working with a number of technology start-ups, I have to say that the returns and value these new technologies can bring are immense, and the actual technical implementation is not that complex compared to what it was 5-10 years ago. We have got smarter. I’ve heard the saying ‘only needs 1 line of code’ many times now, and have actually witnessed it, compared to how long it used to take to change systems and technology in the old days. That’s not to say it doesn’t take a lot of brain work and head scratching to look into the future and plan out how your online publishing programs will look in 2 years or 5 years; I’m not sure we can comfortably look much further out than this. It’s certainly an ever-changing landscape to some degree, and there is a new generation of techie whizz-kid entrepreneurs out there who can do amazing things. With direction and vision, I do believe we have an exciting period ahead of us. My two cents.
The 2%-3% number I mentioned in Charleston is the average professional publishers spend on their platform as a percentage of revenues. So a publisher with $50M in revenues is likely to spend in the vicinity of $1M on their platform. Obviously there are going to be many exceptions — this is just an average.
By “platform” I just mean the publisher’s distribution website, where users interact with the publisher’s content on the interwebs. This would include the user interface layer as well as the supporting infrastructure: the servers that house the content, the transforms that render XML into HTML, the data tables that capture analytics, the logic and customer data associated with access control, and so on. So this is a rough swag at the amount that publishers either pay their platform hosting provider (e.g. Silverchair, HighWire, Atypon, Publishing Technology, Semantico, etcetera) or that they spend in-house on their own proprietary systems.
Not included in this number are all the other systems and processes that publishers employ to produce digital content. So, NOT included are the costs of XML mark-up, DOI deposit, subscription fulfillment systems, content management systems, peer review management systems, digital ad servers, and so on.
So the 2%-3% figure is not the total cost of digital publishing, just the platform layer.
Salaries and internal systems are major components of technology costs. Just building and maintaining a system that manages video submissions is a major undertaking, for example. In addition, hiring people with video editing skills is non-trivial.
Yup. My number does not include something like a video submission system nor does it include any salaries associated with staff managing digital production, vendors, special projects, etc.
My point in Charleston was that the visible layer of a publisher’s site that users interact with is but the tip of the iceberg in terms of what goes into producing publications (the slide I put this number on had a rendering of an iceberg to accentuate that point). Adrian took a picture of it:
As an aside, probably 80% of requests for books in our academic research library … are specifically for print books because of the ease of being able to open several at the same time in front of you while studying.
I am behindhand in reading this correspondence and the excellent post that preceded it, but I hope I may add two points:
1. My main employment is with CIBER Research, which started in 2002. One of the early conclusions we came to is that researchers are part of normal life. They look up things on the web like other people. Every year they require a higher level of functionality and more speed in getting to what they are looking for. Every year publishers have to spend money keeping up to date.
2. I claim to have been among the first to put what were then print journals online. We decided on PDF full text and SGML (as it was then) metadata. For the DTD we chose Elsevier 1.0. Within a year this was out of date. Elsevier were on 3.0 or something like that. We felt we had to keep up. We had to instruct our suppliers, who had problems with DTDs in any case. It goes on and on.
Millions of ordinary people happily spend hours and hours reading mostly fiction on their iPads, Kindles, Nooks, PCs or iPhones. Why can’t the academic community pool its intelligence to overcome obsolete print and PDF? After all, we gave up the scroll long ago.
I wouldn’t consider a format (the typeset page with integrated figures) which has not only survived the transition from print to online but actually flourished during the transition to be “obsolete.”
I do not know when we gave up the scroll but I bet it took more than a couple of decades.
Giving up the scroll format in favor of codex format so long ago might exactly be the reason why we’re slow to overcome PDF articles. Navigating an HTML article today is very similar to navigating an ancient scroll: you get a long piece of text on a continuous sheet of (digital) paper, and you quite literally scroll through it to find the information you seek.
Sure, you have markup and other navigational aids to help you find and take in that information more efficiently, but reading that way is essentially counterintuitive to the way we’ve been consuming all sorts of literature for the past 1,500 years. Having that in mind, I really don’t see print and PDF becoming obsolete any time soon.
Counterintuitive?! I doubt the younger generation sees digital features as counterintuitive.
To the contrary, sticking with print is counterintuitive. Print wastes resources from the layout artist to the recycling bin. PDF, excused as a print format, doesn’t even fit the digital screen so you are forced to print it.
Why print an article when it is more easily read in HTML and filed online? What is counterintuitive is building more library buildings when digital storage is so much more effective and efficient.
Conspicuous investment in print serves only egos of authors and paychecks of file clerks. Beyond saving trees, labor, and overheads, digital links, bookmarks, and “find” features mitigate the scrolling of HTML formats. Moreover, many computers and index services allow you to search for a specific text across many files. Try and do that with your sacred vertical files.
But we didn’t give up the scroll. Check out that skinny column on the right side of your browser window that you use for navigating up and down this page.
Bear in mind that the PDF has a major advantage over typeset HTML that many readers find compelling; they are able to download it to their local hard drive and keep a copy. In terms of organizing a personal library of articles, you can’t download HTML and store it in folders or add it to your citation manager.
We seem to talk a lot in publishing about the reading experience or PDF vs HTML, but perhaps the workflow aspects are just as, if not even more, important.
I download HTML pages to my hard drive all the time.
It’s probably going to take longer because the beauty of PDF is the ability to print out pages and read them like a book or journal article.