The digital era has increasingly led to distributed networks and a move away from centralized locations for both people and data. As our ability to communicate, participate, and share has increased, our ecosystem has changed.

This month we asked the Chefs: How has the move to distributed networks impacted scholarly publishing?

Joe Esposito: Rather than think about this in the present perfect tense, I would rather project into the future, where things get very different and probably uncomfortable for many readers of this blog. One of the structural advantages of the kind of publishing that PLOS ONE and its countless imitators do is that it is aligned with the underlying network upon which we all operate. Many of the things that don’t work very well today (e.g., post-publication peer review) have the advantage of swimming with the current of technology. Networks make expression more conversation-like, less fixed. We are at the very early beginning of a world without versions of record, fixed DOIs, and clear precedence. All this is anathema to established publishers, many of which retain me to tell them that they are doomed. It’s a great way to make a living. The distributed nature of networks undermines broad authority structures (e.g., brands, conventional practices such as peer review) and replaces them with a fitful pluralism. It is useful (not correct–what would that be?–but useful) to think of post-modernism not as a cultural paradigm but as a technological prediction.

Phill Jones: The move to cloud computing for productivity apps has had a direct effect on me personally by enabling me to have a job with a London-based company while living in Edinburgh. In fact, the team in which I work is both globally distributed and closely knit, thanks to technologies that let us send messages instantly, meet face to face, and collaborate on documents in the cloud. It’s amazing to me that I can do my part to contribute from anywhere in the world, so long as I have a device and occasional internet access. (Incidentally, I’m writing this on my phone while on a delayed plane, sitting on the tarmac in Newark.)

In terms of the products and services we offer, frankly, we haven’t begun to scratch the surface of what could be done. The question is: ‘How will it impact scholarly publishing?’

Rick Anderson: This is obviously a question that could be answered in any number of equally valid ways, but the facet of this move that I personally find most interesting is the way in which it has turned publishers from widget-producers into service providers. Now, to be clear, publishers have always provided services to authors, but before the 1990s they mostly provided widgets (printed books or journal issues) to readers. They provided services to authors in return for publishing rights, and provided widgets to readers in return for money. Today, of course, it’s nowhere near that simple: in addition to providing authors the usual services under the usual terms, a growing number of publishers offer authors the option of paying to make their work available to the world for free — and some publishers, of course, operate exclusively on that model. From the reader’s side, it’s decreasingly common that they simply pay money for a physical object into which information is encoded. Instead, readers increasingly enter into service agreements with publishers whereby they are given access to hosted content and the publisher assumes the responsibility for ensuring ongoing access to that content. This shift — from the buying and selling of objects to the buying and selling of access rights — has had ramifications that have yet to be fully understood, and I’m not even sure that all of the ramifications have emerged yet.

Charlie Rapple: The move to distributed networks could be seen as one of the contributing factors to the healthy level of innovation in the scholarly publishing ecosystem. It has created a cultural and technological environment in which barriers to entry have been lowered and small organizations can thrive as connected nodes in (effectively) a distributed network. So start-ups can focus on building competence in a specialist function, without having to replicate related processes or data, because these can be bolted on from other providers — from those that are well-established (such as CrossRef) to other start-ups (e.g. partnerships between Kudos and TrendMD, or Altmetric and Mendeley). 

Another trend that could be considered a distributed network is the rise in citizen science, and projects like Zooniverse, which harness the power of hundreds of thousands of volunteers to analyze and annotate research data such as images of galaxies or videos of animals in their natural habitats. Such initiatives have their roots in distributed computing projects such as SETI@home, which encouraged people to use the idle power of their home computers to help in the search for extra-terrestrial intelligence. Zooniverse and other such “wisdom of crowds” projects show that, while in many contexts we no longer need to distribute computing power, our remaining need is for distributed people power. The impacts of this on scholarly publishing include the opening up of our value proposition to non-researchers — with Zooniverse’s “citizen scientists” often key to discoveries that are then published, there are implications for the language and formats in which research is communicated, the processes and business models by which it is made available, and the accessibility of all of these to people not steeped in scholarly publishing.

Judy Luther: Distributed networks allow discovery to take place on a different platform than delivery. In contrast, aggregated collections such as JSTOR and ScienceDirect are sufficiently large to serve as destination sites where users find and access content subscribed to by their institutions. The value of the latter is that the user has a seamless experience in connecting to the content. However, with Google as the dominant search engine, the reality today is that the academic user will link to content on a variety of platforms. This approach depends upon a reliable knowledge base, and that can be a point of failure. Issues around this vulnerability have prompted Google to request access to the publishers’ subscription records, and publishers are often reluctant to share this proprietary data with the information giant. Even with open access content, the link between the point of discovery and delivery must be reliable.

One of the benefits of distributed networks is that they offer a single point of access where metrics can provide a more complete picture of usage data for authors, editors, and publishers. Some large publishers and platforms provide an enhanced experience for users that is not possible to control on other platforms. Increasingly, the variety of content, including media and data, requires more sophisticated support that cannot be replicated cost-effectively. The opportunity to ‘publish once, view many times’ is one of the primary advantages of content on the Internet. Networks are designed to connect users and content; leveraging that capability, with the necessary attention to reliable links, offers broader access to more content.

David Crotty: One area where this change has had a direct impact is the growing question of the value of investing in big centralized repositories of information. We live in an increasingly distributed world and our search tools continue to improve their capabilities for ferreting out that information regardless of its location. So why should we bother to collect things in large and expensive baskets?

This question is playing out in the various responses seen to the US government’s policy on public access to research papers. The Department of Energy has chosen a forward-looking distributed approach with their PAGES service, which collects articles where necessary but focuses instead on metadata and providing pointers to content hosted elsewhere. Contrast this with the NIH’s PubMed Central, a centralized repository where all material is collected and stored in one place.
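
To make that architectural contrast concrete, here is a minimal sketch in Python (the classes and field names are purely illustrative assumptions, not the actual PAGES or PMC data models) of the difference between a record that merely points to content hosted elsewhere and a record that stores its own copy:

```python
# Hypothetical sketch of the two public-access architectures described above.
# Classes and field names are illustrative, not real PAGES or PMC schemas.
from dataclasses import dataclass


@dataclass
class LinkedAccessRecord:
    """Distributed, PAGES-style model: metadata plus a pointer."""
    doi: str
    title: str
    publisher_url: str  # the content stays on the publisher's platform

    def resolve(self) -> str:
        # Access depends on the remote copy remaining available.
        return self.publisher_url


@dataclass
class RepositoryRecord:
    """Centralized, PMC-style model: metadata plus a locally stored copy."""
    doi: str
    title: str
    full_text: str  # the repository keeps and serves its own copy

    def resolve(self) -> str:
        # Access is served locally, regardless of the publisher's site.
        return self.full_text


if __name__ == "__main__":
    linked = LinkedAccessRecord("10.1000/example", "An example article",
                                "https://publisher.example.org/articles/1")
    stored = RepositoryRecord("10.1000/example", "An example article",
                              "Full text of the example article ...")
    print(linked.resolve())  # a URL that breaks if the publisher removes it
    print(stored.resolve())  # the locally held copy
```

The trade-off, in code as in practice, is the cost of duplication versus dependence on the continued availability of the original.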

Is it still worth spending millions of dollars every year to keep a copy of everything in one place or is that an obsolete approach? The effectiveness (and particularly the cost-effectiveness) of these methodologies will be under great scrutiny over the coming years.

Ann Michael: As Phill’s response demonstrates, many employers are now able to leverage a much larger pool of job candidates. The geographic location of those candidates is often flexible because of increasingly effective tools and technologies for communication. What I find exceptionally interesting is that it was a bumpy ride at first: people and cultures (ingrained expectations) needed to “evolve” to take advantage of these tools. And while I am a firm believer that there is no substitute for face-to-face interaction, in many roles there is an opportunity to balance in-person interaction with a well-performing virtual team. In our case, our core team ranges from Boston to Florida to San Diego to Portland (with one Canadian!), and it works because of established practices, distributed networks, and an abundance of communication tools. God bless the Internet!

As the Chefs illustrate above, distributed networks have impacted our infrastructure, our work environments (remote or in-office), our ability to innovate, the discoverability of content, the data network that supports content, and the very nature of the product and services offered in our ecosystem (access versus physical goods). There is also pretty general agreement that in some places we’ve only just started to see the impact (products and services, as an example).

Now it’s your turn: How has the move to distributed networks impacted scholarly publishing? What have you seen? What do you expect to see?

Ann Michael

Ann Michael is Chief Transformation Officer at AIP Publishing, leading the Data & Analytics, Product Innovation, Strategic Alignment Office, and Product Development and Operations teams. She also serves as Board Chair of Delta Think, a consultancy focused on strategy and innovation in scholarly communications. Throughout her career she has gained broad exposure to society and commercial scholarly publishers, librarians and library consortia, funders, and researchers. As an ardent believer in data informed decision-making, Ann was instrumental in the 2017 launch of the Delta Think Open Access Data & Analytics Tool, which tracks and assesses the impact of open access uptake and policies on the scholarly communications ecosystem. Additionally, Ann has served as Chief Digital Officer at PLOS, charged with driving execution and operations as well as their overall digital and supporting data strategy.

Discussion

25 Thoughts on "Ask The Chefs: How Has The Move To Distributed Networks Impacted Scholarly Publishing?"

If memory serves, the web is now some 40 years old and has been integrated into the vast majority of STEM companies. Journals use it for review; book publishers use it to move a manuscript from someone’s house to the company, which then moves it to wherever in the world it is copyedited and produced – ditto journals – and all of this happens via e-files. As for the individual working in the industry, one can now work from home or the beach. Of course, some companies want their employees on the spot, but that proclivity does not effect or affect those that don’t!

My question regards the failures of the cloud, mist, or whatever one wants to call it. That question is very simple: how can it assure me that I will sell more than 300 copies of a STEM or Humanities/Social Science title?

So I guess, what is the question being asked?

While it’s true that the Internet is 40 years old, what we now call the World-Wide Web is much younger–it came into existence after the creation of a graphical interface for the Internet, and that happened only about 20 years ago. This matters very much, because it was only with the emergence of the Web that online commerce (including scholarly communication) really became possible at scale.

True enough, but what makes the Web work is HTTP, the hypertext transfer protocol. The graphical browser is important but not essential. It is all about linking, as that is what creates the network. The mouse is an extra benefit.
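
As a rough illustration of that point, here is a minimal sketch (Python standard library only; the URL is just a placeholder) that fetches a document over HTTP and lists the hyperlinks it contains. No graphical browser is involved; the links themselves are what knit documents into a network:

```python
# Minimal sketch: retrieve a page over HTTP and extract its hyperlinks.
# Standard library only; the URL below is a placeholder, not a real endpoint.
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags, i.e., the edges of the web's graph."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


if __name__ == "__main__":
    url = "https://example.com/"  # placeholder; any HTML page will do
    html = urlopen(url).read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    for link in collector.links:
        print(link)
```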

Berners-Lee’s WWW protocol was developed in late 1990/early 1991. Mosaic, the first browser with a graphical interface, came out in ’93. The Mosaic team then launched Netscape in November 1994 — which I would argue is the true “Gutenberg moment” of the internet, when a number of existing technologies were finally combined in an innovative fashion that radically transformed communication.

Interesting post; thanks. I have two responses: one is that while “cloud” is a good metaphor (ethereal, floating, misty, etc), it’s not a good description of rooms filled with machines. The other is that publishers are not alone in anathematizing “a world without versions of record, fixed DOIs, and clear precedence.” As a librarian, a researcher, and a general reader, I need to find, use, and document reliable records of human knowledge. I want books, not some uncertain access to floating electrons!

Technically, cloud computing refers to a specific commercial service that rents unused time on a myriad of computers. If the web is now becoming the cloud, or perhaps the information on the web, that is indeed just a metaphor and not a particularly useful one in my opinion.

Charlie’s comments about “citizen science” reminded me of the great quote from Sydney Brenner about how sequencing a genome was such incredibly tedious work that rather than having scientists do it, we should force prisoners to do it as a form of punishment (http://www.sciencemag.org/content/291/5507/1182.1/F2.expansion).

It never fails to amaze that in the digital age there seems an endless supply of volunteers with time on their hands for the sort of tedious grunt work that you likely couldn’t pay someone to do.

It is a rather ironic observation given the forum (a blog with volunteer editors/contributors) and the industry that depends upon volunteer peer-reviewers! 🙂

I might counter that writing for this blog, as well as performing peer review, while demanding, are both activities that are far more engaging than sifting through a humongous pile of data and making simple yes/no determinations. Reading the blog, however, is entirely voluntary and perhaps inexplicable.

Ha! A nice quote, thanks. I too am fascinated by the ways that people choose to spend their time. I guess a change is as good as a rest. At the Sydney Conference in July we spent some time imagining life in 2472. We thought that no-one would need to work any more, because machines would be able to do everything (efficiently), so humans would spend their lives focussed on hobby projects (learning old skills just for the satisfaction), and debate, analysis, etc – things where the nuances of the human brain had yet to be replicated. Ultimately, doing (and enjoying) the grunt work because there’s no other work to do.

I remember hearing the person behind Zooniverse give an example of what motivated one of their participants, something along the lines of, “well, I finished all the gardening and there was nothing on the telly….”

The challenge for publishers can be summed up in this way: Napster vs. BitTorrent. Napster, as a centralized system, could be sued for infringement successfully. BitTorrent allows for the decentralization of infringement in such a way as to make controlling the use of content virtually impossible. The rise of BitTorrent systems may make the resort to OA more appealing because it takes away a major economic motivation for distributed systems to violate the law.

Another facet of distributed publishing is manifested in the Espresso Book Machine. Whereas traditionally manufacturing took place in a centralized place from which copies were distributed, devices like Espresso allow for manufacturing to take place in multiple places at the same time.

An important difference between both Napster and BitTorrent and scholarly monograph publishing, of course, is that Napster and BitTorrent are dealing in high-demand content. 😉

And another important aspect of the Espresso Book Machine that makes it a friendly solution for publishers and authors is the fact that copyright control and royalty payment are both built into its system.

Regarding David Crotty’s growing question of the value of investing in big centralized repositories of information, this clearly applies to libraries. Perhaps the new role for libraries is to become good nodes in the network. This means (1) collecting and making available to the rest of the network as much local knowledge as possible and (2) making the rest of the network as accessible locally as possible.

Is not the Digital Public Library of America the paradigm of a distributed library network?

No, Sandy, because each of the DPLA http://dp.la/info/ member libraries still has large non-local collections. On a true network model each library only has local content, that which no other library has. There is zero redundancy in collection content. The local repository is the physical library. Mind you I do not recommend this.

Regarding Joe’s statement about “…established publishers, many of which retain me to tell them that they are doomed.” I am available to tell them otherwise.

Great post(s) on a really important issue. A couple of comments:

Re Rick’s “This shift — from the buying and selling of objects to the buying and selling of access rights”: There’s actually another very important dimension on top of that: the shift isn’t to buying and selling of access rights, the shift is to _licensing_ access rights.

And re David’s “So why should we bother to collect things in large and expensive baskets?”: This is why I was so interested many years ago in the forward-looking (in the Humanities and Social Sciences, no less!) Europeana, which provides access to cultural resources from hundreds of cultural institutions throughout the EU. What struck me at the time was that they made no attempt to gather up the stuff; they realized that the solution was being a metadata hub to all the stuff. Brilliant!

In several groups I’m involved in, especially the IDPF (EPUB) and W3C (the Web), there is an active and complicated discussion about defining packaging of publications. We aren’t far off from a time when many publications will not necessarily be “packaged” at all; they will be metadata hubs to stuff. (Not to panic, I’m talking long-term, and note the word “many.”) The reason the IDPF and W3C are having those complicated discussions is that we’d like a publication to be able to be either, or both, of those things. There are college “textbook” platforms (I use the term advisedly) that have already crossed the line from package to hub. To take a cue from Mr. Colbert, they are losing their “thinginess.” 😉

Re Rick’s “This shift — from the buying and selling of objects to the buying and selling of access rights”: There’s actually another very important dimension on top of that: the shift isn’t to buying and selling of access rights, the shift is to _licensing_ access rights.

I think this is a distinction without a difference — in this context, licensing and buying are really the same thing. Licenses are the contracts that set out the terms under which access rights are bought and sold. It’s possible that you’re calling attention to the time-limited nature of a license (whereas a “purchase” usually connotes a permanent transfer), but when you’re talking about the purchase of rights rather than physical objects, what’s bought is almost always something temporary. And it’s possible (though relatively rare) to buy a permanent license anyway.

Understood. Perhaps this is a bigger issue in the book world. There, the difference between “owning” a print book and . . . um . . . “buying” an ebook is often misunderstood. When folks acquire an ebook, they are really licensing it.

My point with regard to this post is that that is another important difference in this distributed network world. You don’t “own stuff” in the same way. (Admittedly I’m thinking about individuals, not libraries.)

David Crotty tells us that PAGES is distributed, while PubMed Central is centralized. In reality, the opposite is true. PMC and Europe PMC form a distributed archive, hosting extra copies in addition to the original. PAGES links to the original, centralized copy. If the original publisher version becomes unavailable, the PAGES link will become useless, while the PMC copies will remain. As an example, try finding the 100+ computer-generated nonsense papers published by IEEE. Had they been deposited in PMC, they would still be available. Had PAGES linked to them, those links would now be useless, because IEEE has deleted the originals.

To be fair to PAGES, part of the arrangements with CHORUS requires replicative archiving with a service such as CLOCKSS or Portico. If the original paper disappears or ceases to be made freely available from the publisher, the archived version is made available. Publishers are contractually obliged to provide this guarantee as part of the CHORUS membership agreement. This does not generally create extra expense as most reputable publishers are already using such archiving services anyway.

The same can’t be said for papers collected and displayed centrally by PAGES itself.

Interesting semantic issue here. Apparently Thomas thinks that a network means everyone having copies of the same stuff, while I think of it as everyone seeing the same stuff, but only one copy is necessary. Of course some redundancy is needed but that can be a dark archive. As for PAGES, my understanding is that it is maintaining its own dark archive, not Clockss or Portico, but that may have changed.

But if PAGES is killed by a future administration, as may happen, the publisher version is still there, backed up by CLOCKSS and/or Portico. It is not the government’s job to back up scholarly communication, not yet anyway. PMC is an eyeball stealing redundancy.

From the NLM Preservation Policy: “In accordance with the terms of the NLM Act and the clearly expressed intent of Congress, the fundamental responsibility of the National Library of Medicine is to preserve permanently the content of books, periodicals, and other library materials pertinent to medicine.” Although many people continue to believe that PMC was developed in order to support the NIH Public Access Policy of 2007, it was actually created several years earlier (2000) in part to extend NLM’s preservation and access mandate to digital information. How effective it has been for that purpose is, of course, a worthy subject for discussion and debate.
