Authority, Authors, Experimentation, Peer Review, Reading, Social Media, Technology, Tools, World of Tomorrow

Will Editing Mix Machines With Humans? Dan Cohen Ponders the Future of Publishing

Fenwick Library (Photo credit: Wikipedia)

During the opening plenary of the SSP Annual Meeting Wednesday, Dan Cohen provided an interesting perspective on what the world of scholarly publishing might look like if it were a “digital native”: a vision of new modes of scholarly communication based on social media and alternative metrics, along with some examples of how scholars may navigate the onslaught of digitally distributed content.

Cohen, who is an Associate Professor in the Department of History and Art History at George Mason University and the Director of the Center for History and New Media (CHNM), and the rest of the team at CHNM have been working actively to put some of these new technologies for scholarly information distribution into place. The Center has been a tremendous leader in digital information distribution and has supported a variety of projects, from Zotero to ScholarPress and from PressForward to THATCamp. During his presentation, Cohen focused a great deal of time and attention on the PressForward service and some of the new publications that CHNM has produced using it.

Using PressForward, Cohen and Joan Fragaszy Troyano are now editing two new publications: Digital Humanities Now and the Journal of Digital Humanities. Digital Humanities Now began as a curation tool to extract humanities information from Twitter. It now covers a variety of blog, social media, and repository content that is openly available on the web. The Journal of Digital Humanities takes this curation process a step further: it consists of the best content exposed by Digital Humanities Now.

In some ways, these two publications are quite novel, automatically culling from thousands of daily posts, organizing and segmenting their respective content, and then algorithmically ranking the mass of content. As mega journals such as PLoS ONE, Nature’s Scientific Reports, and the few others that exist collect an ever wider range of diverse scholarly content, these overlay journals will become increasingly useful. Digital Humanities Now and the Journal of Digital Humanities are examples of how to bring this content together and what these overlay journals might look like.
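The mechanics of such an overlay publication can be imagined as a simple two-stage pipeline: gather items from open feeds, then rank them by community signals before human editors make the final cut. The sketch below is purely illustrative — the field names, weights, and scoring formula are invented for the example, not how PressForward actually works:

```python
from dataclasses import dataclass

@dataclass
class Post:
    """A candidate item pulled from an open blog or repository feed."""
    title: str
    tweets: int    # times shared on Twitter
    comments: int  # reader comments on the original post

def score(post: Post) -> float:
    # Hypothetical weighting of community attention signals.
    return 2.0 * post.tweets + 3.0 * post.comments

def shortlist(posts: list[Post], top_n: int = 2) -> list[Post]:
    """Algorithmic step: surface the most-discussed items for human review."""
    return sorted(posts, key=score, reverse=True)[:top_n]

posts = [
    Post("Topic modeling for historians", tweets=40, comments=5),
    Post("A new Zotero plugin", tweets=12, comments=1),
    Post("Crowdsourcing transcription", tweets=25, comments=9),
]
for p in shortlist(posts):
    print(p.title)
```

The key point is that the algorithm only narrows the field; in a hybrid model like this, editors still review the shortlist before anything is published.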

But despite its novelty, the Journal of Digital Humanities is essentially doing what traditional journals and editors have always done — gather, review, and validate content — but based on a curation model rather than a submission-based one. Traditional journals had to build sufficient reputation and distribution to attract submissions from authors; a publication lucky enough to have a reputation for quality could be selective, even especially selective, about the content it published. The new journals instead curate and push out content that already exists on the open web.

There is an internal bias in these two publications in that they rely solely on open access content, for a combination of business, practical, and legal reasons; reuse of traditionally published content in this fashion is generally prohibited by copyright. As such, the Journal of Digital Humanities presents a skewed sample of the work being done in the humanities, but that needn’t disqualify its relevance. It is an open question whether these virtual publications will encourage other publications to move toward greater reuse allowance, or whether subject-area repositories will develop from which copyrighted material can be culled within a single publisher’s collection.

There is a parallel between these algorithmically generated journals and another trend that has emerged in publishing over the past year: the automatic generation of stories from structured data by companies such as Narrative Science and Automated Insights. These services are nothing like Lorem Ipsum dummy text generators, which simply spew characters; they create narrative news stories from structured data without human drafting or editing. You might expect such services to be used only for populating advertising spam sites, but this is not the case. Many reputable news publications are using them to add content (while simultaneously cutting their writing staffs, we should note). Wired magazine featured an article on this last month entitled “Can an Algorithm Write a Better News Story Than a Human Reporter?” I met Kristian Hammond at the Tools of Change conference earlier this year, and the service is impressive. While I am not certain I agree with his belief that an auto-generated story will win a reporting award within five years, there is certainly a place in our world for this auto-generated content. Similarly, there is a place for the auto-summarized journal.
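The basic idea behind these services can be shown with a toy example: structured data goes in, a narrative sentence comes out, with no human drafting the prose. Narrative Science’s actual system is far more sophisticated than template filling; the function, field names, and phrasing rules below are invented for illustration only:

```python
def game_story(data: dict) -> str:
    """Turn a structured box score into a one-sentence news lede."""
    margin = data["winner_score"] - data["loser_score"]
    # A crude stand-in for data-driven phrasing choices.
    verb = "edged" if margin <= 3 else "beat"
    return (f"{data['winner']} {verb} {data['loser']} "
            f"{data['winner_score']}-{data['loser_score']} on {data['date']}.")

box_score = {"winner": "Orioles", "loser": "Yankees",
             "winner_score": 5, "loser_score": 4, "date": "May 31"}
print(game_story(box_score))
# → Orioles edged Yankees 5-4 on May 31.
```

Once the templates exist, every new box score yields a new story at machine speed, which is exactly the speed-and-scale advantage discussed below.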

Hopefully, we can all agree that machines are poor surrogates for humans when it comes to curating and selecting content. But they do have strengths in speed and scale, which matter for matching the pace of content creation. Assessing quality will, for the foreseeable future, remain the domain of editorial experts and peer reviewers. In many ways the quality of a journal will depend on the quality of its filtering mechanisms. Of course, this has always been true; it has simply taken decades for the highest quality titles to rise to the top. In our digital environment, it will probably happen much more quickly. How these new forms of communication are accepted by the scholarly community, authors, and the administrators who manage promotion and tenure committees is another important question that will take several years to work out.


About Todd A Carpenter

Todd is the Executive Director of the National Information Standards Organization (NISO). He is focused on facilitating information exchange via standards, technology, and business best practices within the US and internationally.


9 thoughts on “Will Editing Mix Machines With Humans? Dan Cohen Ponders the Future of Publishing”

  1. Interesting article, thanks Todd. With the majority of people considering themselves content creators now and with an over-abundance of information, the machinery and editorial processes of STM journals have never been so critical. To loosen the filter, as some new approaches appear to advocate, would be of no useful service to the professional community in my view.

    Posted by Andrew Miller | Jun 1, 2012, 6:09 am
    • My first comment, excellent!!!
      I certainly am not advocating the removal of filters, and I agree that the filtering that scholarly journals (even beyond STM) provide is ever more critical. What is interesting is that new types of filtering are starting to open up. Over time these automated filters will improve. It is unlikely they will be as nuanced or of as high quality as human filters, but there are positives and negatives to each approach (human versus machine). Automated filters needn’t be looser than human filters; they are just different. How the two interplay moving forward will be fascinating.

      Posted by toddacarpenter | Jun 1, 2012, 7:25 am
  2. Thanks for the thoughtful response to my talk, but this makes it seem like PressForward publications are almost entirely algorithmic. (Especially in your rather odd and inappropriate parallel to fully automated journalism.) I made it clear several times in my talk (indeed it was the main thesis of the talk) that we need the best of algorithmic and more traditional human editorial methods of selection. PressForward publications are hybrids of those two methods. The algorithms and associated technology (like RSS) help us find content of interest to a community of scholars, but humans have to check to make sure that content is of high quality before disseminating it.

    Readers can get a better sense of how, for instance, the Journal of Digital Humanities was put together in our editors’ introduction to the first issue. I would agree that we are trying to do something traditional as well, which is to do what print journals have done: provide new, important scholarship to people with limited attention. It’s different from completely personalized “publications” like Flipboard (which indeed are run by algorithms).

    As to the question of “internal bias” toward open access, if the Scholarly Kitchen will plead guilty to its internal bias against open access, I shall plead guilty as well.😉

    Posted by Dan Cohen (@dancohen) | Jun 1, 2012, 10:01 am
    • I don’t think Scholarly Kitchen has an internal bias against OA. As part of SK I just have a bias in favor of business models that work. I am even thinking of starting an author pays journal. Being skeptical of Utopian schemes is not a bias.

      Posted by David Wojick | Jun 1, 2012, 10:43 am
    • I work for a publisher with a strong history as a leader in open access publishing. I think it’s the ideal way scholarship should be done. I do think there are issues in the practicality of the proposed system and its implementation that should be openly discussed, criticized and improved upon though, and many unfortunately see any form of analysis as an attack.

      As always, the Scholarly Kitchen is a diverse group of authors bundled together under one umbrella, and while we each have our own biases, I’m not sure the group as a whole can agree on anything.

      Posted by David Crotty | Jun 1, 2012, 11:41 am
  3. Todd, you seem to be using the term “curation” in a new way to me. You say “… the Journal of Digital Humanities is essentially doing what traditional journals and editors had done — gather, review, and validate content — but based on a curation model rather than one that is submission-based.”

    What is a curation model? Do you mean search and collection? I think of curation as managing a digital archive. Is curation a new buzz word I need to know? I do algorithms.

    Posted by David Wojick | Jun 1, 2012, 10:58 am
    • Todd is using curation in the sense that seems to be winning out on the web, wherein curation means selection and organization of digital content. For some examples of this usage see the discussion of the differences between digital curation and digital preservation.

      Posted by tjowens (@tjowens) | Jun 1, 2012, 11:31 am
      • I like to say that confusion is the price of progress, so this is very interesting. I first encountered the term curation when I did staff work for the US Interagency Working Group on Digital Data (IWGDD). The focus was preservation, not selection, and I thought at the time that selection was the bigger issue. But collecting twitter messages strikes me as an odd use of the term curation. On the other hand, web search is called discovery, which I find especially distasteful in the context of science, where discovery should mean discovery. Looking for language is the mark of revolution.

        See also which is like a twitter collection. Dare we call it zettel?

        Posted by David Wojick | Jun 1, 2012, 1:41 pm



The mission of the Society for Scholarly Publishing (SSP) is "[t]o advance scholarly publishing and communication, and the professional development of its members through education, collaboration, and networking." SSP established The Scholarly Kitchen blog in February 2008 to keep SSP members and interested parties aware of new developments in publishing.
The Scholarly Kitchen is a moderated and independent blog. Opinions on The Scholarly Kitchen are those of the authors. They are not necessarily those held by the Society for Scholarly Publishing nor by their respective employers.