Last week, during the PSP plenary debate that touched on the future of the reference book, the opposition made two statements as if they were unassailable facts:

  1. Getting authors to write things is expensive and requires a lot of motivation that only the prestige and importance of the current system can generate
  2. Quality reference information can’t be generated via crowdsourcing

These are not unusual arguments in support of traditional publishing approaches, and by raising them, I’m not revisiting the debate. The debate merely provides a good recent example. I bring it up because recently, a study was published that throws both of these common assertions into some doubt. And by examining the study and extending the logic of its findings, I think it also hints that perhaps collaboration in scholarly publishing is more important generally than we might appreciate, something that has implications for those proposing new forms of peer-review, like the always provocative Vitek Tracz.

The study is quite clever. The researchers posited that Amazon’s Mechanical Turk could be used to test collaborative writing of reference entries. The authors partitioned tasks so that no participant had to do too much work. Various participants were asked to create an outline, gather facts, assemble the facts into coherent paragraphs, do final editing, format the entry, and check for grammatical errors. In one case, they compared an entry about New York City written by low-paid “turkers” on Mechanical Turk to an entry on Wikipedia as well as to an entry written by a single author. The entries were evaluated by 15 readers who used five-point Likert scales to rate things like accuracy, trust, and preference. Across the board, the entries emanating from collaborative environments were longer and scored better, with the Mechanical Turk articles scoring the best of the three. In fact, group articles had many advantages:

Not only was the average quality of the group articles higher than the individually written ones, but . . . the variability was lower as well (t(11)=-2.43, p=.03), with a lower proportion of poor articles.

Also interesting is that the articles cost an average of $3.26 to produce, and required an average of 36 subtasks, each performed by an individual worker. This equates to each worker getting less than $0.10 on average for the work done.
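The per-worker figure is simple division, and it can be checked in a few lines. This is only a sketch of the arithmetic, using the two averages quoted from the study. It assumes the full cost of an article went to workers and was split evenly across subtasks, which the study may not have done; the function name is hypothetical.

```python
# Averages quoted from the study.
AVG_COST_PER_ARTICLE_USD = 3.26
AVG_SUBTASKS_PER_ARTICLE = 36


def average_pay_per_subtask(total_cost: float, subtasks: int) -> float:
    """Mean payment per subtask, assuming the cost is split evenly (an assumption)."""
    if subtasks <= 0:
        raise ValueError("subtasks must be positive")
    return total_cost / subtasks


pay = average_pay_per_subtask(AVG_COST_PER_ARTICLE_USD, AVG_SUBTASKS_PER_ARTICLE)
print(f"Average pay per subtask: ${pay:.3f}")
```

Under those assumptions the result comes out at roughly nine cents per subtask, consistent with the "less than $0.10" figure.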

Apparently, there are ways to get work done that don’t require the expensive, centralized resources of a major publisher.

The study leaves many questions untouched, such as whether more technical or cutting-edge information could be tackled in a similar manner. But it’s an enticing possibility, especially given the fact that many people on Mechanical Turk are seeking to supplement meager wages yet likely possess fairly high levels of education and technical capacity. In fact, its demographic is very reflective of the general population.

Of course, it’s probably not lost on anyone that the way the researchers broke down the tasks of assembling a reference work is analogous to how jobs are assigned in news, science, and research publications. This isn’t “crowdsourcing” in the random sense, but organized crowdsourcing. You have someone who outlines the area or story, reporters or researchers or interns who gather facts, an editor, author, or writer who drafts a coherent version, another editor who improves it, and a copy editor to give it final polish.

For scholarly publishers, it shouldn’t be surprising that a highly dispersed network of workers can do this. We manage a network like this all the time, with authors collaborating across multiple institutions to generate a final report, editors and reviewers collaborating from remote locations, editorial staff often dispersed across organizations or locales, and copy editing and composition done more and more by third parties, yet our output is quite good and very uniform.

It’s fashionable in some quarters now to say that traditional editorial work (peer-review, in-house editing to high standards, multiple iterations with authors to get things sorted out) is an outdated and over-engineered filtering approach, something we don’t need because it just gets in the way of getting all the science reports out for all to see. But those nagging concepts of quality, bias, scarce time, and novelty haunt us still. This study provides an interesting twist that critics of traditional review should ponder.

Models of post-publication review like those advocated by Vitek Tracz and his “Faculty of 1000” might be missing a crucial element of pre-publication review, namely the improvements garnered through structured collaboration. In most high-quality journals, review is only partially about gate-keeping; it’s also, and in many cases mostly, about an exchange with authors structured to generate the most clarity, relevant data, concision, and precision of expression. And that takes many people working together in many roles.

Perhaps the success of traditional peer-review is only partially about rigor or a “high bar,” with a good portion of the value emanating from a tight collaborative process for those papers deemed of sufficient interest to undergo collaborative editing.

Collaboration between anonymous but well-directed players has been demonstrated again and again, including an interesting experiment published last year in which 57,000 gamers beat a computer at folding complex proteins. This new study about how reference works can be generated suggests that collaboration itself is vital to higher quality outputs, but also suggests that there are new, less expensive, and effective ways to collaborate. It leaves unanswered the question of whether we should collaborate before or after publication, which is not a mutually exclusive choice. But it does make me think a little differently about the strength of pre-publication peer-review.

Maybe one strength we’ve been overlooking is that traditional peer-review forces smart people to collaborate. Post-publication review doesn’t seem positioned to harness this advantage.

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


2 Thoughts on "Crowdsourcing, Reference Works, and Peer-Review: Some Surprising Connections"

It’s interesting that you bring this up in terms of the Faculty of 1000, which is a paywall subscription product, run for profit by a professional staff of editors. Compare their success and regular rate of new material to the truly crowdsourced alternative, The Third Reviewer, which hasn’t had a new posting for over 3 months, and is overwhelmingly populated with anonymous posts. It seems that even crowdsourcing works better with editorial oversight and a staff dedicated to the project.

And I have to say that I was hoping the phrase “Facebook of Science” had long ago been put to rest.

Thanks for publicizing this study. I’m intrigued by the researchers’ creative approach to solving a common writing task. And of course they score lots of “cool” points for using MTurk. Ultimately, though, what the researchers did not include makes me question their conclusions.

The MTurk cost per article is stunningly low, but the researchers do not discuss the impact on the editor or project manager. I suspect the MTurk approach requires significantly more time than the traditional process of an experienced, well-connected editor calling a trusted author to commission an article.

The researchers’ definition of “quality” comes up short. Grammar and the somewhat vague “use of facts” just don’t cut it. There is no mention of how relevant or unique the article is. Most importantly, there is no indication that the researchers checked for plagiarism. A situation in which unknown workers, likely unfamiliar with publishing, are paid $.05 for a task does not, on the surface, inspire trust. At least not the level of trust a commercial publisher requires to consistently publish quality products (and avoid lawsuits).

Finally, it would have been useful if the researchers published the actual articles written by their MTurk authors. It’s hard to verify the proof when there’s no pudding.
