Last week, during the PSP plenary debate that touched on the future of the reference book, the opposition made two statements as if they were unassailable facts:
- Getting authors to write things is expensive and requires a lot of motivation that only the prestige and importance of the current system can generate
- Quality reference information can’t be generated via crowdsourcing
These are common arguments in support of traditional publishing approaches, and I raise them here not to revisit the debate; the debate merely provides a good recent example. I bring them up because a recently published study throws both of these assertions into some doubt. And by examining the study and extending the logic of its findings, I think it also hints that collaboration in scholarly publishing may be more important generally than we appreciate, something with implications for those proposing new forms of peer-review, like the always provocative Vitek Tracz.
The study is quite clever. The researchers posited that Amazon’s Mechanical Turk could be used to test collaborative writing of reference entries. The authors partitioned tasks so that no participant had to do too much work. Various participants were asked to create an outline, gather facts, assemble the facts into coherent paragraphs, do final editing, format the entry, and check for grammatical errors. In one case, they compared an entry about New York City written by low-paid “turkers” on Mechanical Turk to an entry on Wikipedia as well as to an entry written by a single author. The entries were evaluated by 15 readers, who used a five-point Likert scale to rate things like accuracy, trust, and preference. Across the board, the entries emanating from collaborative environments were longer and scored better, with the Mechanical Turk articles scoring the best of the three. In fact, group articles had many advantages:
Not only was the average quality of the group articles higher than the individually written ones, but . . . the variability was lower as well (t(11)=-2.43, p=.03), with a lower proportion of poor articles.
Also interesting is that the articles cost an average of $3.26 to produce and required an average of 36 subtasks, each performed by an individual worker. This equates to each worker getting less than $0.10 on average for the work done.
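For the skeptical, a quick back-of-envelope check of that per-worker figure (the $3.26 and 36-subtask averages come from the study; the script itself is just illustrative arithmetic):

```python
# Back-of-envelope check of the per-worker payment reported in the study.
avg_cost_per_article = 3.26  # average cost per article, in dollars (from the study)
avg_subtasks = 36            # average number of subtasks per article (from the study)

pay_per_worker = avg_cost_per_article / avg_subtasks
print(f"${pay_per_worker:.3f} per subtask")  # roughly $0.091, i.e. less than $0.10
```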
Apparently, there are ways to get work done that don’t require the expensive, centralized resources of a major publisher.
The study leaves many questions untouched, such as whether more technical or cutting-edge information could be tackled in a similar manner. But it’s an enticing possibility, especially given the fact that many people on Mechanical Turk are seeking to supplement meager wages yet are likely to possess fairly high levels of education and technical capacity. In fact, the Mechanical Turk demographic is very reflective of the general population.
Of course, it’s probably not lost on anyone that the way the researchers broke down the tasks of assembling a reference work is analogous to how jobs are assigned in news, science, and research publications. This isn’t “crowdsourcing” in the random sense, but organized crowdsourcing. You have someone who outlines the area or story, reporters or researchers or interns who gather facts, an editor, author, or writer who drafts a coherent version, another editor who improves it, and a copy editor to give it final polish.
For scholarly publishers, it shouldn’t be surprising that a highly dispersed network of workers can do this — we manage a network like this all the time, with authors collaborating across multiple institutions to generate a final report, editors and reviewers collaborating from remote locations, editorial staff often dispersed across organizations or locales, and copy editing and composition done more and more by third-parties — yet our output is quite good and very uniform.
It’s fashionable in some quarters now to say that traditional editorial work — peer-review, in-house editing to high standards, multiple iterations with authors to get things sorted out — is an outdated and over-engineered filtering approach, something we don’t need because it only delays getting all the science out for all to see. But those nagging concepts of quality, bias, scarce time, and novelty haunt us still. This study provides an interesting twist that critics of traditional review should ponder.
Models of post-publication review like those advocated by Vitek Tracz and his “Faculty of 1000” might be missing a crucial element of pre-publication review — namely, the improvements garnered through structured collaboration. In most high-quality journals, review is only partially about gate-keeping — it’s also, and in many cases mostly, about an exchange with authors structured to generate the most clarity, relevant data, concision, and precision of expression. And that takes many people working together in many roles.
Perhaps the success of traditional peer-review is only partially about rigor or a “high bar,” with a good portion of the value emanating from a tight collaborative process for those papers deemed of sufficient interest to undergo collaborative editing.
The power of collaboration among anonymous but well-directed participants has been demonstrated again and again, including in an interesting experiment published last year in which 57,000 gamers beat a computer at folding complex proteins. This new study about how reference works can be generated suggests that collaboration itself is vital to higher-quality outputs, but also that there are new, less expensive, and effective ways to collaborate. It leaves unanswered the question of whether we should collaborate before or after publication (not a mutually exclusive choice). But it does make me think a little differently about the strength of pre-publication peer-review.
Maybe one strength we’ve been overlooking is that traditional peer-review forces smart people to collaborate. Post-publication review doesn’t seem positioned to harness this advantage.