As we move further into a socially networked world, we learn more about both the value and the shortcomings of social technology approaches. From the crowdsourced investigation into the Boston Marathon bombing to restaurant reviews on Yelp, behavioral norms are solidifying, and a growing body of evidence is accumulating for analysis. Framed through the lens of scholarly publishing, a picture emerges of the inherent conflicts between egalitarianism and expertise.
Despite the often overheated rhetoric of his argument, anyone working in scholarly publishing for the last decade will recognize a grain of truth in Evgeny Morozov’s diatribes against what he calls “solutionism”:
. . . an intellectual pathology that recognizes problems as problems based on just one criterion: whether they are “solvable” with a nice and clean technological solution at our disposal. . . . Given Silicon Valley’s digital hammers, all problems start looking like nails, and all solutions like apps.
I’m not sure this approach is really an ingrained belief system so much as a business strategy. But ask any journal publisher, and they’ll likely tell you about the weekly, if not daily, approaches from would-be technology entrepreneurs: here’s my new technology that’s going to radically remake your business, please either buy it, or give me free access to all of your assets and customers so I can make my fortune.
Indeed, there are exciting new technologies on the rise, but it’s often difficult to separate the wheat from the chaff. The notion of “change for the sake of change” can be fraught with peril for the not-for-profit press. Many simply don’t have the funds to make the sorts of constant mistakes a company like Google can afford (or more importantly, they’re part of research institutions and societies that can instead put those funds to more worthy causes, such as funding research and curing disease). As such, it is vital to carefully pick and choose the most promising technological revolutions.
Part of the technology sales pitch for the last several years has been a seemingly unlimited faith in social approaches to nearly every aspect of academic research and publishing. But as Jaron Lanier recently said, “being an absolutist is a certain way to become a failed technologist.” A better understanding of crowdsourcing and other social approaches can help us better target their implementation.
The two common functions offered by crowdsourcing are the distribution of large sets of work, and the democratization of opinion gathering.
Cognitive Surplus In Action
Crowdsourcing brings the epigram “many hands make light work” to life:
Done right, it’s a fantastic use of what Clay Shirky calls “cognitive surplus” – a souped-up phrase for “time when we don’t have something better to do”.
Cognitive surplus has given us Wikipedia, written by thousands of people who might have done nothing more than correct a grammatical error or insert a fact, yet have created a towering resource. It’s given us GalaxyZoo, in which non-expert users have classified hundreds of galaxies so that professional astronomers don’t have to and can figure out how they evolve.
Cognitive surplus, or even surplus computing cycles, can offer value in scientific research, as projects like GalaxyZoo, Folding@home and Seti@home readily demonstrate. But for the most part, this is a brute force approach, churning through the drudgery of large data sets, the sort of work that Sydney Brenner famously quipped should be parceled out to prisoners, with the most work given to the worst offenders.
That’s an important limitation to recognize. These approaches aren’t about creating new insights or making intellectual leaps — they’re about sifting through large amounts of busywork so the real work of scientists can begin.
We received a painful lesson in this limitation in Boston, where well-intentioned but ultimately fruitless crime-solving efforts by web communities like Reddit and 4Chan tried to substitute crowdsourcing for expertise. The groups produced noise rather than signal and ended up going down dark avenues of mob mentality. The case was, perhaps unsurprisingly, solved by experts using specialized tools and approaches rather than by the brute force of the online public.
That’s an apt parallel for scholarly research. Science doesn’t work in an egalitarian fashion. The questions in Boston, much like most scientific questions, required specialized training, experience, and expertise. You can’t crowdsource a Richard Feynman or a Barbara McClintock. We rely on these sorts of brilliant individuals to evaluate and understand what’s truly important and meaningful.
The Voice of the Crowd
The other great benefit of crowdsourcing comes from letting anyone and everyone have a voice. We have quickly come to rely on user reviews for nearly everything, from purchasing a washing machine to finding a good restaurant in a new town (though hopefully you haven’t had much use for this sort of review).
The same questions of expertise versus democratization occur here as well. We rely on experts for things like peer review. Is this paper accurate? Is it meaningful and useful? Study sections made up of experts often determine the distribution of research grant funds. Are these questions that should instead be crowdsourced and put to a popular vote? Should a bottom-up approach replace the top-down system in place?
Many altmetrics approaches measure the popularity of articles rather than their quality: how many Facebook likes did the article gather, rather than whether it drove further science. Kickstarter is being bandied about as a model for science funding. Peer review has become something of a whipping boy, and we regularly hear of plans to scrap it altogether in favor of letting the crowd have its say post-publication.
Are these sorts of approaches really appropriate for science? Science is not about what’s popular or well-liked. It’s about what’s accurate and true. Being nice or fair doesn’t enter into it, and truth is not subject to a vote.
There are many people out there who believe that vaccines are dangerous — what happens to my tenure bid if this group votes my pro-vaccine paper down? Would anyone dare work on a controversial topic in a system where following and conformity are the favored results?
Aside from the obvious questions of gaming the system and the clear inefficiencies and time sinks created, the behavioral norms that have emerged from social rating systems offer fascinating glimpses into the psychology of participation and the potential pitfalls of their use in science:
It is precisely in this vast range of online activity where the value and interest lie for researchers investigating what is not actually known as “criticism” but, rather, “electronic word of mouth.” The trove of data generated from online reviews, the thinking goes, may offer quantitative insight into a perpetually elusive dynamic: the formation of judgments, the expression of preferences, the mechanics of taste. The results, filled with subtle biases and conformity effects, are not always pretty.
Should taste and opinion enter into the official context of how a researcher’s work is judged? I’m not sure they can be entirely avoided, but our current system (in the hands of a good editor) opts for a small number of informed opinions over a potentially large number of less-informed opinions. Finding the one right peer reviewer for a paper is more revelatory than finding lots and lots of reviewers without the same expertise.
Rating and review systems show common phenomena, things like “authority signaling” and “petty despotism.” Cults of personality form, where reviews are based more on community than on quality. The system itself begins to hold a major influence, as early reviews often become the default authority on a subject, and over time, people respond more to the reviews than to the object being reviewed.
Think it doesn’t happen in science? Here’s a recent example from PLoS ONE, where the discussion of the vocabulary and tone of the first comment runs twice the length of the actual commentary on the paper itself.
All of these behaviors are fascinating and provide ample fodder for the next generation of social scientists. But they also risk distorting the very goals these systems are trying to achieve.
The key question, then, is whether egalitarianism is really the right approach for areas where we are striving for authoritative expertise. Can crowdsourcing drive excellence?
We must always separate the technology from the sales pitch. Finding the right tool for the right job is essential to success, and a key part of the publisher’s role. Social approaches can offer enormous value, but they must occur in the right context for the results offered. It may not seem fair to everyone, but not everything in this world should be put to a vote — even if we have the technology to make it so.
Discussion
“It’s given us GalaxyZoo, in which non-expert users have classified hundreds of galaxies”.
Correction: It’s actually millions.
“….in the 14 months the site was up we received a little more than 60 million classifications.” http://www.galaxyzoo.org/#/story
In his book, “The Wisdom of Crowds” (2004), James Surowiecki outlines the conditions under which crowdsourcing does a better job than individual evaluation. “Wise crowds” need:
1. Diversity of opinion, as the diversity brings in different information.
2. Independence of crowd members from each other, which prevents people from being swayed by a single opinion leader.
3. Decentralization, and
4. A method for aggregating all of the individual opinions.
The condition of independence is a real problem in a post-publication evaluation model as each reader is exposed to all prior evaluations. Taken to extremes, you get the PLoS ONE Hand Clenching scenario.
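To see why independence matters, here is a minimal simulation sketch (Python, with made-up numbers; an illustration of the general point, not a model of any real journal) comparing a crowd of readers who rate a paper independently with one where each reader is nudged toward the running average of the ratings they can already see:

```python
import random
import statistics

def crowd_estimate(true_value=7.0, n_readers=500, noise=2.0, herding=0.0, seed=1):
    """Average score from simulated readers rating a paper on a 0-10 scale.

    herding=0 -> each score is an independent noisy estimate of true_value.
    herding>0 -> each score is pulled toward the running average of earlier
                 scores, i.e. readers who see all prior evaluations.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_readers):
        private = true_value + rng.gauss(0, noise)
        if scores and herding > 0:
            prior_mean = sum(scores) / len(scores)
            private = (1 - herding) * private + herding * prior_mean
        scores.append(min(10.0, max(0.0, private)))
    return sum(scores) / len(scores)

for label, h in [("independent", 0.0), ("herding", 0.8)]:
    runs = [crowd_estimate(herding=h, seed=s) for s in range(20)]
    # Independent crowds cluster tightly around the "true" merit of 7;
    # herding crowds drift with whatever the first few readers happened to say.
    print(f"{label:12s} mean={statistics.mean(runs):.2f}  spread={statistics.stdev(runs):.2f}")
```

Across repeated runs, the independent crowd’s average stays close to the paper’s underlying merit, while the herding crowd’s verdict is largely set by its earliest, noisiest voices.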
One solution would be to make all of the post-publication comments and metrics invisible to readers for a period of time (say, the first year or two), but this invalidates one key rationale for having post-publication review in the first place: for journals that do not rely on selection as a quality indicator, readers have no signal as to what is worth reading. Crowdsourced post-publication evaluation may provide a better assessment of the merits of a paper than a model built on expert pre-publication review, but it is just as likely (perhaps more) to prove a worse one.
The problem with crowdsourcing in science is not the global village, but what to do with all the global village idiots.
I agree with most of this but you seem to be focusing on those social schemes that are oversold. Perhaps that is your intent but there are ways in which social stuff can be quite useful. I have no idea if there are vendors for them but there should be. Here are two examples.
The first is fact finding. Reviews are not about voting; you have to read them for the specifics they provide. This restaurant has parking problems. Getting this device apart is tricky. In principle this kind of information can be mined in useful ways. In science an example might be a field that has specific method issues.
The second is controversy. As an issue analyst I have noted that it is very difficult to look at the journal literature and see the underlying controversies. Where are the big battles, the competing schools of thought, the rise and fall of ideas? The frontier is alive with these sorts of things but they mostly occur in the Q&A after the conference presentation, not in the journals. Social media sometimes bring these fights to the fore. This could be useful in graduate education or in RFP development.
It is also worth noting that the shortcomings you describe also occur in science itself, because science is a social activity. Fads in research are common, for example. Social media has not invented social activity; it has just made it more visible. Understanding these shortcomings can feed back into how we do science socially. In some cases it already has.
My intent was more to think about specificity, and to try to draw some lines as to where egalitarian approaches are more likely to be successful, and areas where a top-down approach remains the best path.
I also think it’s necessary to separate out the idea of conversation around a paper from the idea of a critical review of the paper itself. Do commenting systems lead more to “this paper on X made me think about Y” than “this paper on X lacks proper controls”?
I’d also argue that the controversy you’re seeking to see has shifted to even more private venues. It’s been a long time since I’ve seen a really contentious question asked on the floor during a talk at a big science meeting. Those sorts of questions are now asked, if they’re asked at all, during the poster sessions, which offer more of a one-to-one discussion rather than a public forum. That trend, to me, does not bode well for the notion of a public airing of controversy through social media channels.
It also does not bode well for scientific communication. How are the funders, grad students and even other researchers supposed to know what the important issues are? One-on-one is a weak diffusion mechanism. It would be ironic if science became less social because of the threat of social media but I can see how that might happen.
I think it’s a reflection of the economic realities of being a scientist these days. Money is incredibly tight, and any activity that might result in insulting someone sitting on a grant review committee or a hiring committee for one of your students becomes increasingly dangerous. Similarly, Phil has written about the intense pressures to keep one’s data as private as possible as well:
http://scholarlykitchen.sspnet.org/2010/10/25/openness-and-secrecy-in-science-a-careful-balance/
Arguably, scholarly publishing is and always has been a crowdsourced social network. For sure it has some unique and nuanced attributes. For example, it depends on editors to lubricate the review process and to grade and reward reviewers. It uses blinding for a short period of the overall flow process as a way to control spin and allow candid review.
If you were to modify a generalized crowdsourcing model to fit the needs of research evaluation, you would quickly come full circle, reintroducing the nuanced attributes already deployed by successful journals to manage their social networks of authors, reviewers, and editors.
I, like DavidW, agree with your argument and think DavidW has contributed some good points. I am rather lucky: I am not a scientist, but I knew many over a lifetime of publishing science. One thing I always noticed was that the lights burned all night long in the labs; science, for the most part, is a lonely endeavor. Further, the vast majority of folks outside the world of science have no idea what they are talking about when it comes to science. They don’t even know what a theory is. In this light, I find that opening specificity to the masses only tends to produce more chaff than wheat, and yet one has to treat each bit of chaff as if it were wheat.
Lastly, I find that most scientists only comment on something they know about. In short, a physical chemist is rather hesitant to comment on the work of say an evolutionary biologist. In the world of crowdsourcing everyone who reads the article has a say. It is for this reason that the article is lost and grammar is discussed.
Yes, but in the vast majority of cases only an expert will want to read the article. I think people are misled by a few cases where science-intensive public policy issues have generated tremendous public interest, as they should.
I think the best model is the listserv, which is quite popular in science. For example, I am a member of Eugene Garfield’s Sigmetrics listserv. That is a social medium, so in a sense we are already there. Then too, I run a Yahoo! group where the messages are public but only members can comment. There are many ways to approach this problem of knowledge and issue diffusion.
I am curious about the idea of there being just one right reviewer for an article. I am working on an algorithm for finding reviewers, and I can find a dozen or so best candidates depending on the local structure of the field. What criterion or filter might narrow that group down to just one? I am somewhat skeptical, given that every paper has more than one important aspect, as does every reviewer.
“One” is probably a bit of hyperbole. But the editor’s role is to find qualified reviewers for the paper. And there are occasions where a work directly impacts or reflects the work of another research group, so much so that the ideal is to have that group serve as the reviewer for the paper to get a truly informed opinion.
Good example but it is a case of an issue or controversy. The editor first needs to know about it.
If you’re a good editor, you know your field well. You keep your ear to the ground. You also put together a team of Associate Editors and an Editorial Board to provide subject level expertise and current knowledge (not sure if this counts as crowdsourcing since it is a carefully selected and limited group).
Still, sometimes you unknowingly run into controversies and only find out about them through peer reviewers or the authors’ response to reviewers. At that point, you often take additional measures, things like finding additional neutral reviewers, offering the other side of the controversy a chance to respond through a commentary, or dismissing the reviews of those with a conflict of interest.
I’m not sure there’s an algorithm that can spot this sort of thing for you and save you the leg work.
First of all, my algorithm simply finds all and only those people working on a given problem, by degree of closeness. I am pretty sure that no one can know this for most problems, because science is just too large and distributed. We ran several trials with DOE program managers looking for proposal reviewers and always found a significant number of candidates that the managers did not know about. In other words, no one can know the field completely.
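To give a flavor of what “degree of closeness” means here, a deliberately simplified sketch (plain Python, toy names and a toy network; not the actual X-Portal method) ranks candidates by how many co-authorship hops separate them from the authors already working on the problem:

```python
from collections import deque

def closeness_to_problem(coauthor_graph, seed_authors, max_hops=3):
    """Rank potential reviewers by how close they sit, in co-authorship hops,
    to the authors already working on the problem (the 'seed' set).
    coauthor_graph: dict mapping author -> set of co-authors."""
    dist = {a: 0 for a in seed_authors}
    queue = deque(seed_authors)
    while queue:
        current = queue.popleft()
        if dist[current] >= max_hops:
            continue
        for neighbor in coauthor_graph.get(current, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    # Candidates are everyone reachable who is not a seed author, sorted
    # nearest-first (closer = more likely to know the specific problem, but
    # also more likely to have a conflict of interest an editor must weigh).
    candidates = [(a, d) for a, d in dist.items() if a not in seed_authors]
    return sorted(candidates, key=lambda pair: (pair[1], pair[0]))

# Entirely made-up co-authorship network, for illustration only.
graph = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave"},
    "Carol": {"Alice", "Eve"},
    "Dave": {"Bob"},
    "Eve": {"Carol", "Frank"},
    "Frank": {"Eve"},
}
print(closeness_to_problem(graph, seed_authors={"Alice"}))
# -> [('Bob', 1), ('Carol', 1), ('Dave', 2), ('Eve', 2), ('Frank', 3)]
```

Even this toy version makes the point: the candidate list depends entirely on the local structure of the field, which no individual editor or program manager can hold in their head.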
Second, some of my team explored the specific case of competing hypotheses. They found a method that not only shows the competing schools but also indicates when the community transitions by accepting one as settled. See “The dynamics of scientific discovery: the spread of ideas and structural transitions in collaboration networks” by Luís M. A. Bettencourt, et al.
http://www.osti.gov/innovation/research/diffusion/OSTIBettencourtKaiser.pdf
Peer review arguably should be different before and after such transitions.
You’re right to be skeptical! If the algorithm focuses only on subject expertise it will be missing the point. The “right” reviewer has several dimensions, including:
• Subject match
• No conflict of interest (did they work together before? ORCID we need you)
• Availability (not on holiday etc.)
• Motivation (do they have a reason to invest time preparing the review?)
• Good written communicator (accurate but poorly written review is of no help to author)
• Meets deadlines (we all want faster turnaround for the author – right?)
• Fair minded/integrity (known bias on this subject?)
• Credentials (are they employed by a credible organization?)
• Specific technical expertise if appropriate (e.g. statistical reviewer)
A successful journal is constantly gathering data to help assess reviewer quality using all these criteria to inform future reviewer selections.
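As a rough sketch of how these dimensions might be combined into a ranking (the weights and field names below are invented for illustration and are not any journal’s actual system):

```python
# Hypothetical reviewer-scoring sketch: weights and fields are invented for
# illustration and not drawn from any real editorial management system.
from dataclasses import dataclass

@dataclass
class ReviewerRecord:
    subject_match: float       # 0-1, overlap with the manuscript's topic
    conflict: bool             # known conflict of interest (co-author, same lab)
    available: bool            # not on holiday, not over-committed
    on_time_rate: float        # 0-1, share of past reviews delivered on time
    report_quality: float      # 0-1, editor's rating of past written reviews
    has_stats_expertise: bool  # specific technical expertise, if needed

def score(r: ReviewerRecord, needs_statistician: bool = False) -> float:
    """Return a ranking score; hard requirements knock a candidate out entirely."""
    if r.conflict or not r.available:
        return 0.0
    if needs_statistician and not r.has_stats_expertise:
        return 0.0
    # Soft criteria get weighted; the weights are guesses an editor would tune.
    return 0.5 * r.subject_match + 0.3 * r.report_quality + 0.2 * r.on_time_rate

candidate = ReviewerRecord(subject_match=0.9, conflict=False, available=True,
                           on_time_rate=0.8, report_quality=0.7,
                           has_stats_expertise=False)
print(round(score(candidate), 2))  # 0.82
```

The hard/soft split is the real design choice: conflicts and availability are disqualifiers, while subject match, report quality and timeliness are trade-offs an editor weighs against one another.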
Systems and algorithms are essential performance enhancers for editors, but it will be a while before scholarly editors can be fully disintermediated by a Silicon Valley algorithm that finds just the “right” reviewer all by itself.
Remember Thomas Kuhn and his distinction between “normal” and paradigm shifting science? It strikes me that traditional peer review makes perfectly good sense for normal science, but that given its very assumption-challenging nature, paradigm-shifting science would benefit from crowdsourcing, so as to elicit comments from outside the closed circle of normal science. Indeed, traditional peer review might be counterproductive for this kind of revolutionary science.
Also, especially for interdisciplinary work in science and the humanities, a more open review might be truly beneficial. As I recall, Kathleen Fitzpatrick got some very helpful comments from people on her book, posted on MediaCommons in preprint form, who were entirely outside of academe and would not have likely been consulted by any acquiring editor. I can see this happening for some areas, and some types of, science as well.
Another beneficial use of crowdsourcing you did not mention (probably because it has little application to science) is the transcription of manuscript documents to make them digitally accessible and searchable.
I think it’s a really interesting question–a system like this should, at least in theory, catch all the dissenting views, and if you can separate out the crackpots from the geniuses, you might revolutionize a field. But at the same time, these systems in practice repeatedly show a tendency toward conformity, with like ideas reinforcing like ideas. So whether they’re useful for catching notice of the outliers or if they instead powerfully maintain the status quo is unclear.
Since I have been boasting about my algorithm (called the X-Portal), I suppose I should disclose funding, as that is the latest fad a la the OSTP memo. Development was funded by the Small Business Innovation Research (SBIR) program of the US Department of Energy’s Office of Science. Given the commercialization requirements of the SBIR program, this makes me a pesky vendor of the sort David complains about.
Really interesting David, thanks! Given your interest, I think that you (and the other readers here) would be really interested in some recent research that I have come across that theorizes about crowds and such similar phenomena. It’s called “The Theory of Crowd Capital” and you can download it here if you’re interested: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2193115
In my view it provides a powerful yet simple model that gets to the heart of the matter. Enjoy!