Sigmund Freud, founder of psychoanalysis, smoking cigar. (Photo credit: Wikipedia)

Freud’s notorious question–What does a woman want?–came to mind recently as I was pondering the disconnect between the prevailing wisdom about what scientists want and what they tell me.  Of course I don’t get to speak to all scientists and my own sample is too small to be representative of anything, but I do spend day after day talking with scientists about scholarly communications.  Do you belong to professional societies?  What journals are central to your work?  Do you attend conferences?  What is your view of open access?  If you could fix one thing about scholarly communications, what would it be?

Foreshadowing:  I am going to share with you the one near-universal opinion that scientists hold about fixing scholarly communications.  But you have to read to the end to find out.

The phrase “prevailing wisdom” may smell like a straw man to the logically minded, and, yeah, it’s wrong to lump so many points of view together, but I’m going to do it anyway and let you decide how much straw there is in this argument.  The conventional view (the straw man) is that what scientists want is access.  They are working at institutions that simply cannot afford to subscribe to the publications they require for their work.  Roadblocks to access are put up everywhere by publishers that do not have the best interests of the scientific community at heart.

On top of that, publishers exploit volunteer labor and have made the academic community dependent on the impact factor metric, which serves to strengthen the incumbent avaricious publishers.   The editorial gatekeeping model of the publishers is managed by non-scientists and adds no value to the process and may even subtract from it.  And so on.

I bet there is a kernel of truth to some of these broad generalizations (e.g., reviewers of journal articles receive little or no compensation, though the editors of journals are compensated, in some cases quite handsomely), but in my experience what scientists really want is quite different.

To begin with, none of the scientists I have interviewed has ever begun a discussion of journals by talking about what they read.  They talk about where they publish, where they would like to publish, and the procedures (including obstacles) to get published where they want to.  That is, the scientists in my tiny unrepresentative sample think of themselves as authors first, not readers.  So how important is access?  The most common response is “I get everything from the library” or “You can always get what you need.”  There is indeed a problem with access, but it is access for authors, not for readers.  Have the gatekeepers slammed the gates too hard?  Well, that is debatable.  As one would expect, the gate is hard to kick open for the individual author, but not too difficult (so my interviewees tell me) for the community at large.   Everybody likes gatekeepers–at journals, at private clubs, at the admissions departments of elite universities–provided that they can get through the gate.

On the matter of open access, no scientist I have spoken to is unaware of it, but it is mostly a marginal matter.  I recently remarked to a colleague, with whom I conduct these interviews, that about 1 scientist in 30 seemed to favor open access strongly, to which she responded, “If that.”  Surely there is a disconnect here.  It’s hard to imagine any topic that occupies more bandwidth in academic publishing than open access, and I will be disappointed if Stevan Harnad, Mike Taylor, and Michael Eisen don’t chime in on the comments section of this blog to point out how I have the whole thing wrong.  But that’s what scientists tell me.

On the other hand, although few scientists expressed to me a strong interest in open access in their role as readers, they are keenly interested in it in their role as writers.  That is, they believe that there is an audience for their work that does indeed have access problems, even though they themselves (that is, the authors) for the most part do not.  Is this not the hope of all authors, all musicians, all movie-makers, that there is pent-up demand for their work if only the barriers to access could be torn down?  Even a lowly blogger at, say, The Scholarly Kitchen (to choose a site at random), may dream of having a post picked up by, yes, Michael Eisen or even more significantly by Tim O’Reilly or the folks who run the Freshly Pressed selection.  (Inside baseball:  this post will probably be read by between 2,000 and 3,000 people.  In the unlikely event that it “goes viral” and reaches 10,000 views, I will be celebrated by my fellow chefs even as they plot to assassinate me.)

On peer review opinions vary.  It’s good, it’s very good, it could be better, it used to be better, it’s good in some places but not in others, it’s hard to do well, people don’t make an effort to do it well, and so forth.  On the other hand, the more limited form of peer review practiced by PLoS ONE (and widely imitated) triggers some skepticism.  In my experience most scientists express a preference, somewhere between mild and modest, for hybrid publications, which employ traditional peer review, but also have a reasonably priced open access option.  Publishers that have not already done so may wish to attempt to attract the authors who now publish with PLoS ONE with a hybrid program, which would include rigorous peer review and an “author-pays” fee of no more than PLoS ONE’s ($1,350).  Some publishers that have put a program like this in place have seen significant revenues from the author fees.

What do almost all scientists agree about?  The manuscript management systems employed by most publishers.  Scientists detest these systems.  They find them to be cumbersome and they express real frustration and sometimes outright anger over them.  I will forbear naming names here, but some of the most prominent publishers and vendors of workflow management systems come under fire.

I am not surprised by this, having suffered through this experience myself.  Publishers should think about this.  A publisher’s equity is in its ability to attract the finest authors.  So how does an author first interact with the publisher? By going online and struggling to upload a paper.  For publishers who have never used their own systems (I think this is a very high percentage), consider how you feel trying to get a problem solved with Verizon or AT&T or (to take an extreme example) United Airlines.  Poorly designed software interfaces combine with a strategy of never having anybody to answer telephones.  If people had their blood pressure checked just before boarding a United flight, we would be reading articles about a mass epidemic of hypertension.  But where would that article be published if the author is so frustrated by the publisher’s Web site?

There are two exceptions to this near-universal contempt for workflow management systems.  One author complained to me about using them and then remarked that the system at PLoS is not that bad–not, he said, that it is in itself a reason to publish with PLoS ONE.  Oh?  Maybe it is part of the reason.  I speculate that traditional publishers, who spend much of their time trying to make librarians happy (even as they charge as much for their materials as they can get away with), are at a disadvantage with pure Gold open access publishers, whose only job is to make authors happy.

But the second exception is more amusing.  I was intrigued by a group of scientists who praised the submission system at one particular journal.  So I investigated.  That particular journal, beloved by all its authors, accepts submissions as email attachments.  The managing editor calls authors to answer questions.  It does not use workflow management software.  Limited automation, which is expensive, but it makes the authors happy.  What is this telling us about the limitations of technology in publishing?

I repeat:  my sample of scientists is not large enough to be representative of the scientific enterprise as a whole.  But I do wonder about surveys whose findings are the complete opposite of mine.  Why aren’t they talking to any of my guys?

Joseph Esposito

Joe Esposito is a management consultant for the publishing and digital services industries. Joe focuses on organizational strategy and new business development. He is active in both the for-profit and not-for-profit areas.


51 Thoughts on "What Does a Scientist Want?"

I think your anecdotal data is probably statistically correct, Joe, and very useful. Regarding open access, if one in thirty, or even just one in a hundred, scholars feel strongly about it, that is quite sufficient to make it the strong political movement it seems to be. There are after all several million such scholars, so a few percent are a lot of voices. These seemingly low ratios are probably characteristic of most political movements.

As for unfriendly workflow management systems, my wife is an Internet shopper and so faces similar challenges. I hear a lot about glitches and poorly functioning interfaces, which do make a difference in who gets to sell successfully. Time is money so it is part of the price. Given that the APC journals are selling to authors, not libraries, perhaps they will pioneer a new generation of friendly interfaces, which everyone can then use.

Joe, in my experience your conclusions and observations ring true. Most scholar/scientists view themselves in the author role. Lurking underneath is the realization that a single journal article will never satisfy the hunger for fame. I have always observed either an immediate or distant ambition to write a fundamental textbook or popular book that would cement the scholar/scientist’s place in the world, providing both fame and fortune. Once this happened at my previous publishing house, when an author’s single scholarly book hit $1million in sales.

Three points: 1) Our scientific society has based its highest rewards on priority of discovery as recorded by publication. All else follows.

2) Considering that half of all authors are responsible for only one published article (according to Derek de Solla Price), you have engaged an elite sample. Maybe you should have titled your article “What do scientific authors want?”

3) It’s not true that universities cannot afford to subscribe to journals. University managements that once budgeted six percent of their spending on their research libraries claim “poverty” while lobbying for Federal money (as D.D. Eisenhower pointed out in 1960), harvesting increased surpluses / profits / savings, and often sitting on investments worth billions of dollars.

The lack of researchers’ concern with open access does not surprise me at all. But that is because they are shielded from the costs, which come out of another’s pocket and not their own. It is much the same as paying for healthcare in the U.S.

If a researcher had to directly pay for access, if there was a meter for each paper accessed, then the level of concern would rise tremendously. Hide that cost and the issue seems to go away; make it apparent and you will get a very different reaction.

Here’s a nice quote from Harold Varmus: “Researchers spend hundreds of dollars of their N.I.H. awards on subscriptions to scientific journals.” — when they cannot get them from their libraries. [Chronicle of Higher Education, May 7, 1999]

Harold Varmus is hardly a disinterested observer on this topic. Did he provide any kind of backup for that (rather vague in any case) assertion? Was there relevant data available in 1999, and is there now?

When I speak with scientists, the gossip and the concerns are all about funding and jobs. Journals and access aren’t really on the radar; as noted in the re-post from last week, the world of journals is somewhat peripheral to the actual job of being a scientist, which is about performing experiments and discovering things.

There’s also great value in keeping an ear to the ground, and continuing to spend time with researchers at all levels. As most publishers advance in their careers, we have less and less contact with those lower on the ladder. We’re hobnobbing with the elite, with journal editors and society presidents, with the established experts we want to write our books. The world in which the tenured elite move is very different from that of the graduate student at an average university. See this recent post about open access from a researcher striving for tenure for an example.

That lack of contact is perhaps a reason behind the “sky is falling” mentality we see so often in publishing. People who write great volumes online about publishing are generally activists championing a personal cause. These are usually outliers, but if they’re the only voices you’re hearing from the research community, it’s easy to assume they speak for the majority. That’s why we see panics over alleged crises in peer review and the like when the silent majority still firmly believes in the value of such things.

I suspect that the amount of concern about open access is going to vary considerably among scientists. In particular, I think such concern is strongly correlated with both age and distance to physics (or perhaps more accurately disciplines with LaTeX and/or ArXiv). Among mathematicians I think you’d hit someone with strong opinions on open access more like 1/4 or 1/5 and not 1/30.

These both make sense. If you’ve grown up with the internet it’s more frustrating when you can’t access papers at home or on your phone, and you’re less likely to be willing to actually walk to the library. If you’re in a LaTeX discipline you’re already doing most of the publisher’s work on your own, and if you’re in an ArXiv discipline then you have no regular interactions as a reader with journals. Furthermore, ArXiv does such a good job of automated document uploading and easy access that it makes all other publishers look terrible.

There are also a lot of schools where the system for access to journals when you’re not physically present at the school is terrible. U.C. Berkeley had a great system (proxy server), but Columbia had an utterly terrible one (you had to go through specific Columbia access websites, which meant that even if you could get access to a journal, all hyperlinks in papers or databases were broken). Have you asked people specifically about access from home? That’s the “open access” issue that I think directly affects American scientists as readers.

Delurking in the interests of providing more “anecdata”…

The claim “if you’re in an arXiv discipline then you have no regular interactions as a reader with journals” may be true, though I think it is somewhat discipline-and-“tribe of research”-specific. However, what I often if not regularly need, is access to the collective wisdom on functional analysis and related areas which simply is not on the arXiv. And for that, the archives of JFA and JOT are invaluable. (Both journals have in total if not in ratio done more for my research than the arXiv, even though I am a strong supporter of the latter.) I appreciate that the priorities may be different in Noah’s own area of work and professional circles.

I don’t really care about access from home, but I imagine that this could be more of an issue if one had to work a lot from home for e.g. family reasons. Otherwise, I go to work, and occasionally I go to the goddarned library – I may even do it today to look something up from the 1980s which is not downloadable at my institution.

Among my own professional acquaintances in maths I think the ratio mentioned above is more like 1/10, personally. It’s either 1/22 or 1/23 in my department, FWIW.

It’s hard to work out a good representative sample, because different subfields and different institutions are going to have different rates of strong opinions. Berkeley graduate students are a particularly radical bunch, so even though I dialed down from my acquaintances I may still have overestimated. For some raw numbers, at Indiana out of 43 current faculty, 10 of us have signed the cost of knowledge boycott, and I think that’s a strict subset of the people with strong opinions on publishing.

My subfield is an arxiv early adopter (essentially because of Greg Kuperberg), which means I’m less likely to need to look at old journals than the average mathematician. I’d estimate that 95% of the papers that I read are through the arxiv. I go years at a time without seeing a physical journal. I also work well at night, so I work from home a lot.

A scientist wants to be significant. Every scientist we have interviewed in our research for Englue says that publishing meaningful results is the goal of their research. Most express extreme frustration at the tools available to search and synthesize the results of previous studies, especially those that had negative results. This frustration leads to scientists not wanting to read. There is simply too much noise in publishing. Open access is a powerful tool for readers (and scientists) if, and only if, it is easy to find what matters most to your particular interest.

In the field my company mostly publishes in, engineering, I don’t sense much pressure for OA from academic authors. Where the OA enthusiasm comes from is from the longstanding ‘we hate Elsevier’ crowd; technophiles, who think a better process is the same as a better result; and of course Elsevier, Springer, Wiley et al, who are now beginning to see the significant pots of gold on this particular rainbow. Authors, in my limited experience, see it as just another step on the academic treadmill, whereby something that was once pretty simple – sending a paper to a journal – is now made more complicated by having to fiddle about getting funding for publication too.

And I don’t see much enthusiasm from authors for manuscript submission systems. These are sold hard by the tech companies as ‘must haves’. But the people who love them are not publishers, who are, as everyone knows, mere parasites on the body academic and also loathe expenditure and the modern world in equal measure, but editors. In the wider world, lonely middle-aged men buy Porsches, convincing themselves that the Porsche will be the babe-magnet, and their paunch, their dandruff, their dullness and every other negative will no longer count for anything. In the similar spirit of hopeless delusion, academic editors pester their publishers into buying them online submission systems, convinced that the systems themselves will be paper-magnets, and all the other reasons around not enough and not good enough submissions will suddenly count for naught. I know of no-one who has submitted a paper to a journal out of love for its submission system. Of course there is some point to them for journals that do receive thousands upon thousands of submissions. But out of the 27,000 or so academic journals that are said to exist, I wonder how many are in that category?

As the Editor of my society’s journal, my job requires me to attend our three conferences each year and actually talk to authors. Although I follow a full suite of journal metrics, real face-to-face feedback is very helpful.
I concur with your observation that scientists’ primary interest is getting published. Although only 1/3 of our attendees are members with individual access to our journal, I rarely hear any complaints from the others about access problems.
One observation which was surprising was the poor reviews of manuscript systems. Maybe many of our authors remember our old email-based system, which was terrible. When we switched to one of the major systems in 2007, our time-to-first-decision dropped by seven weeks! I’ve heard little but good comments about the new way. Perhaps one reason is we have a Journal Administrator who is right on the ball helping any author with problems.

I wonder if your sample included scientists from state universities. State funding has declined dramatically over the years putting additional pressure on libraries and department budgets. My research (with a small sample size) suggests that there is not only high awareness of OA but that it is becoming a factor in discussions about where to publish.

It’s not surprising that scientists think of themselves as authors first; in many disciplines the published article is largely irrelevant to researchers working in the same field. Most information sharing at the leading edge is done via pre-prints, working papers, and less formal communications. Publishing is a way to gain credit for your work, but access to published information is not so important–you’re kept well informed by your network of colleagues. This varies by discipline, but I think is more common than many people realize.

Your view of a “scientist” is rather limited. There are many “scientists” who work in industry or independently and do not have access to the vast resources of a private/state funded university. These “working scientists” often do not publish, or publish in “grey literature” and yet they are avid readers and consumers of scholarly works.

There are also faculty who do have access problems, once again not at the top universities with vast budgets, but from your writing I doubt they are the ones you are talking to…

that said, many of your points ring true and I thank you for sharing them.

Most of my problems with access to journals are to the journals that only cost in the area of a couple of hundred dollars a year, but which my library can’t afford because the budget is so tight and we have to have the big deals from the major publishers. The papers in those journals can be just as important to my research as those in big-name journals, and most usually the ones in big-name journals have been released as a preprint 2-3 years before publication anyway, so access was never a problem—but not because of the library’s subscription.

I agree with much of this post, but was surprised that you completely miss the connection to open access at the end of your piece.

Given the incentives to publish, it is hardly surprising that most researchers view themselves as authors first. What is notable is that as most of their most relevant journals are subscription based, authors are not the market consumer to the publishers. The publishers sell to libraries, not to authors — so it is hardly surprising that the author services suck.

As you hit both of these observations on the head, I was entirely surprised that you didn’t mention how the economics of (Gold / APC) open access differ. In such a market, authors, not libraries, are the consumer, and thus the desires of authors, not libraries, drive services, innovation and price. Is it a surprise that open access journals like PeerJ and eLife have much improved author-services over traditional models? (Or PLoS, as you mention?) If the entire market was driven by authors-as-consumer, would it not better respond to the actual desires of scientists (regardless of what they say or think those desires are – behavior always being better than words in predicting desires)?

Perhaps you missed this line in the post:

I speculate that traditional publishers, who spend much of their time trying to make librarians happy (even as they charge as much for their materials as they can get away with), are at a disadvantage with pure Gold open access publishers, whose only job is to make authors happy.

Indeed, my apologies for the oversight, though I was hoping to see this line of reasoning developed. Would you agree that this implies scientists might be happier with a Gold open access world than most might realize?

It’s an interesting question. Likely an author-centric business will be more focused on pleasing authors and providing services directly for them. That’s where the economic pressures drive things.

But you also have to look at the reality of being a researcher, rather than the perception. One may think of oneself primarily as an author, but authoring a paper is a relatively rare event, at least compared with reading the literature. If you’re happy as an author a small amount of the time but unhappy as a reader most of the time, is that the right balance?

There are also a lot of other questions that need to go into the equation, particularly financial ones. Many of the scientists I speak with do indeed favor open access, but note that “I have no money to pay for it,” so imposing further drains on already tight research funds may also have an impact on happiness.

David, excellent points. It seems likely that the market incentives in an author-pays world might help address both rather than hinder them.

With regards to readers – while an all-gold world would mean that publishers were author-centered, it would still mean that readers now have free and open access to all the literature. In addition to the obvious benefit for the reader, if readers still were not happy (for instance, because they needed filtering services) then such services could emerge (leveraging the free & open content). If excellent post-publication filters do not yet exist, I postulate it is because in the Library-as-consumer model that dominates, journals are playing the role of filter satisfactorily for most of the readership dollars. It seems we agree an all-Gold world is a win for authors, but would it not be a win for readers as well?

Then there is the cost. Again, I postulate that the current absurdly high costs frequent (but not ubiquitous) in open access publishing are a product of the fact that the Library-as-consumer system is still dominant. If all journals were Gold open access and researchers really felt that the price was too high, the market would allow the rise of less-expensive options. Clearly that’s harder when as an author you have great alternatives where you can publish for free (even if it means crappy submission systems!)

Libraries are charged with subscribing to as close to “everything” as possible, so they don’t give much market signal of quality. Competing on price is hard when nondisclosure agreements hide the numbers. Large profit margins are an indication that competition, which drives innovation, isn’t tight. Gold OA isn’t the only solution to these (I think these problems didn’t exist as much when we had individual-based subscriptions to primarily society-managed journals), but I think it might be the path to restore an efficient and innovative market?

As noted in my comment below to Bjorn, it’s unclear if there is indeed a viable market for the sorts of post-publication filtering services you suggest. The best example now, F1000, does not turn a profit. In the current internet era, there’s an expectation that everything be free, so getting people to subscribe and pay for such a system may be difficult. Further, such services may be undermined by people who immediately copy their recommendations and make them freely available elsewhere, as happens so often in the digital world.

And yes, filtering is an important part of what journals do (see Michael Clarke’s re-post on this).

As for costs, if your revenue stream consists of author charges, then the way you make ends meet is to either accept more articles, increase the price of those charges, or reduce costs, which in many cases means reducing quality. Journals like PLoS ONE, for example, don’t perform copyediting on articles as one way of helping keep costs down. Since you’re not bringing in any revenue from readers, they’re a likely place where you’re going to make spending cuts. Why invest in expensive semantic technologies and other reader services when they’re not your customers?

Also, Gold OA is certainly not a solution to a perceived problem of high profit margins. Hindawi, which exclusively employs Gold OA, has a higher profit margin than Elsevier. PLoS ONE is so profitable it is able to support all of the other PLOS journals which run at deficits.

Thanks for the reply. Note that we agree post-publication filters are not widely successful now because journals who market to readers naturally do this. After all, that’s what readers want and they are the market.

My thought experiment still stands: in an imaginary All-Gold world, would such filters be more successful? If yes, then the reader stands to lose nothing. If no, then probably this is because the reader is happy enough and hence there still is no market. So I still postulate to you that readers, not just authors, would be quite happy in such a world.

Given the high profit margins of some Gold OA ventures, and the frequent complaints of the high cost of Gold OA, we might predict the Gold OA market ripe for some disruption with new players entering the market at reduced prices, which of course is exactly what is happening: eLife, PeerJ, F1000Research, etc. I suspect that as the competition grows prices will continue to go down and competition will focus on innovations that both reduce costs and increase value, exactly what we see at these places.

You have a curious postulate about how publishers in an all-gold market would behave: raise costs and lower quality — surely that would just drive authors to the competition. Why would the economic equilibrium be other than the quality/cost balance sought by authors (with different journals offering alternatives for authors with different values of that ratio, just like any other market…)

You have a curious postulate about how publishers in an all-gold market would behave…

It’s less a postulate about how people would behave and more of a recognition of trying to turn a profit, or at least break even. If your revenue stream is based solely on author charges, then in order to make more revenue to break even/profit, you have to change something. It would seem to me the choices then are:
Publish more articles (and bring in more charge revenue)
Raise prices (and bring in more revenue)
Cut costs (so current revenue is enough to be profitable)
What other choices are there?

David, thanks for the new reply below (somehow I cannot get this to thread properly). I appreciate your insight and you’ve distilled the basic issues well:

Publish more articles (and bring in more charge revenue)
Raise prices (and bring in more revenue)
Cut costs (so current revenue is enough to be profitable)

What other choices are there?

These are the same choices faced by any business, e.g. auto manufacturers. These challenges drive innovation. Cutting costs doesn’t simply mean cutting quality: one could also innovate the manufacturing process. (PeerJ is a good example of this kind of innovation.)

Publishers, like auto manufacturers, would need to increase quality to avoid losing demand if they wanted to raise costs. They could also spend their profit margin on quality, investing in a better product and building the quality of the brand. Again, this usually involves innovation. Luxury brands introduce expensive features that then become ubiquitous.

Simply publishing more articles may come at a cost to the quality the publisher can afford to deliver, so it shouldn’t really be an option separate from cutting costs or raising prices. Of course the amount publishers can publish is up to the demand side.

As you observe, this All-gold situation forces action (and innovation) on behalf of the publishers. Eat into the profit margin innovating, or do nothing and lose customers to journals that offer the same quality at less cost, or the higher quality at the same cost.

Of course these pressures exist in the current subscription-based system too, but that market is clearly less efficient and the competition less severe, not because academics are conservative or because publishers are conservative, but because it’s hard for a market to function when prices are secret and the consumer (libraries) cares more about comprehensive coverage than about almost anything else that goes into journal quality.

So yes, I agree that those are the choices. In a market that forces publishers to face those choices, we are not flooded with expensive, low-quality products any more than we are flooded by too many expensive low quality cars. Instead, we get choice driven by what the customer / scientist desires. Or where did I go wrong?

You’re right that it is all about finding the right balance: best meeting the researcher’s needs, cutting out unnecessary spending on things that don’t matter, and maximizing revenues and minimizing costs while maintaining quality as best one can.

I’m not sure PeerJ provides much of an example of anything as of yet. It is wholly unproven, though it is backed with significant amounts of venture capital. Given the nature of VC funding, their goals may not be in line with those of a typical publisher. VC’s tend to work on shorter investment cycles, so many investors would be happy with a company that established itself strongly in the public eye and was then quickly sold off to an established publisher, rather than worrying about long term sustainability (see Mendeley as an example).

One other option that exists is consolidation. Gold OA works particularly well with economies of scale. Bigger publishing houses can cut costs by buying in bulk without simultaneously cutting quality. A publisher that puts out hundreds, if not thousands of journals can afford to have an extensive in-house staff and spread the costs across many journals, where smaller, independent journals have to carry a larger load to get the same level of quality. For example, Elsevier can create and run its own online platform and each journal only has to carry a small portion of those costs. For a smaller publisher, like PLOS, with fewer journals, each has to carry a higher percentage of the overall costs for their online hosting.

It’s one reason why many predict that a shift toward OA will further cement the power of the big existing publishers and may spell the end of the smaller, not-for-profit academic presses.

“It’s one reason why many predict that a shift toward OA will further cement the power of the big existing publishers and may spell the end of the smaller, not-for-profit academic presses.”

I am inclined towards this assessment (a gloomy one from my point of view) and I am always at a loss as to why this point so rarely comes up among the OA proponents in my discipline (mathematics). On some other forum or site, I may have used the phrase “gangsters can afford to be munificent”. [Note for lawyers and related entities: this is hyperbole for the sake of illustration, and not an accusation]

Only 15-20% of scientists in the United States have authored a peer-reviewed article, according to Tenopir and King. These percentages have been validated by other studies by Mabe et al internationally. I’d suggest that this survey and resulting list may reflect a minority opinion, skewed heavily toward producers of the literature, not the much larger pool of consumers of the literature. There is a convenience to asking authors as proxies for scientists, since publishers and librarians are much more likely to encounter them and get to know them. But we can’t lose sight of the fact that scientists who write papers are in the minority.

What do the majority of scientists want? We know from this a little about what “authors” want, but we may not know much more about what “scientists” want.

Interesting numbers. Further, Lotka’s law suggests that half of those authors write only one paper and perhaps 80% write just two or one. This implies that the regular authors are just a small percent of the scientists, perhaps 4% or less. Apparently publish or perish is not a major feature of science. This raises the question why this small group should pay for what the rest read? But there is also the question as to how many of the none-or-few-papers publishing scientists actually follow the literature all that closely?
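The arithmetic behind that estimate can be sketched as follows. This is only a rough check, assuming the classic inverse-square form of Lotka’s law (the number of authors with n papers is proportional to 1/n²) and taking the 15-20% authorship figure quoted above at face value. The function name `lotka_share` and the cutoff of 100,000 papers are illustrative choices, not anything from the studies cited. Under this form of the law the shares come out nearer 61% (one paper) and 76% (one or two papers) than the rounder “half” and “80%” above.

```python
def lotka_share(n, exponent=2.0, max_papers=100_000):
    """Fraction of authors with exactly n papers, assuming counts ~ 1/n**exponent.

    max_papers truncates the normalizing sum; it is an arbitrary cutoff
    chosen only to make the sum finite.
    """
    norm = sum(1.0 / k**exponent for k in range(1, max_papers + 1))
    return (1.0 / n**exponent) / norm


one = lotka_share(1)                 # authors with a single paper
one_or_two = one + lotka_share(2)    # authors with one or two papers
regular = 1.0 - one_or_two           # authors with three or more papers

print(f"one paper:        {one:.1%}")
print(f"one or two:       {one_or_two:.1%}")
print(f"three or more:    {regular:.1%}")

# Scale by the share of scientists who have authored at all (15-20%, as quoted above).
for author_share in (0.15, 0.20):
    print(f"{author_share:.0%} of scientists are authors -> "
          f"{author_share * regular:.1%} are regular authors")
```

With these assumptions the share of “regular” authors lands at roughly 3.6% to 4.8% of all scientists, which is in the same neighborhood as the “perhaps 4%” figure. A steeper exponent or a stricter definition of “regular” would push it lower.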

I’m a tenured professor, writing about 1.4 peer-reviewed articles per year, reviewing about 2-3 manuscripts a month, editing for 2-3 journals and I have no clue about how many papers I read a year.

Being tenured, for me the most frustrating component of our system is that it is virtually impossible to stay on top of your field. Perhaps in fields where everybody publishes in 1-2 journals, this is still possible. In my field, if I were just to scan the ToCs/abstracts of all journals in which a relevant article might appear, I couldn’t do much else. On top of that, once I’ve located a paper, I often still need to use #icanhazpdf on Twitter to get access to it, costing even more time and nerve (and that with a university that pays over USD 2m annually in subscriptions!).

The second most frustrating component actually has to do with authoring, but not with any of the aspects mentioned in this post: it is borderline impossible to publish everything important for replicating our work in one fell swoop. Ok, you may say this is a problem on part of the authoring system of the publishers, in this case, it was mentioned in the post. But I would like to simply archive my software used to generate and analyze the data as well as my data in a place where everybody can access it and then just write the paper referring to the data, using the software to visualize the data in the paper. The technology for this is pretty common and by no means hi-tech.
Instead, we still post our software on SourceForge or GitHub, our data on FigShare and our text with some publisher. That process completely destroys much of the work that went into the process to begin with, by making it hard to re-use, near impossible to replicate and needlessly inflates the time and nerve it takes to publish.

There are other aspects, but they have receded to lower priority with tenure – my postdocs and graduate students are more frustrated by the authoring process than I am these days 🙂

So, what do I as a scientist want?

1) As a reader, I want a smart system that assists me in filtering, sorting and discovering the relevant literature. That learns from my reading habits but isn’t constrained by them. It currently doesn’t exist, so I can’t complain that it is inadequate. Once I have located the relevant literature, I want single-click access to it, obviously.

2) As an author, I want to write my papers by combining our software and data with the text in a way that allows readers to visualize different aspects of the data than the ones I chose. I want to do this seamlessly with an infrastructure that supports, rather than hinders the process and a single-click decision of whether I want the manuscript to be public or not.

None of the above, despite being technically feasible without too much effort, is even on the horizon from publishers (possibly with some exceptions at F1000 and Frontiers), which is why I have found institutions (my own among them) who are working on providing the components for this infrastructure.

This just as an additional n to your sample size.

Keeping up with your field and finding the most relevant papers in science, which may not be in your field, are opposite goals so one cannot maximize both. Unfortunately, many people seem to be frustrated because they cannot do what is in fact impossible.

Interesting, where did I write that I want the same algorithm to serve me papers both inside and outside of my field? I don’t recall writing that and re-reading the text, I can’t find the passage you are referring to. I apologize, but English is only my fourth language, so perhaps I’ve expressed myself in the wrong way? Perhaps you could quote me?

Besides your commenting on something I didn’t write or intend to imply, it’s an easy thing for the service that we’re now developing to have one RSS stream of incoming papers (likely inside field) filtered according to one set of criteria and the other (likely outside field) by a second set.

And guess what, this works not only for two, but even for three or more sets of information streams. In fact, I’m alpha testing some of these products on many different sub-fields of my interest, not just two or three.

It is quite amusing that I’m physically testing something that you claim to be impossible. 🙂
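The idea of several streams, each filtered by its own criteria, can be sketched in a few lines. This is a toy illustration only, not the service described above: the `Paper` and `Stream` classes, the keyword matching, and the sample titles are all hypothetical, and a real system would pull entries from actual RSS feeds and use far smarter criteria than substring matches.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str


@dataclass
class Stream:
    """One information stream with its own (hypothetical) filter criteria."""
    name: str
    required_terms: tuple        # at least one of these must appear
    blocked_terms: tuple = ()    # none of these may appear

    def accepts(self, paper):
        text = f"{paper.title} {paper.abstract}".lower()
        if any(term in text for term in self.blocked_terms):
            return False
        return any(term in text for term in self.required_terms)


def route(papers, streams):
    """Return {stream name: [titles]}; a paper may land in several streams."""
    routed = {stream.name: [] for stream in streams}
    for paper in papers:
        for stream in streams:
            if stream.accepts(paper):
                routed[stream.name].append(paper.title)
    return routed


# Made-up incoming items standing in for feed entries.
papers = [
    Paper("Learning in fruit flies", "An operant conditioning study."),
    Paper("A faster Bayesian method for spike sorting", "Statistical tools."),
    Paper("Galaxy cluster survey", "New cosmology results."),
]

streams = [
    Stream("in-field", required_terms=("operant", "fruit flies")),
    Stream("methods", required_terms=("bayesian", "statistical"),
           blocked_terms=("cosmology",)),
]

routed = route(papers, streams)
for name, titles in routed.items():
    print(name, "->", titles)
```

Adding a third or fourth stream is just another `Stream(...)` entry in the list, which is the point being made about scaling beyond two sets of criteria. What the sketch does not address is the attention problem raised in the reply that follows: filtering shortens the list, but someone still has to read it.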

I was referring to your time not to algorithms. Your available attention is where your two goals compete. I am a cognitive scientist not a computer scientist, although I do develop search algorithms. But I also do research on this sort of cognitive scaling problem. If you tell me your field I will estimate how many papers you would have to read daily to keep up. I am pretty sure you would have no time left to look at papers that were more relevant to your work but outside your field, such as methodological advances. In fact it is probably many more papers than you can possibly read.

Ah, ok, but at least I get to decide which papers to read from a smart, personalized preselection, instead of stuff that people I don’t know have selected: ToCs. That’s precisely what I want. What’s wrong with that? Right now I barely have the time just to arrive at that list and then can’t read any of the papers. With that system, I get to the list faster and thus get the time to at least read some of them. Huge improvement over now! I would pay USD 100-200 a month for such a service out of my own pocket, if it cuts down my time searching only by 30%.

In that case I have an algorithm that may interest you, called the X-portal. It centers on a specific concept X, then ranks papers by cognitive closeness, which eliminates all the extraneous stuff. I developed it for DOE’s Office of Science, under the SBIR program, but they are not interested in it. In fact they terminated the topic of scientific communication.

But there is probably never going to be an algorithm that finds the n most relevant papers for you, because the metric is too vague to be operationalized. Not until we know a lot more about cognition anyway. (The real boundaries to AI are on the I side.) But a 30% reduction in search time, with better results, is certainly feasible. Things are already enormously better than they were just a decade ago, with a lot more to come.

Sounds interesting! Do you have a URL for that? If I search for it all kinds of stuff comes up…

Bjorn, you may be interested in today’s podcast, where Carol Tenopir talks about the overwhelming amounts of literature and researchers’ increasing time demands.

As for point 1), I’m not sure it’s as “technically feasible” as you propose. Looking at the best of breed recommendation engines, my Google search answers are filled with spam and cruft. Amazon’s recommendations for books I’d like to read are rarely accurate, and when they are, they’re often things that I’ve already read. I don’t use Netflix–are their recommendations any better? Semantic technologies are on the rise, but I’m not sure anything exists at the level you’re seeking, and where they are in use, they seem to be pretty expensive.

This is a reader-centric technology, and would require reader-centric business models in order to justify the likely large level of funding needed to develop it. If your business model is focused on pleasing authors, then there’s little economic motivation to spend to please readers. Services like this are perhaps more likely to come from subscription journals than gold OA journals for this reason (as an example, the common complaint about PLoS ONE is that it’s hard to browse and hard to find relevant articles amidst the flood of what’s published there).

One could see it as a service separate from any particular journal or publisher, and likely, for it to succeed, it would need to cover the entirety of the literature, not just the offerings from one journal or publisher. But it’s unclear if there’s any successful business model that would support the necessary investment and maintenance costs. F1000 is the closest thing I can think of and from what I’m told it has never turned a profit. It’s unclear whether people are willing to subscribe to things like this in an internet era where we expect everything to be free. One might then turn to an advertiser supported model, but then you run into the problems Google is facing, where your real customers are your advertisers, and your efforts turn to pleasing them, rather than your users and the service degrades.

Point 2) is also interesting, but seems more about authoring software, which comes into the picture a ways before the journals get involved. Again, this may be difficult for a publisher to justify investing in, and unclear if there’s a sufficient market still willing to pay for such software to make it a viable business. But data is certainly at the forefront of most publishers’ thinking, so should such tools come into use, I am certain that most journals will rapidly find ways to ensure compatibility.

Excellent catch! Indeed, F1000 is precisely one of the initiatives I’m working with. Their project isn’t RSS-based (it uses the PubMed API, IIRC), but works quite well already. Still needs a lot of work, but pretty decent proof of principle. Another prototype I’m testing is to use RSS (we’ve cloned a popular feed reader) and have plug-ins which can do all sorts of algorithms on the different RSS feeds and groups of feeds, and so on. Very early stage so far, but very promising as anybody can contribute plug-ins and everything will be fully open source. This latter project is funded by a library consortium here in Germany. This project will be a lot more generic than the F1000 tool is now. Both projects, at this stage, look quite nice and make me optimistic.
I don’t know the kind of semantic technology these projects are using, to be honest, I just provide them with the user perspective and test their different versions.

Obviously, these systems won’t completely take away my job of deciding what to read, but if they cut that work down by 30% it would be a huge gain! Right now I spend more time searching than actually reading, this is absurd and has to stop!

I’ll try to have a listen to the podcast, thanks for the pointer.

In general: of course, nothing of what “a scientist wants” has to be provided by publishers, nor would they be expected to offer it. But had they used their profits to make the life of scientists easier in the way I explained, probably not a lot of activism would have ensued: if you’re not frustrated, there’s no fuel for rage.

Publishers didn’t have to nor were they expected to agree on submission standards (text, data and software!) and incentivize a new market of scientific authoring software (which some of them might have even entered).
But they could have.

Publishers didn’t have to nor were they expected to provide standards and access in order to incentivize a whole new market of smart efficient tools to actually use the literature they produce (and some of them might have even entered).
But they could have.

Why didn’t they? I’m sure this isn’t accurate, but from the outside looking in, virtually everything a scientist reads from publishers and folks close to the industry makes it look as if the answer were: “we were too busy milking the cash cow and counting the dollars – and don’t you dare touch my money!” And that’s fuel for rage. If it’s far from the truth, the industry has to ask itself why this is the picture we scientists get.

I feel like we’re moving beyond the age of rage at this point, which is exciting, because we’re moving into a much more practical age of implementation and experimentation, probably a better use of all of our energy than shouting at one another.

I’m not sure it’s fair to characterize all publishers under one set of reasoning, and probably the ones most likely to understand and do the sorts of things you’re seeking are those least likely to be able to afford the risky technology and infrastructure investments necessary (the not-for-profits and academically-owned journals and presses).

But there’s also something to be said for those that are the most successful and profitable. Clearly they are doing something right, providing something the community desires, otherwise they wouldn’t have risen to where they are and wouldn’t be able to maintain their current position. If the Cell/Nature/Science journals are really so horrible, why are they so successful? I’d suggest that scholarly publishing is, in many ways, a reactive business, and the most profitable journals are those that have adapted best to providing exactly what scientists/academia asks them to provide (and charging the maximum that those same scientists are willing/able to pay).

Things may indeed be changing, but academia is extremely conservative and slow to make sweeping changes. It is likely that the same big players are going to be the ones to best adapt over time, given their history of accurately providing what researchers desire, and that they’re starting with a huge advantage of large amounts of capital to invest in experimentation without the potentially devastating results of failure as faced by the smaller, less well-funded enterprises. F1000 is a privately owned, for profit company. Nature’s Digital Science venture capital wing is another leader in developing these new sorts of technologies, as is Elsevier. Despite the rage, the new boss is likely to be much the same as the old boss.

I listened to much of the podcast and it was indeed very relevant! Thanks!

I hear Stewart on the Science podcast every week. Good to hear a familiar voice. Need to have a look at the others.

William D. Garvey (Communication: The Essence of Science, 1979) suggested that teams filtered scientific news and passed on what they found interesting. In his view “major” scientists were largely informally advised by their teams and by other “major” scientists, perhaps by telephone. My thought has always been 1) that this gives “major” scientists time to figure out what frontiers to attack, and 2) that pretty soon you have the makings of a new journal seeking a publisher.

Social tools similar to what you cite are indeed among the other components of the systems we’re currently testing!

Every scientist wanted to probe into previously unexplored areas of research.

Absolutely! Submission systems need to improve their user interfaces and (I believe) all the leading vendors are working to address this in upcoming releases.

It is also worth remembering that submission systems do a lot more than ingest files. I’ve listed some of the more obvious capabilities with respect to authors. (But, this is only the tip of the iceberg when you consider editor, reviewer and production requirements).

• Allow for journal-specific, customizable author instructional text
• Automatically check for plagiarism using CrossCheck
• Automatically link submitted manuscript bibliographies to PubMed and CrossRef
• Automatically format submitted manuscript bibliographies to journal reference style
• Contact co-authors to make sure they really are contributors
• Accommodate different submission workflows for different article types
• Automatically check image quality
• Normalize metadata to journal required standard (e.g. Title of X words)
• Toggle the author interface to: Chinese, Japanese, Portuguese, French, German…etc.
• Interface with the ORCID system to validate author IDs
• Resume interrupted submissions
• Selectively display manuscript status to author(s) based on manuscript policy
• Support submission of manuscript revisions with different workflow requirements and parameters
• Capture disclosure and other ethical information from author and co-authors with nested questionnaires
• Assign structured keywords to help with downstream reviewer selection
• Allow for suggestion of reviewers according to journal policy
• Flag duplicate submissions
• Allow linking/grouping of suggestions (e.g. letters to the editor)
• Email communications with authors using merge fields to populate manuscript metadata in email body
• Identify files not to be seen by reviewers
• Support easy entry of non-Roman and diacritic characters
• Support a wide range of submission formats including LaTeX, videos, data-sets, URLs, arXiv, etc.
• Support ZIP file upload (compressed and uncompressed following upload)
• Automatically direct manuscripts to appropriate editors
• Provide a “mailbox” for manuscript-related author(s) correspondence
• Display reviewer comments and decision letters to author(s) based on journal policy (blind, double blind, open, etc.)
• Be available 24X365
• Offer author automated manuscript transfer capabilities to other journals following rejection
• Apply updates without significant service outage
• Allow author rebuttal workflow
• Support submission of previously rejected manuscript
• Support invited submissions
• Allow on-the-fly re-configuration by journal offices to implement policy changes

I wouldn’t worry too much about getting Freshly Pressed – you get 1,500 views from it at most, all in one day, and about 80 spam comments. (I’m still aiming for the hat trick, though… the staff themselves make you feel special, even if the traffic doesn’t!)

But as a scientist, I am definitely more interested in open source publishing in the abstract – which is to say that I’m passionate about it, that I work in it, that I want to promote it and that I want it to spread – but when it comes down to being a very early-career, very young researcher trying to get her name out, I don’t have the capability to pay extra for open-access. And whenever possible, I want my papers to appeal to the gatekeepers – I don’t have the luxury of bumming around sending papers to charity journals during my career development years when I need to be aiming as high as possible.

In my mind, it maps directly to the fair trade food movement, and other opt-in consumer choices. As a moderately low-income young person, I often do not have the ability to purchase expensive fairtrade food. When I’m watching my food budget as closely as I do, I simply cannot vote with my wallet, even if I really appreciate the principles of fairtrade food. However, I’m a big advocate for fairtrade food, and feel that it is more responsible and ethical to purchase it.

If you look at scientists as consumers, it’s apparent that many of those who are interested in the idea of open-source access are those without the resources to support it. Of course, this is in my own experience and hasn’t been rigorously analyzed.
