I’ve never understood the rationale behind Pubget.

I accept the premise that more scientific articles are published each year and that scientists are far too busy to read even a fraction of them. What I just don’t understand is how bulk downloading of article PDFs is a solution to the problem of overabundance.  The time spent browsing and evaluating the merit of an article far outstrips the time it takes to download the file.

In last year’s coverage of Pubget, I questioned whether their business model, “based on selling advertisements to lab equipment and pharmaceutical companies,” was viable.  I also questioned whether this model was parasitic on the very same companies that produced the content in the first place and whether publishers would participate in such a venture.  I was doubtful that this company would last the year.

Yet they are still in business, still growing, and have diversified their services.  They are no longer on a beta test site, but a fully-operating service.  It’s clear that I understand little about the information industry.

According to Ramy Arnaout, over 200 libraries have registered their IP ranges with Pubget, permitting access to subscription-only articles through their service. And last summer, Pubget partnered with Seed Media to provide free articles to science bloggers at ResearchBlogging.org.

Like many online ventures banking on revenue from advertising, Pubget has been unable to attract many paying customers. Many of the ads in Pubget point to other Pubget services. Perhaps the drought of advertising dollars has pushed Pubget into rolling out new services (Pubget Premium Services) aimed at bringing in other sources of revenue. One such service, PaperStats, helps librarians aggregate and analyze publisher usage data. (It’s somewhat ironic that a company that built its search and delivery system around a bulk downloading application has gotten into the cost-per-use game.)

Price markup

Pubget has also attempted to generate service fees by selling journal PDFs to non-subscribers.  The price markup of these articles is significant.  For example:

  • Cell, Current Biology and The Lancet ($54.50 from Pubget; $31.50 from the publisher)
  • JAMA ($35 from Pubget; $30 for 24hr access from the publisher)
  • Nature ($47 from Pubget; $32 from the publisher)
  • New England Journal of Medicine ($33 from Pubget; $15 from the publisher)
  • Science ($35 from Pubget; $15 from the publisher)
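To put these differences in relative terms, here is a rough sketch in Python that computes each markup as a percentage of the publisher's price. The prices are the ones listed above; the function and variable names are my own, and the JAMA comparison is not quite like-for-like, since the publisher's $30 buys only 24-hour access.

```python
# Prices in USD as listed above: (Pubget price, publisher price).
# Note: the JAMA publisher price is for 24-hour access only.
prices = {
    "Cell / Current Biology / The Lancet": (54.50, 31.50),
    "JAMA": (35.00, 30.00),
    "Nature": (47.00, 32.00),
    "New England Journal of Medicine": (33.00, 15.00),
    "Science": (35.00, 15.00),
}


def markup_percent(pubget_price, publisher_price):
    """Return the Pubget premium as a percentage of the publisher's price."""
    return (pubget_price - publisher_price) / publisher_price * 100


# Markup per journal, illustrative only.
markups = {journal: markup_percent(*pair) for journal, pair in prices.items()}

# Print from the largest markup to the smallest.
for journal, pct in sorted(markups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{journal}: {pct:.0f}% markup")
```

By this measure the markup runs from roughly 17% for JAMA to about 133% for Science, with the New England Journal of Medicine at 120%.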

Arnaout explained that the difference in pricing is partly based on copyright clearance and partly on what the publisher sets as royalties. Unlike DeepDyve, Pubget offers no way to preview an article before purchase. A Pubget user has little to go on beyond the abstract, if available.

Selling free articles

While Pubget doesn’t attempt to sell content from open access journals (articles from PLoS journals are still free), it does attempt to sell freely accessible articles in subscription-based journals.

For example, the August 4th issue of JAMA leads with two articles (an editorial and an original paper), both of which are free from the website but require purchasing ($35 each) from Pubget.   Similarly, a free editorial in the August issue of The FASEB Journal goes for $28 on Pubget.  Many of the author-pays open access articles in PNAS are free in Pubget, but not all.  And, the one free “Featured Article” in the July 27th issue of Current Biology costs $54.50.

Paying for free content: JAMA vs. Pubget

I asked Arnaout about these inconsistencies.  He explained that Pubget is transitioning to a new system and to expect a few “hiccups” from time to time.  They have no intention to charge for free articles.

Advanced search

Also new to Pubget since my last review is an advanced search engine that allows, in theory, more refined searching than a relevance-based keyword search on the article’s metadata. I say “in theory” because I couldn’t get it to work. Specific journal searches yielded zero results. Limiting a search by volume, page number, or year also failed. Unlike PubMed (the free source of much of Pubget’s metadata), there was no way to build a search or revise one after running it. It might have been more useful to start with PubMed and then add some functionality, which is what Pubget has done with PaperPlane.


Pubget has developed a browser plug-in called PaperPlane, which adds the core function of Pubget (bulk downloading of PDFs) directly onto PubMed. Personally, I find this solution much more appealing since it adds functionality to a robust and fully functioning resource. PubMed, however, links directly to the full-text (HTML) article on the publisher’s website, which is the version that provides the most functionality for readers. The price of using PaperPlane is giving up the extended functionality of the full-text article for the flat, archival PDF version.

In sum, Pubget still seems like a solution based on a simple, old technology of file transfer (the FTP “Get” command) in desperate search of a problem, or aimed at a problem that doesn’t exist. The diversification of this company over the last year may signal a fledgling Internet startup still in search of a viable business model.

Phil Davis

Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist. https://phil-davis.com/


2 Thoughts on "Pubget Continues to Puzzle and Diversify"

Phil– I agree that the reduced functionality of the pubget search interface and batch downloadability combined with cost per use are significant concerns.

I see pubget’s major contribution coming from its example of a significantly improved discovery interface and openURL implementation. For those who effectively use the advanced features of PubMed, pubget has significant drawbacks. But for the many whose cognitive process of searching is constantly interrupted by pursuit of variously accessible pdfs, its seamless side-by-side presentation of these oft-too-hidden gems should result in better search outcomes.

Instead of lamenting that pubget delivers flat PDFs, I’d like to see the publishing industry invest in making PDFs more functional as link sources rather than the dead ends many of us perceive them to be.

Ultimately, I think the pubget interface provides an ideal example of what a more functional far-reaching discovery interface should look like.
