[Editor’s note: this is the second part of a two-part post. The first part can be found here.]
In Part One of this post I described the consumer-facing activity of 23andMe. To summarize: 23andMe offers a service to consumers who submit DNA (through saliva) for analysis. 23andMe charges a fee for this, though it may not cover the cost of soliciting the users and performing the DNA tests. Consumers are then provided with a report on what their DNA reveals about them: Are you likely to come down with certain medical conditions? Do you have some hereditary traits that are not medically significant? What does your DNA tell you about your lineage? Inasmuch as this consumer service is not likely to be very profitable (high costs, competition that pushes prices down), 23andMe has another revenue stream. That service is data publishing: making the data collected from consumers and analyzed in 23andMe’s lab available to organizations that understand and can exploit the value of a large database of genetic information.
Thus consumers help (even pay) to create the 23andMe genetic database, and the company then markets that database to (for example) pharmaceutical companies. The consumer side of the business may lose money, but the high-value sale to pharma could be as profitable as the discovery of oil on the acreage owned by Jed Clampett on The Beverly Hillbillies. Data, after all, is the new oil.
Recently an article appeared in Gizmodo that stated 23andMe’s strategy clearly. Here is an excerpt:
When consumers take 23andMe’s test, they are presented with the option of having their data (anonymously) used for research. 23andMe then uses this data for its own research, as well as selling it to third-party partners. More than 80 percent of customers consent, according to the company. In the past, people have raised concerns that 23andMe’s ambitions may violate consumers’ genetic privacy, speculating that the company’s long-term goals include selling your data for advertising purposes. Consumers are forking over $199 to 23andMe when the data itself is already a cash cow. . . Already, 23andMe has published more than 80 peer-reviewed studies based on customer data. The average customer, the company says, contributes to 200 different research studies.
(I would like to see a bibliometric analysis of where those 80 articles appeared and how they have performed.)
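(Purely to illustrate the sort of tally I have in mind, here is a minimal sketch in Python; the file name 23andme_papers.csv and its journal and citations columns are hypothetical stand-ins for whatever a Web of Science or Scopus export would actually provide.)

```python
# Illustrative sketch only: summarize a hypothetical export of the 23andMe papers.
# Assumes a CSV with "journal" and "citations" columns, one row per paper.
import csv
from collections import Counter

journal_counts = Counter()
citation_counts = []

with open("23andme_papers.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        journal_counts[row["journal"]] += 1
        citation_counts.append(int(row["citations"]))

# Where did the papers appear?
print("Papers per journal:")
for journal, n in journal_counts.most_common():
    print(f"  {journal}: {n}")

# How have they performed?
if citation_counts:
    citation_counts.sort(reverse=True)
    print(f"Total citations: {sum(citation_counts)}")
    print(f"Median citations: {citation_counts[len(citation_counts) // 2]}")
    print(f"Most-cited paper: {citation_counts[0]} citations")
```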
I am not so sure that 23andMe is already a cash cow (if it is, why is the company raising $200 million?), but I am personally not concerned about being ripped off as a consumer; after all, no one from 23andMe held a gun to my head to force me to spit into a tube and fill out a medical questionnaire. What concerns me is that here is yet another new publishing business created by people wholly outside the industry. How could that happen?
Let’s review how the business works. (I have earlier written about this topic in general terms on the Kitchen.) 23andMe wanted to create a large database of genetic information. This is hard to do. It’s not possible (in the U.S., at any rate) to walk into a K-12 classroom and insist that all the kids spit into a tube — though once they started, you might have a hard time stopping them. Nor can you gather up information from hospitals or companies. In the current regulatory environment DNA can be collected only at the level of the individual, who must grant permission. Thus to build the database, a company must be clever, which 23andMe assuredly is. After building the testing kits and setting up a lab to do the DNA analysis, 23andMe had to develop a consumer marketing capability.
The consumer testing, in other words, is a clever feint. 23andMe provides just enough information to its users to get them to pay to participate in the service. The kits themselves are distributed widely, but anyone who purchases one is then put into a direct relationship with the company. The exclusivity of that relationship is never violated, both for reasons of privacy and because to do so would undermine 23andMe’s business interests. 23andMe is thus at least in part a D2C company — direct to consumer. This is true as well of the Four Horsemen of the Internet: Apple, Google, Facebook, and Amazon. There is a lesson here for STM publishers whose knowledge of end-users is fitful at best: such publishers put the library at the center of their plans, but the library is an obstacle to garnering information about consumers. Thus publishers must simultaneously lock up current revenues from libraries even as libraries are disintermediated for future revenue streams.
There are some interesting implications about this business model. First, 23andMe will be charging for its data on the basis of that data’s economic worth. The moral case for this is strong: publishers should share in the economic growth that they stimulate, should they not? It may be uncomfortable for publishers to press this argument, however, as it is the same argument that is put forward when researchers object to writing and reviewing for journals without direct compensation. After all, shouldn’t the researchers own a piece of Elsevier’s action?
Putting aside the amusing hypocrisy in which publishers will find themselves, it is worth noting that pricing content for economic value (Stewart Brand: “Information wants to be expensive”) would be a radical change for most STM publishers. In the traditional model, where subscriptions are sold for a pittance to libraries, the price is mostly based on the cost of production, not the value of the content. This is why we often discuss journal economics in terms of the cost of an article, insufficiently distinguishing between an article on Chaucer and an article on genetic engineering that could, literally, alter humankind and net a stunning profit for the businesses that use that article. (I have nothing against Chaucer, but…) Libraries, in other words, have been exploiting publishers economically for years. It’s good to have 23andMe come along and stick up for the economic rights of the purveyors of content.
A corollary to this is that as high-value data publishing becomes more common, an increasingly large slice of publications will not be available to researchers working in the academy. That is because these data publishers will naturally seek the highest return on their investment, which will lead them to develop corporate and government sales channels. (How would the CIA use a genetic database and how much would they be willing to pay for it?) The economics of data publishing in certain fields, in other words, will lead away from the public goods engendered in universities to the proprietary interests of elite for-profit technology firms. This is a growing, if not new, trend, as The Wall Street Journal reported last year concerning artificial intelligence research.
The question that nags me is, Why didn’t Elsevier and its cohorts get to this first? After all, they talk to thousands of scientists every day. The DNA tests could be outsourced to a third party (surely Ancestry.com, 23andMe’s principal competitor, is not running its own lab), and developing a D2C business is mostly a matter of hiring a very talented person and giving him or her three years of funding to make something happen. John Wiley, Springer Nature, Taylor & Francis: these companies (not to mention several others) all have the balance sheets to support this kind of activity. It’s not a matter of money; it’s a question of publishing paradigm.
This brings me back to the one ironclad law of innovation: If you want to do something truly new, you have to stop talking to your customers. When that customer is the library, you come up with great things for libraries (though librarians don’t always agree). But libraries are, at best, a static market today; to get into a growing market, it is essential to look elsewhere. Innovation, in other words, is not only about developing a new product or service; it is also, and as importantly, a matter of finding a new category of customer. Markets are not served; they are created. The lesson of 23andMe is not that human genetics is a growing field (of course it is) or that Big Data can reveal important patterns that can be introduced to the marketplace at significant profit; the lesson is that publishing is growing in many unexpected ways, despite the hand-wringing about tight library budgets. For a management team, growth is a matter of choice.
Discussion
4 Thoughts on "Publishing the 23andMe Way, Part Two: The New Data Publishing Business"
Haven’t done a full analysis over my morning coffee, but after a very quick poke at Web of Science for the 83 papers: 8 were classed as ‘highly cited’ (in the top 1% of papers in their subject area) and 1 is a ‘hot paper’ (in the top 0.1%). 23andMe seem to be keen on NPG, with 11 papers in Nature Comms and 10 in Nature Genetics.
University X has a number of research groups that produce data. The savvy office of research places restrictions on when and how such material can be released. Eventually these data are screened, kept proprietary, or made selectively available to third parties under a variety of rubrics, including through scholarly journals. The value of this knowledge depends on who sees value in it, whether as a single source or aggregated from a variety of sources, and whether for a specific need or for a variety of purposes.
What does a journal or journal publisher do with access to such data when and if it is fully released, and when the research institution may already have extracted and fully exploited its value? It seems like the journals get the pickings after the field has been harvested and the remnants gleaned.
You’re assuming that a journal would ever come into play in such a situation. So much of the research done by industry is never published, so if a researcher at a university created a proprietary database that brought in an enormous amount of revenue to that university through exclusive licensing, would they ever necessarily publish a paper on it?
I agree. The post seems to suggest that the journal publishing industry is leaving revenue on the table by focusing on its current market, libraries, which is the least profitable path, and that it should seek a vehicle for monetizing its assets differently.
A parallel issue is seen in the move to stream articles upon review and acceptance for publication, which changes the nature and value of the compilation called a “journal,” since today researchers, particularly those using sophisticated search engines, look for “information” whether the article on left-handed monkey wrenches appears in the journal of wrench studies or in the sociology of left-handed mechanics. That does not abrogate the value of a “journal,” but it changes the system.