The acquisition of Meta by the Chan Zuckerberg Initiative (CZI) promises to transform scientific investigation. As a byproduct, it will likely transform scientific publishing as well. I am on the Board of Directors of Meta, and I must say that the path to the company’s new ownership did not entirely come as a surprise. People have been sniffing around the company for the past two years as interest in data and data science has exploded. By working with the scientific literature, Meta has found a new way to expedite research and to bring added value to the underlying publications.
For those unfamiliar with CZI, please take a look at its Facebook page. The “Z” of CZI is, of course, Mark Zuckerberg, which means that some of the technological chops and capital of one of the world’s most successful companies are now being brought to bear on the tiny little world we know as STM publishing. As far as I can tell, we attendees of PSP, SSP, etc. haven’t inspired that level of interest before. Microsoft has been working on academic search, but its efforts have yet to be brought fully into the marketplace. Google Scholar, of course, has transformed discovery in scholarly communications, but it still seems like a half-hearted attempt on Google’s part. What would it mean, for example, if Google put the same level of resources into Google Scholar as it does into Gmail or YouTube? Apple is the world’s greatest consumer products company, but its penetration of STM, such as it is, is merely a byproduct of its focus on the fashionable. Amazon has altered academic books forever, but is largely absent from the world of journals. No, this step by CZI represents something different. It’s an odd feeling to wake up one morning to find that STM publishing will now be measured against a new scale.
A clue to Meta’s appeal to publishers came about when Sam Molyneux, the cofounder and CEO (he founded the company with his sister Amy, an IT professional), met with various publishers over the past three years. The reasons for the meetings were twofold: first, to license access to the publishers’ scientific content for Meta’s text- and data-mining; and second, to present Meta’s tools for publishers. Some of the meetings had a humorous edge. When Sam claimed that Meta could analyze an article and predict how many citations it would garner in three years (with the stipulation that a human editor had to review the article as well to determine if it was in fact good science — see the white paper on this), one member of the publishing audience broke into a laugh. But when the test was successfully completed, with results superior to Sam’s original claims, that publisher agreed to negotiate a license. It’s one thing to acknowledge in the abstract that the robots are coming, quite another thing to understand that they are coming after you.
The meetings with some of the larger publishers took a curious course. The first meeting was typically with one individual charged with licensing data. But Sam would get called back, and in the second meeting there would be some higher-ups as well. Then there would be yet another meeting, with someone more senior yet. “What is going on here?” Sam called me to ask. It was apparent that Meta had caught the publishing organization’s attention and that as the level of interest rose (as measured by moving up through the company organization chart), the possibility of an acquisition discussion rose, too.
The escalation of interest revealed something else as well. Although everyone in the STM world is talking about data, data, data, the fact is that most manipulations of data within publishing companies are still at a very early stage. This is another instance of the truism that what is inside a publication is usually a lot smarter than the business processes surrounding it. What Meta was bumping into was the growing awareness among publishers that data science represents a next stage for the evolution of STM publishing. It is an open question how publishers will pursue this and what resources must be brought to bear on the opportunity, which in turn raises the question of which companies can pursue it, since not all publishers have the same resources.
A next stage is not a competing stage, however, which brings us to one of Meta’s distinguishing characteristics: Unlike almost all technology companies working in or near the publishing business, Meta did not set out to disrupt publishers, nor is it interested in competing with what publishers do. This is not Academia.edu or ResearchGate, nor, god help us, Sci-Hub. A critical decision made early in the company’s history was never to display the text of a scientific paper, which meant that Meta was not and is not seeking to substitute for publishers’ products. Meta adds value to publishers’ material through its algorithms that find patterns in “Big Data” and can make predictions based on those patterns. I submit that this is a test of innovation in publishing and for the many upstarts that seek to “disrupt” the established publishers: if the alleged “innovator” is still in the business of selling content, the degree of innovation is very small.
In its new incarnation Meta is still very much a part of the STM community, but it has nudged closer to the researchers themselves, who will look to Meta as a new method of discovery and an accelerant of the research process itself. And to my friends on the publishing side who are already wrestling with the sheer pace of digital change, I say, fasten your seat belts. Things have just gotten more interesting, and faster.