Google has been changing a great deal over the past few years — creating a parent company (Alphabet), changing its motto from “Don’t be evil” to “Do the right thing” (apparently without consulting Spike Lee), and carving off Waymo as its autonomous vehicle company. With revenues of more than $100 billion in 2017 and a valuation of more than $700 billion, Google has money to play with, a lot of talented employees, and tons of information about the world.
Google has also established the Google Brain Team, their dedicated AI group with the slogan, “Make machines intelligent. Improve people’s lives.” (Are they promising that? Or are they telling me to do that? I’m happy to do the latter, but not the former. Or perhaps AI [and Google’s marketing team] has a long way to go in the subtleties of rhetoric and punctuation.)
An initiative emanating out of Google Brain with the participation of YC Research, DeepMind, and OpenAI is the publication of what appears to be a “house” journal called Distill, which is ostensibly published by the Distill Working Group. Where this working group resides isn’t clear.
Distill is positioned as a research journal, claims to be peer-reviewed, and is indexed in Google Scholar. Their archival strategy consists of backups and the Internet Archive, but they claim they are thinking about adding LOCKSS to their archiving approach. They also use the fact that it has an ISSN to claim that it is registered with the Library of Congress, as if this grants some authority to their work rather than simply denoting a clerical step thousands of journals and magazines routinely complete. A fuller description of Distill reads:
Distill is an academic journal in the area of Machine Learning. The distinguishing trait of a Distill article is outstanding communication and a dedication to human understanding. Distill articles often, but not always, use interactive media.
One good test for whether your article is a fit for Distill is whether your collaborators and you are willing to put in whatever time is necessary to write and illustrate an outstanding article. In our experience, this often takes 100+ hours. Typically, we expect this to mean that there is at least one collaborator who is very enthusiastic about explaining things well.
Distill denotes issues by year and month, so its first issue published in September 2016 is Volume 1, Issue 9. Other issues follow this pattern of relative year followed by actual month of publication. It’s odd that the first paper is dated September 2016, while Distill was “announced” via a blog post on the Google AI Blog in March 2017. Michael Nielsen of Y Combinator, who serves as an advisor to Distill, announced it on the Y Combinator blog on the same day, as did OpenAI via their blog.
So far, there have been seven papers published, two commentaries, and one visualization. Four of these were published in the final three months of 2016, five more in all of 2017, and only one paper so far in 2018. Momentum is not growing for Distill.
Most of the 10 articles were written by Google employees. Of the 26 authors who have published, 21 were Google employees (16 from Google Brain, three from Google Cloud, and two from other parts of Google). Only five authors were non-Google employees, with one of these (Michael Nielsen of Y Combinator, which is the parent of YC Research, a participating organization behind Distill) serving on the Distill advisory board. If this strikes you as incestuous, it gets better.
Editors of the journal account for 12 of the 26 author credits in Distill (six articles were co-authored by Chris Olah, five by Shan Carter, and one by Arvind Satyanarayan), and all three work at Google Brain as well. Arvind Satyanarayan became an editor at Distill in May 2018, two months after his paper was published there. He has been at Google Brain since 2017.
Yet, in a story about the launch, Olah and Carter claim “Distill is an independent organization,” a claim that seems to lack basis, as there appears to be no record of it or the Distill Working Group as an independent entity with its own taxpayer ID or similar filings of establishment. The name “Distill Working Group” itself suggests it exists within some other entity, with the most likely candidate being Google. Despite requests by others online for explanations and transparency, there is still no clear accounting of the exact relationship of Google to Distill and the Distill Working Group.
Distill has some interesting aspects to it, most notably that review is done via GitHub issues, and then the repository is published when the paper is published. Given the recent announcement that Microsoft will acquire GitHub, it’s possible this might change. Regardless, it’s an intriguing way to conduct peer review. They appear to be revising their approach to peer review, posting a form online and tweeting an invitation to evaluate it. After three weeks, very few comments have been made.
In addition to providing a publication venue, Distill also has an annual awards system, where the authors of the best papers receive prize money based on evaluations from the advisory board. However, with nearly 50% of the authors serving as editors, it’s not clear how this could work in actuality. The initial prize was supposed to be awarded out of a prize endowment of $125,000 as of January 1, 2018, but no announcement has been made about any prize selection or award.
The shadow of Google over Distill can’t be missed, especially with the editors and most authors working at Google-related companies. The domain is owned by Google, as well. One author even referred to it recently as “Google Distill” in a laudatory blog post.
The exact motivation for creating this journal remains unclear. There is no shortage of journals dealing with machine learning and AI, published by organizations including IEEE, Elsevier, Wiley, SpringerNature, MIT Press, and ACM. In order to find out, I emailed the editors and an advisory board member, but none responded, another possible sign that they’ve lost their enthusiasm for the project. It’s tempting to write off Distill as a corporate vanity journal, despite some interesting elements.
But the corporation in this case matters, and that might explain why, despite plenty of involvement and circumstantial evidence, Google is making no claim for Distill. To embrace the journal and acknowledge its parentage could make Google a media company rather than a platform, opening them up to the kinds of liabilities publishers typically deal with. Google has been accused of being a publisher before, and has sought to dismiss this perception because it could be costly to them in a number of ways: liabilities, requirements, and oversight being just the first three that come to mind. We should expect Google, Facebook, Twitter, and others to continue their efforts to avoid being labeled media companies, a position that NYU professor Scott Galloway called out in an interview with Kara Swisher via her excellent Recode Decode podcast:
“We’re a platform, not a media company.” No you’re not. You run content, you run advertising against it. Boom, congratulations, you’re a media company. You have some onus of the wonderful things that come along with being a media company, including 90 percent gross margins, influence of unbelievable magnitude, but there is a level of responsibility and, wow, have they let us down.
Is Distill an academic research journal? In certain respects, yes. Are there conflicts of interest in its authorship, editorship, and ownership elements? It looks like it, which means the answer is yes. Is Google now a media company? Many would assert, myself included, that it has been for a long time and is skating by on a technicality perpetuated by out-of-touch regulators and legislators in the US.
It makes sense that a journal like this might emerge where there is a critical mass of AI professionals at work. However, to do it traditionally and with more legitimacy, Distill would be well-served by extracting itself from the company where its editors work, which would allow this potentially interesting journal to publish in a more legitimate and wide-ranging manner. It would also make it look less like a pet project and more like something worth our attention. Adding an editorial board, a management layer, and more of the things publishers do would only help if they are serious about having a true academic journal.
Of course, it should not be lost on us that the traditional mechanisms of publishing — indexing in ISI and MEDLINE, registration with the Library of Congress for an ISSN, and so forth — all attain a whiff of obsolescence when Google is doing the publishing. Just indexing a Google journal in Google Scholar might be enough for all practical purposes. With a market position like theirs, why do they need any of our infrastructure? In the photo accompanying this post, when it comes to infrastructure and publishing power, are we the shoes or are we the legs?
On a deeper level than infrastructure and the mechanics of publishing, however, this also demonstrates that valid publishing goes beyond infrastructure or market power. Publishing is not a button. It requires independence from the source for a disinterested evaluation. It requires comparable expertise, and systems for reaching and managing related expertise, so that valid editorial and peer review can occur. Publishing is not just about bits and bytes, indexing and discovery. In a world of vanity publishing, fake news, cyberwarfare, and propaganda, the more subtle and intangible values encapsulated within a strong and independent publishing system are perhaps more important than ever.
(HT to DS for suggesting this as a topic for a post, and for help getting started.)