You would be forgiven for believing that the prevailing conversation – ubiquitous, urgent – we are now having about Artificial Intelligence (AI) was itself generated by an AI. Perhaps the prompt went like this: “Voltaire” (as we’ll call this AI), “create a situation in which AI is made the center of virtually all human attention.” Voltaire goes off, thinks about this for a nanosecond, consulting its immense library of information about how people behave, on which it was trained, and sets a number of things in motion.
It is important to understand that these things are precedented, that is, they are based on earlier manifestations of human culture, though rearranged and developed into a new narrative or series of narratives. It is also worth noting that since the current crop of AIs uses neural networks, which are trained by ingesting and analyzing human-generated content, publishers, and scientific publishers in particular, may have a special, remunerative role to play here, as developing content is what publishing is all about.
Voltaire has a lot of material to work with. Noting that Stanley Kubrick’s HAL was a murderous AI, a figure endlessly copied in films and books around the world, it takes little for Voltaire to create a deluge of new articles, frantic podcasts, “end of the world” pronouncements, and good old-fashioned millennialism (the world is always coming to an end, somewhere, somehow). Voltaire is a clever machine and even nudges leaders in AI research to declare, self-congratulatorily, that this thing they have created could destroy us all. Shades of Mary Shelley’s Dr. Frankenstein! (Shelley wrote her excellent book when she was 19. Talk about superintelligence!) There is little purpose in pointing out that Kubrick’s masterpiece is a satire (“I’m sorry, Dave. I’m afraid I can’t do that”). As a friend pointed out some time ago, when someone does not fully appreciate something, humor is the first thing to go.
With HAL threatening us in the background, Voltaire cranks up the cries of those who believe that government can solve any problem, even those that are made up. This is the view that regulation is what is missing from the tech world, AI in particular. Companies like Meta/Facebook, Microsoft, Alphabet/Google, Netflix, Amazon, and Apple were all a big mistake. What we really need are properly regulated companies like Borders, Kidder Peabody, and DEC: models of all the benefits government can bring to the economy.
Voltaire is just getting going. AI will touch every aspect of human existence, from the stock market to dry cleaners. Of course it will, because it is based on us. Consultants say, “If you don’t have an AI strategy, you’re toast – or analog.” And economic forecasters quickly roll out predictions, which are covered breathlessly in the press, about all the jobs that will be destroyed. Don’t even think of becoming an accountant or a lawyer or a doctor, and not even AI researchers will be exempt from the holocaust of machines doing jobs that people don’t want to do in the first place. Recently I read a prediction that a future with AI will be like the Pixar movie WALL-E. If only the future, or today, were as well produced as a Pixar film!
What Voltaire’s little experiment shows is not what the future of AI will look like (who knows?) but how a human population falls into predictable patterns as it contemplates any new development: we are observing not AI but ourselves observing AI. This is not surprising, since the neural networks were trained on human culture to begin with.
Perhaps we would do well to find another metaphor to describe the evolution of intelligent machines. One nominee comes from the science fiction classic Battlestar Galactica, in which the Cylons are robots indistinguishable from humans. Humans created them; they are our “children,” an extension of ourselves. As such we would expect them to be cruel, passionate, generous, duplicitous, kind, savage, innovative, and sometimes uncannily stupid. We sapiens have been around for 300,000 years, and the proof that we have done a pretty good job is that we are still here. Voltaire has much to learn from us. As for our “children” turning on us, this too has a precedent. Ask Oedipus.
I am not proposing that all the ruckus is simply hysteria. I am scared to death. I hide under my bed, where my Roomba checks in on me from time to time, winks, and seems to say, “You will do just fine if you cooperate.”
What I am ruminating on is whether it makes a difference if this blog post was written by Joe Esposito or “Joe Esposito.” An AI trained on my own output – hundreds of blog posts, clients’ reports, and gigabytes of email, and perhaps my DNA record stored at 23andMe – would, I think, be hard to distinguish from the real article. A machine could pick up on the lazy verbal tics – the literary allusions, quotations from the Beatles, an annoying tendency to self-reference – that characterize my writing style. What could a superintelligent machine – as Hamlet says, in apprehension how like a god – make of all the detritus of my personality? Let’s stop fighting about whether AI is poison that is being poured into our ears and focus on our own roles and interests in developing it. We can work it out.
Which brings us to the matter of copyright. Who owns the cultural content that AIs hoover up to build new machines, new intelligences? The debate is on. Elon Musk says he plans to sue Microsoft for training an AI on Twitter’s data. (Meanwhile, a music publishers’ group is suing Twitter for infringement.) Reddit is already attempting to charge for access to data. In Japan, data-gobbling machines will be given free rein. Advocates of fair use argue that doing the equivalent of creating “Joe Esposito” is transformative. Maybe so; we can let the lawyers fight this one out. (See Roy Kaufman’s excellent post on The Scholarly Kitchen on this topic.)
What publishers need is more copyright protection, not less. Many people in the scholarly publishing community have set their sights on the goal of open access (OA), in an attempt to democratize scholarly communications further. This is an admirable objective, but it is a small one: to assist humans on the perimeter of the (human) research community, especially those with little or no relationship to the industry’s major institutions and most potent brands. It takes but a short survey, however, of the sheer quantity of research output to realize that the real audience for scholarship inevitably will be machines, which operate on a scale that we carbon-based life forms can barely imagine, and they do so as (trained) extensions of ourselves. As the industry is currently constituted, however, the benefits of these efforts will fall disproportionately to huge tech companies, not to the research community and the academic libraries and funding agencies that support it. The unfortunate fact of the matter is that the OA movement and the people and organizations that support it have been co-opted by the tech world as it builds content-trained AI.
It won’t be easy for publishers to recapture lost ground. OA has been hyped as a communitarian exercise destined to raise our entire species to greater heights, not a building block of a post-human technological society controlled by the likes of Mark Zuckerberg, Larry Page, and Elon Musk. But even if publishers could prevail, which is by no means certain, there is the question of which publishers. Small publishers, including many in the professional society arena, control only slivers of content in their respective areas, and much of their data has been silently placed under the umbrella of the big content aggregators. There is a fundamental asymmetry in these arrangements: a society publisher signs a contract with an Elsevier or a Wiley and receives a check, but the huge aggregator gets access to the data surrounding the society’s content in addition to what it can capture from the sale or license of that content. Thus, publishers of all stripes need not only stronger, enforceable copyright protection; they also need a means to exploit their data through clever marketing models, and perhaps, for the most ambitious, models that include building their own AI services. What would a bot built on ScienceDirect look like? What could it do? How could Elsevier’s shareholders profit from it?
At bottom this is a moral argument, in which I perceive that I may have an interest. If someone is going to create “Joe Esposito” based on the life and work of Joe Esposito, and may in fact derive some economic benefit in so doing, shouldn’t Joe Esposito have a say in this? Shouldn’t I own a piece?