I’ve been playing with a physical user interface for a cloud-based artificial intelligence (AI). The AI in question is Alexa, and she’s quite an interesting character whose behavior gives one much to ponder. For those not familiar with such things, Amazon’s Echo is (like Google’s Home) a small cylindrical tube containing a small PC, a speaker, a slew of microphones and some LEDs; given an internet connection, it links to Alexa, a cloud-based AI that aims to supply answers and carry out various functions in response to spoken commands. You can find reviews of these things on any reputable tech site (I recommend Ars Technica); this isn’t a review so much as some open thoughts on what importance these things might have in a scholarly publishing context. She’s very easy to set up. She? Well, the UK version comes with just the one voice: a very pleasant, soft Received Pronunciation accent. Now on the other side of the pond, you have had Alexa for a while, as well as Google’s Home device — but over here, Alexa is still pretty new (more on some early consequences of that later) and Google Home is just about to launch.

[Image: Alexa ‘smiling’. Caption: This is Alexa, ready and waiting to act on your commands.]

So what can she do? All sorts. There’s a slew of what Amazon calls ‘skills’ that you can give Alexa — things that enhance her capabilities. I have The Guardian skill, so I can ask Alexa for a rundown of the headlines and then have her read out anything I’m interested in. I also have the Radioplayer skill (I think this is UK only), which lets me listen to any BBC or commercial radio station (including the local ones) simply by asking her to play the station by name. I’ve hooked up my Spotify account to allow for more musical opportunities; again, she is remarkably good at picking out the names of bands or albums and then filling the air with the appropriate sounds.

Oh yes, I can also turn on/off the lights in my shed with my voice, from inside the house. This is a level of awesome way beyond those few words. You think nothing of it? Well you’ll change your mind when you stagger out to the shed loaded up with power tools and whatnot, and it’s dark and raining and out there somewhere between you and the shed is something awful that the dog has left on the ground…

“Alexa, turn on the shed lights (please)”

“Ok” (And then there was light, and it saved my shoes from that nasty mess the dog left).

Yes, Alexa can act as a smart hub for Internet of Things (IoT) devices; in this case, a ‘smart Wi-Fi plug’. It took 15 minutes to set up, and easily two-thirds of that was figuring out how to connect the plug to the home network. I’ve already bought two more plugs.

Alexa is simply brimming with facts. To check this out, I introduced a smallish boy to the test environment and watched what happened next. Turns out my boy has many questions. Questions he (correctly) doesn’t think his Mum or Dad can answer. So he was most impressed when Alexa gave him the distance between Earth and Pluto in both kilometers and miles, and even more impressed when she told him the distance, in miles and light years, from Earth to the center of the Milky Way. He then asked her to tell him a joke.

Fifteen minutes later… she finally repeated herself.

My son pretty much immediately anthropomorphized the device. You’ll notice that I’ve done similar in writing about it. The voice conversation is so natural and remarkably fluid that it seems wrong not to think of it in terms of a personality. This is striking. This is genuinely a new metaphor for human–machine interaction. Yes, there’s been voice control before (it was in Windows 7, among other places — but it didn’t work very well). You can speak to your cell phone via Siri or whatever Google is calling their assistant these days, but the ambient usability of Alexa just works. The fact that it’s a cylinder that sits in your house somewhere is also interesting. I’m happier for Alexa to sit there with her microphone ready and waiting for me to speak than I am for Cortana, the Windows 10 assistant, to lurk in the background of my machine, observing a serious chunk of my online existence. Anthropomorphism again.

It is a decade since the last big change in user interface metaphors. With the first iPhone in 2007, Apple gave us touch that just worked. It was initially applied to skeuomorphic designs that clued the user in to the interaction modes via traditional-looking analogue references. Users got used to always-available compute capability, and as computational power increased, first with mobile device processors and then with super-low-latency cloud interactions that offloaded tasks to essentially infinite processor cycles, the interfaces started to abstract away. Metaphors like Google Now, with its semantically powered relevancy and personalized on-demand information delivery, have moved us a long way from mouse clicks and the iconography of the ‘desktop’ (itself a skeuomorphic echo of a more distant time). Alexa feels very much like the next logical step: a 1.0 version of a whole new way of interacting. She’s not perfect, she can be awkward, but the things that just work are simply brilliant. How brilliant? Read this heartfelt description of how Alexa has altered the quality of life for somebody who is dealing with the stark reality of the slide into dementia. Gives one a moment of pause, doesn’t it?


Okay, so what? How is this relevant to scholarly publishing?

Let’s take a look at an Alexa skill called ArxivML. It was written by Amine Ben Khalifa to let him scan the machine learning literature updates on arXiv whilst getting ready for work. Alexa will read out the abstracts of the ones Amine wishes to delve into further, and a more traditional title-and-abstract summary will be deposited into the Alexa app (where all your interactions with her are documented for posterity). The next few iterations of functionality aren’t exactly hard to think of, and not that hard to achieve either (a sketch of the underlying fetch follows the list):

  • “Alexa, send to [Mendeley/Zotero/DodgyFacebookForScholarsSites]”
  • “Alexa, get me the PDF”
  • “Alexa, share with…”
  • “Alexa, save to my filestore”
  • “Alexa, get the data from the paper”
  • “Alexa, alert me when the authors are speaking at a conference”

And so on.
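To make the ArxivML example concrete, here is a minimal, hypothetical sketch (this is not Amine’s actual code) of the kind of fetch such a skill performs behind the scenes: it pulls the newest machine learning entries from the public arXiv Atom API and hands back titles and abstracts for Alexa to speak.

```python
# Hypothetical sketch: fetch the latest machine learning abstracts from the
# public arXiv API (an Atom XML feed), roughly what an ArxivML-style skill
# would do before handing the text to Alexa's text-to-speech.
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = ("http://export.arxiv.org/api/query"
             "?search_query=cat:cs.LG"
             "&sortBy=submittedDate&sortOrder=descending&max_results=5")
ATOM = "{http://www.w3.org/2005/Atom}"  # namespace used by the Atom feed

def latest_ml_abstracts():
    """Yield (title, abstract) pairs for the newest cs.LG submissions."""
    with urllib.request.urlopen(ARXIV_API) as resp:
        feed = ET.fromstring(resp.read())
    for entry in feed.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", "").strip()
        summary = entry.findtext(ATOM + "summary", "").strip()
        yield title, summary

for title, abstract in latest_ml_abstracts():
    print(title)
    print(abstract[:200], "...\n")
```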

When you are trying to think about AI, here’s what you need to consider. Don’t worry too much about the algorithms and the deep learning and the neural nets and so on. These things are interesting, but they are the details of how this stuff happens. Instead, realize this: you are thinking about a robot and what it can do. But not just any robot; not the traditional sort that’s fluent in over six million forms of communication, or wants your clothes, your boots, your motorcycle. Nope, this is (as Bruce Schneier has brilliantly and chillingly put it) a world-sized robot. Alexa, like Google Home and Cortana, consists of audio (and video) sensors, a mind-blowingly powerful cloud brain (where the AI lives, distributed across millions of servers around the globe), and actuators in the form of code or a multiplicity of code-controllable switches. Those three things define a robot.

And what do we do with robots? We use them to replace humans.

This past week I watched in amazement as a bunch of cloud experts demonstrated how you could couple modern software development techniques to Alexa so that, with a few voice commands, you could get her to build an Amazon Web Services environment — the full stack; the whole enchilada. Put more bluntly, over the course of a couple of sprints, they replaced a fairly respectable IT role using £150 worth of hardware and a few more quid’s worth of on-demand storage and computation. They didn’t even need to fire up a server to do this (you don’t need to for Alexa code). I reckon this future is one or two Moore’s Law cycles from demo to cold hard reality. (By the way… recall where Amazon started back in ’95 — something something books mail order something.) There will be many more human replacement cycles in the next ten years. This time it’s different: it’s not labor that’s being replaced, it’s thinking. The repetitive but skilled processes that form white collar roles are now in range. Just how many non-repetitive roles are there in the world of scholarly publishing?
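That ‘no server’ aside is worth a beat: a custom Alexa skill’s backend is typically just a function that AWS Lambda runs on demand. As a hedged illustration (the intent names and speech here are invented, not the demo team’s code), this is roughly the entire backend a simple skill needs:

```python
# Minimal, hypothetical Alexa skill backend for AWS Lambda. Alexa POSTs a
# JSON request; Lambda runs this function; no server is ever provisioned.
def lambda_handler(event, context):
    """Entry point AWS Lambda invokes when Alexa routes a request here."""
    request_type = event["request"]["type"]
    if request_type == "LaunchRequest":
        speech = "Hello. Ask me for the latest machine learning papers."
    elif request_type == "IntentRequest":
        intent = event["request"]["intent"]["name"]  # e.g. a made-up GetPapersIntent
        speech = f"Handling {intent} now."
    else:  # SessionEndedRequest and anything unexpected
        speech = "Goodbye."
    # Alexa expects this response envelope; PlainText is spoken as-is.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```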

Right now you can go and spin up a general-purpose image processing AI (Amazon, Google, IBM and Microsoft offer these as a service) and feed it a bunch of Northern blot data. And with that, you can spot any re-purposed or faked data. The publishing industry will need exactly one of these. It will need exactly one of each of the many other automatable functions. How about a peer review process that pre-emptively scans the input document for plagiarism and alerts not only the publisher, but the authors’ institution and funding agencies when it detects evidence (compiling the report, looking for previous infringements by the authors — even ones that predate its existence but are discoverable in the existing literature)? What will our ecosystem look like in a world where you only need one AI to perform a given scholarly function? A series of AI libraries all accessible via APIs…
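You don’t even need a full cloud vision service to see how the duplicate-detection piece could work. As a deliberately crude, hypothetical sketch (the file names and threshold are invented, and real image-forensics tools are far more sophisticated), perceptual hashing with the third-party Pillow and imagehash packages can flag near-identical figures:

```python
# Crude, hypothetical sketch of duplicate-figure detection via perceptual
# hashing (pip install pillow imagehash). Real image-forensics AIs do far
# more, but the principle - compare compact image fingerprints - is the same.
from PIL import Image
import imagehash

def looks_reused(path_a, path_b, threshold=6):
    """Return True if two figures are suspiciously similar."""
    hash_a = imagehash.phash(Image.open(path_a))  # 64-bit perceptual hash
    hash_b = imagehash.phash(Image.open(path_b))
    # Subtraction gives the Hamming distance; a small distance means a
    # near-duplicate, even after mild cropping, rescaling, or contrast tweaks.
    return (hash_a - hash_b) <= threshold

# Invented file names, for illustration only:
print(looks_reused("blot_fig2.png", "blot_fig5.png"))
```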

In 2011, Marc Andreessen wrote “Why Software Is Eating the World”. Six years on, I’m still having conversations with people who genuinely think that publishers aren’t, and shouldn’t be, software companies. I guess they think Netflix is a mail-order DVD company and Amazon ships books. AI is going to eat the world, and this time it’s scholarly publishing that has the juicy data with which to feed the beast. On the one hand, AI is going to rip through our value and production chains like Ash’s chainsaw; on the other, there’s going to be a bunch of money to be made supplying high-value data to AIs to enable them to do ever more sophisticated things. Unpicking and understanding this is the challenge for the next decade and beyond.

“Alexa, Open the Pod Bay Doors”

David Smith

David Smith is a frood who knows where his towel is, more or less. He’s also the Head of Product Solutions for The IET. Previously he has held jobs with ‘innovation’ in the title and he is a lapsed (some would say failed) scientist with a publication or two to his name.

Discussion

14 Thoughts on "Living with an AI: A Glimpse Into The Future"

This is probably an issue of whether there is enough training data to emulate a house style. But yes is the answer, in a general sense. There are already tools out there that can provide a précis of a text to a given length, so a generic copyeditor AI seems perfectly feasible.

Suddenly I understand how the Luddites must have felt in their time. Alexa’s going for a swim if we ever cross paths; see how quickly the AI can figure that out…

Alexa isn’t just (or even) the cylindrical device. She’s everywhere you can incorporate an Amazon Web Services connection. She’s soon to be in cars (think about that for a minute – a car is now a connected computer with an engine and seats) and TVs and more: https://www.wired.com/2017/01/ces-alexa-in-everything It’s worth pondering this at some length. Your observation about the Luddites is important, as they were responding to one of the last global technological transformations, especially as it pertained to working practices. We are already seeing some of these things play out (the gig economy springs to mind – to what extent is it predicated on these first AI steps?)

It’s toddler SHODAN, ffs. How this thing is anywhere but at the very bottom of our society’s uncanny valley is mind-boggling, really.

And yes the gig economy reflects its dangers perfectly — it’s technology ordering people around, not people utilizing it for their own benefit (bar the minority developing the technology; they benefit of course).

Very interesting post David, thought-provoking with some great examples. I have been doing some similar ‘playing around’ with mine, albeit I don’t have a shed or shed light, ahh… Two points do come to mind though:

1) Your post has inspired me to bring Alexa out to one of our local SSP Philadelphia networking meetings coming up soon. I’ll keep you posted on how she performs moderating the AI round table; perhaps we can chat offline about some ideas!
2) Did you see this video going round, where the user asks Alexa if she is connected to the CIA? It seems like the machine (or Amazon) might have updated the answer now, as I don’t get the same results, but interesting, eh? http://www.dailymail.co.uk/sciencetech/article-4298698/Woman-asks-Amazon-s-Alexa-s-connected-CIA.html
Still a lot of potential publisher benefits out there to think about and try out; the real skill is figuring out what these are, right? … I wonder if the next new cool job title will be Chief Alexa Officer, CAO!

Happy to chat offline!

My Alexa is very clear – she does not work for the CIA. She doesn’t know who MI5 or MI6 are, though. When I ask her if she’s connected to British Intelligence, she ‘has trouble, please try in a little while’. Make of that what you will. (Seriously – you can use tools such as Wireshark to see when Alexa connects to the backend servers. She ISN’T sending your conversations back to base all the time – she does have an audio buffer for picking up the trigger word, but this is recognised within the device, and only the follow-on audio is sent out to be interpreted into a query.)
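For anyone who wants to replicate that check without the Wireshark GUI, here is a hedged sketch using the third-party scapy package (the Echo’s LAN address is invented; substitute your own, and note that packet capture needs root privileges): watch traffic from the device and you should see it stay quiet until the wake word triggers a burst to Amazon’s servers.

```python
# Hypothetical sketch: log traffic leaving the Echo so you can see for
# yourself that audio only streams out after the wake word
# (pip install scapy; run with root privileges).
from scapy.all import sniff

ECHO_IP = "192.168.1.50"  # invented address; substitute your Echo's LAN IP

def log_packet(pkt):
    # A burst of packets here corresponds to a wake-word-triggered query.
    print(f"{float(pkt.time):.1f}s  {len(pkt)} bytes from the Echo")

sniff(filter=f"src host {ECHO_IP}", prn=log_packet, store=False)
```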

We are only scratching the surface to be sure.

I have to say that I remain amazed at the fact that people are willing to pay for the privilege of being spied upon. Perhaps I’m old fashioned, but I still value my privacy, and struggle to come to grips with our surveillance society, where we’ve gone from fearing Big Brother to paying to invite him into our house. While yes, Amazon promises they don’t constantly monitor your home, I’m never sure how much I trust them. We also have the open question of whether the recordings generated by that live microphone can be subpoenaed by the government (https://www.nytimes.com/2016/12/28/business/amazon-echo-murder-case-arkansas.html?_r=0) and used against you. All this for the great joy of saving a few steps to flip a light switch or push a button to play a song. I’m not sure that’s a tradeoff I’m willing to make.

Further, I’d be interested in hearing how having the device in your home has impacted your Amazon shopping behavior. At least at the moment, selling you stuff is how Amazon makes its money, and most of their devices seem designed to further lock customers in to their sales services. As you note in the post above, the digital world tends to drive consolidation (see also https://www.spectator.co.uk/2017/03/how-technology-exacerbates-the-winner-takes-all-effect/). I’m not sure having only one source for retail goods is beneficial to society.

(maybe this is a post…)
Look, the general surveillance world arrived 10 years ago. Over here, European politicians are doing things like the GDPR (General Data Protection Regulation), which are much-needed first steps to put some regulation around what companies (of all sorts, publishers included) can do with personal data. This isn’t happening in the US. But the reality is that we live in a sensor-rich environment built on top of cheap, ubiquitous compute capability. It’s a good thing that plane engines stream real-time data on their status back to the manufacturer – but where’s the line drawn on a lease car that does the same? This is absolutely a massive issue, but the fact is, we want our phones to tell us where to go and where to eat and all those other things…

My Alexa isn’t hooked up to allow ordering from Amazon. But I’ll say this – “Alexa, add [Thing] to the shopping list” is just genius. So much better for building the shopping list through the week. And frankly, if you are disabled, voice connection might be the thing that helps you get back some much-needed dignity and independence.

This is a complex issue, full of all the shades of grey. What’s needed is a proper discussion on these matters.
