My first draft of this post was written in my Gmail account, so that I could try out a much talked-about new browser plug-in that aims to make me “Just Not Sorry”. The app monitors my writing, alerts me when my tone is apologetic, and provides snippety guidance on why my choice of language might be undermining what I am saying. The backstory posits that women use apologetic language more than men and that the app will help them to rectify this.

By positioning this as a gender issue, the app’s developer has generated substantial controversy. Many have concluded that the app itself is sexist in its assumptions (a) that women use softer language than men (it’s not clear whether there is any evidence to support this assertion — my own brief searches haven’t found anything conclusive) and (b) that the answer is for women to toughen up their language, rather than for men to soften theirs (and, indeed, the assumption that men don’t already use soft language just as readily as women, it often being contextually useful or diplomatic to do so).

Putting the gender issue to one side, what interests me about Just Not Sorry is the potential role of code like this in editorial streamlining — an automated way not necessarily to strip apologetic language out, but at least to encourage people to weigh each word and delete those that are likely superfluous for the context. Sometimes, as I intimate above, we can use “apologetic” or “indirect” language to soften a tough message (the Steven Pinker video that David posted before Christmas touches on this).  But quite a lot of the time, we don’t need so many “I thinks”, “maybes” and “justs”, and the app might usefully be developed to flag up other such padding words (perhaps “with all due respect” or “for me personally” or the many other tautologies, eggcorns, malapropisms, mondegreens and so on that obfuscate our communications). The app developers take requests; future iterations could play a wider role in helping authors self-edit their prose.
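To make the idea concrete, here is a minimal sketch of how a padding-word flagger might work. It is not how Just Not Sorry is actually built; the phrase list and the function name `flag_padding` are my own assumptions, chosen only to illustrate the approach of matching whole phrases and reporting where they occur.

```python
import re

# Hypothetical phrase list for illustration only; this is not the
# list that Just Not Sorry itself uses.
PADDING_PHRASES = [
    "i think",
    "maybe",
    "just",
    "sorry",
    "with all due respect",
    "for me personally",
]


def flag_padding(text, phrases=PADDING_PHRASES):
    """Return (phrase, start, end) for each whole-word match, in text order.

    Matching is case-insensitive, and word boundaries stop "just"
    from matching inside words such as "adjust" or "justice".
    """
    hits = []
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, match.start(), match.end()))
    # Sort by start offset so flags appear in reading order.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    draft = "I think we should maybe just adjust the deadline."
    for phrase, start, end in flag_padding(draft):
        print(f"{start}-{end}: {phrase!r}")
```

A sketch like this only flags candidates. Whether a given “just” or “maybe” is padding or useful diplomacy is still a judgment for the writer, which is rather the point.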

Thinking about the extent to which authors self-edit, and their goals and skills in doing so, reminded me of an SEO workshop I participated in last month at the British Ecological Society’s annual conference (slides and write-up). The workshop aimed to help authors understand the importance of optimizing their work for search engines — which boils down to two things: language and links. The language piece is essentially a process of self-editing, and working to make your communications clear not only to your range of human readers, but also to your robot readers. Of course, the algorithms of search engines are ever shifting, but we talked through some of the accepted “knowns” — the Goldilocks approach to keywords (not too broad, not too narrow), the need to consistently use these keywords in your abstract without “stuffing” it full of them — and the need to keep your titles short and descriptive.

This was perhaps the most thought-provoking part of the workshop, as so many authors (I include myself) have tended to structure titles with an initial “interesting” clause followed by a second, more descriptive clause. (Witold Kieńć gives a nice example of this in his OpenScience blog posting: “‘Therapy X decreased mortality in Y disease in a group of forty males’ is a much better title than ‘Victory on an invisible enemy: success in fighting disease Y with therapy X’.”) Yes, the latter might catch the eye when skimming a table of contents, but what proportion of our readers are doing that any more? Our most important reader is Google, and it likes its titles short. I don’t think it’s a coincidence that yet another study (published in The Royal Society’s Open Science journal last August) has found that articles with shorter titles attract more citations; as well as being easier to understand, they’re likely to have higher rankings in search engines and therefore to be more easily found. It’s also likely not a coincidence that not a single one of 2015’s most talked-about papers (according to the Altmetric 100) employs the fun:factual title style.

As ever with conference workshops, only a subset of the delegates made it to our workshop. They ranged in career level and all seemed to find the guidance worthy of note-taking, so clearly there are some simple tips that could be more widely conveyed — but how? Sure, publishers and institutions could fold more SEO training into wider writing-up-your-research guidance, but not everyone uses such resources. What everyone does use is writing software. I wonder if we’ll see Authorea, Overleaf or even Microsoft Word taking a leaf out of Just Not Sorry’s book and providing in-context hints about writing style. Scholarly authors may not need to be told to make their language less apologetic, but many of us could do with tightening up how we communicate, for our human as well as our machine readers.

Charlie Rapple

Charlie Rapple is co-founder of Kudos, which helps researchers, publishers and institutions to maximize the reach and impact of their research. She is also Honorary Secretary of UKSG and Associate Editor of Learned Publishing.

34 Thoughts on "The Future of Writing: Tightening Up our Communications, From Just Not Sorry to SEO"

What if we are heading to a future where the author just writes a core text, and that is then tailored, automagically, in style and tone for different audiences, to maximise desired impact?

So you and I would actually receive very different versions of the same message, if I am, say, a high-school dropout male from Sweden, and you are a PhD female from China. This of course already happens in advertising. But what if it also applied to articles, Facebook posts, etc.? Dystopia or not?

Interesting idea! In advertising, the advertiser (or their agency) achieves this by creating lots of different versions of the ad, which are then tagged for different audiences – right? So the tailoring of the material is a pretty manual process, with only the realtime rendering of it being “automagic”. It’s interesting to think about who the “agency” might be, in terms of tailoring the academic material. Maybe one day we really will see software that is sophisticated enough to do this automagically, but I suspect that’s a way off. In the meantime, comms teams at publishers and institutions do a bit of this, or they work with outside organizations (Research Media et al), or maybe the authors do it themselves (Kudos et al). In all cases the manual effort involved means it’s still fairly broad brush in terms of the number / granularity of audiences that can be served. I like the idea that a writing app could suggest different potential audiences for you, and propose different ways you might customise your work for each – so still a bit manual, but with a lot of algorithm-based guidance. For me that’s neither utopia nor dystopia, just an interesting approach to explore!

Generally it’s only in the humanities and softer social sciences where you get the “fun:factual” type titles. When I use colons in titles it is usually Topic: method or Topic: case study.

Supposed-to-be-witty:factual happens in biology surprisingly often. Guilty myself. Though only a co-author in the most egregious one – “Caviar in the rain forest”.

At least that article is actually about rain forests, as opposed to some of my own – completely metaphorical – no-one searching for the topics would use the words in some of my titles. Great to think about this 15 years too late!

Here’s an example for you, Joe Esposito’s post on mergers and consolidation in the scholarly publishing market:
It’s a really good post, but saw much lower readership than similar posts with more explanatory titles than the whimsical Cole Porter reference of “Birds Do It, Bees Do It”. I consider it a lesson learned, at least here at The Scholarly Kitchen, where much of our traffic comes via Twitter, and whether someone clicks over from a tweet likely depends a lot on how clear it is what they’re going to see.

David is absolutely right about this. The headline poetry that appeals to us as humans is not picked up by Web spiders. It’s a jarring change to have to make, but make it we must.

Yes, it is jarring – and I suppose the key is to consider one’s goals for each piece one writes (and each place one publishes). If part of what you hope to achieve is search-engine-sourced traffic, then you have to consider the prosaic before the poetic. But I think / hope there’ll continue to be a place for the poetic!

Three categories of finding things, not two. Human discovery, as when we see the headline of an article in a print publication. It’s amazing to me, for example, to see how well-laid-out the Wall St. Journal is in print, but it’s a bear to navigate online. Category two is SEO: you need to write for the machines. Category three is what David was referring to, where humans encounter things in a compressed form through social media. So a tweet that says “birds do it, bees do it,” which was the title of one of my blog posts, tells the user nothing. Few people will click on the attached link.

Quite agree – very interesting, and somewhat depressing. The abruptness of much email content already jars. Now you have robot intervention in your writing; the logical next step is that the entire article is written by a robot, for robot readers. The human just provides the bullet points. The poetry in its language may mean that a whole ocean of historical writing vanishes into the dusty archives because it is undiscoverable by Google Scholar. Jane Austen must be spinning in her grave.

Oh but there’s surely a distinction between writing styles for conveying information (where maybe your dystopian vision is bang on) and writing styles for entertainment and escapism (Jane Austen can rest easy)?

There are lots of statistical analyses that make causal claims between academic writing and citations. Some investigate whether a title, posed as a question, is better than a direct statement. Others look at wit and humor in the title. The vast majority of these papers pose no causal explanation for the effect and I expect that most are the result of spurious correlation. Highly-influential multidisciplinary journals prefer short titles, while disciplinary-based journals prefer longer, descriptive ones. Authors lacking an important finding may choose wit and humor in a title. While these studies show statistically significant results, the effect size is usually tiny and has little practical significance. It also makes me wonder: if there were an ideal way to write to maximize exposure and citations, why would we not have come across the method through simple trial and error?

Hi Phil – certainly it becomes even murkier once you bring SEO into it as a factor rather than just the response of “human” readers to how material is written up. Of course search engine algorithms / preferences are evolving all the time so it’s not easy to analyse, or provide specific guidance – I guess my question is not so much “what guidance should we give” as “are authors even thinking about this at all” – and if not, what might be the best vehicle for training / reminding people to think about different audiences (machine as well as human) as they write.

The gender issue is actually a really interesting way to frame that extension. There’s a piece in the Washington Post from October that discusses why women are either overtly or covertly punished when we use blunt and direct language. Direct language intimidates and threatens other people in the room because it violates the passive, weak role that women have been assigned and need to conform to in the workplace. (Otherwise, you get marked as “too assertive” or “argumentative” or “not a team player” on performance evaluations. I soften and weaken my language consciously unless I am working on a report or article because I am too assertive.) The extension sounds like a good way to check that one has switched modes.

Here’s the (humorous) piece from the WP:

Thanks for the link! I’m definitely wishing I had more time to explore the issue of gender and communications, since of course it’s not all about language – directness or assertiveness of tone will often be as much the issue as the language itself (maybe we all need Babelfish as well as browser plug-ins). But I agree that the extension is handy for checking oneself – whether or however you then choose to act on what it flags.

I regret to inform you that his name is actually Steven (not “Stephen”) Pinker. 🙂

Oop! Embarrassing. Thanks Philip (and for the softness of the language of your correction :-))

Have you read:

“The history of the wars of New-England with the Eastern Indians; or, a narrative of their continued perfidy and cruelty, from the 10th of August, 1703, to the peace renewed 13th of July, 1713. And from the 25th of July, 1722, to their submission 15th December, 1725, which was ratified August 5th, 1726”

by Samuel Penhallow

Not exactly succinct but it does capture the general gist of the book.

A grand example! It’s an interesting point to think how time, as much as gender or culture, is a factor in the effectiveness of our communications. Many people today may find the book insightful. How many would use terms such as “perfidy” in their search? Indeed (I’m no expert), is “Eastern Indians” still a phrase anyone would likely use? To what extent can any of us future-proof our language? In Samuel Penhallow’s day you didn’t have to worry about search engines and could rely on your publisher and / or your friends to ensure your book found its audience. I wonder what title he might have picked for today’s more attention-deficient, machine-filtered audiences!

I recall when search engines and key words first appeared. As an editor, I would query authors as to their selection of key words and ask: Would anyone use this keyword to find your article?

Harvey – that’s interesting – some of the people at the workshop I mentioned were editors, and they were finding it equally thought-provoking, acknowledging that they hadn’t much taken authors to task about titles and keywords (at least, not from this perspective). It keeps coming back to whether individuals really understand some of the nuances of our current discovery behaviors. Some authors will benefit from more active / knowledgeable editors than others!

Google Scholar uses full text search, so keywords are irrelevant. I assume WoS and Scopus do too, but really do not know because they cost money. Are there still a lot of keyword based search engines in use?

David, one can also use keywords with Google Scholar; just google “google scholar keywords”. Also, if memory serves, medical lit uses keywords.

Medical literature uses some keyword searching but is more likely to use the Medical Subject Heading (MeSH) system. For instance try searching brain ct in You get five articles with CT Brain as the author. The articles do not in any way discuss the tomography of the brain.

I’m not an SEO expert but it’s my understanding that Google / Scholar does rank a work more highly if the user’s search term is tagged as a keyword (above those where the search term just appears in the full text). I think you are right that there has been a move away from people searching expressly on keywords, but I think in the background the metadata still matters!

‘Time on the Cross: The Economics of American Negro Slavery’ was perhaps the classic instance of the phenomenon described above, and it’s interesting to speculate whether the title of Fogel and Engerman’s masterwork of 1974 would have made the cut in an age of SEO. To generalise wildly, this trope has been more of a North American practice than a European one, as I think a comparison of the titles of the monographic outputs of OUP and CUP with their American counterparts would make clear. An additional complexity, equally challenged in the digital age and something I first learned from social science publishing eminence grise Michael Holdsworth, was the crucial distinction between title: colon: subtitle, and title: colon: more title.

Interesting that you perceive that North American / European variation in practice in relation to the titles bit as it’s also been suggested in relation to the direct / indirect language bit. Churchill still rings true.

Does Michael’s point relate to how the parts of the title are tagged in metadata terms, or simply the purpose of the different “clauses”? I’m intrigued and trying to think of examples!

Charlie, ‘Religion and the Decline of Magic: Studies in Popular Beliefs in Sixteenth- and Seventeenth-Century England’ is a celebrated example of the standard ‘title: colon: subtitle’ iteration. An example of the rarer, but by no means unusual (and as you imply, potentially horrible in metadata terms) ‘title: colon: more title: (and subtitle)’ formulation might be something like ‘BREXIT: The EU Referendum of 2016: Politics, Myth and Identity in Contemporary Britain’. I am sure that other readers of the SK can think of better published instances of a tri-partite title formulation that works well and logically within the construct of traditional printed preliminary matter, but is much less helpful in the context of online bibliographic search. The latter does not stop, however, the authors of such works being (very) emotionally tied to a title formulation with which they may have lived for many years…

Ah OK, thanks for the examples! And you make a good point about authors being tied to current habits – hence it was so interesting to see people’s ears pricking up when we talked about this in an SEO context – it seemed like sufficient justification for people to reconsider, and gives editors a stronger case for justifying changes they might suggest (as one editor in the room said).

Perhaps a little dated, but an excellent read on women’s and men’s communication styles: Deborah Tannen’s 1990 “You Just Don’t Understand: Women and Men in Conversation.” It’s had me second-guessing my “I thinks” ever since.

