
Message to editors and publishers of scholarly journals: Please, please, please publish whole names. In the name of discovery, abandon the time-honored traditions of last name plus initials, partial names, etc. Why? The answer lies in Asia, especially China.

Journals using just initials, or even partial names, have always been a problem for those of us in the discovery business. You give us papers by J. Jones, J. J. Jones, John Jones, and John J. Jones. Which of these are by John Joseph Jones, which by John James Jones, John Jeffrey Jones, or John John (remember John John?) Jones? Which are by Jane Jones, Joan Jones, etc.?

Simple name matching does not solve this problem, so this stylistic (and seemingly pointless) shortcut to naming has spawned an entire research community. Do a Google Scholar search on “author disambiguation.” Many attempts are being made to develop semantic algorithms that use article content to distinguish authors with similar partial names. Of course, this requires access to article content, which is not easy to come by and is a problem in its own right.
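To make the failure of simple name matching concrete, here is a minimal Python sketch. The helper `initials_key` and the sample author list are illustrative assumptions built from the Jones examples above, not any journal's or search engine's actual code. It reduces each full name to the "first initial plus last name" form and counts how many distinct people end up under the same key.

```python
from collections import defaultdict


def initials_key(full_name: str) -> str:
    """Reduce a full name to a journal-style key, e.g. 'J. Jones'.

    Hypothetical helper: keeps only the first initial and the last word.
    """
    parts = full_name.split()
    return f"{parts[0][0]}. {parts[-1]}"


# Illustrative authors taken from the examples in the text.
authors = [
    "John Joseph Jones",
    "John James Jones",
    "John Jeffrey Jones",
    "Jane Jones",
    "Joan Jones",
]

# Group the full names by the abbreviated form a journal might print.
groups = defaultdict(list)
for name in authors:
    groups[initials_key(name)].append(name)

for key, names in groups.items():
    print(f"{key}: {len(names)} distinct authors")
```

All five people land under the single key "J. Jones", which is exactly the information a searcher loses when only initials are published.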

But new research indicates that this already messy naming problem is far worse with Chinese authors. And the number of scholarly articles published by Chinese researchers is approaching that of those published by American authors, the largest single country group. In fact there is active debate in the bibliometric community as to when China will pass the USA in scholarly output.

According to a Science magazine news item, a study appearing in the American Journal of Physical Anthropology has mind-boggling numbers regarding the extremely low density of last names in China versus the very high density in America. A study of 18 million people in the USA found nearly 900,000 last names, or roughly one last name per 20 people. This is the measure of the melting pot and it makes last names a credible filter of sorts, albeit a crude one.

But the number of family names found among a whopping 1.28 billion Chinese is just 7,327, or roughly one last name per 200,000 people. This name density difference is roughly 10,000 to one. Metaphorically speaking, almost everyone in China has the same last name, while almost no one in America has the same last name. It follows from these numbers that, for Chinese authors, anything less than the full name is probably a hopelessly coarse filter.
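For readers who want to check the arithmetic, here is a short Python sketch using only the figures quoted above. The variable names are my own, and the US count is taken as exactly 900,000 although the study found "nearly" that many, so the results are approximate.

```python
# Figures as quoted in the text; the US name count is "nearly 900,000".
us_people, us_names = 18_000_000, 900_000
cn_people, cn_names = 1_280_000_000, 7_327

# People per last name in each country.
us_ratio = us_people / us_names
cn_ratio = cn_people / cn_names

# How many times more people share a last name in China than in the USA.
density_gap = cn_ratio / us_ratio

print(f"USA:   about {us_ratio:,.0f} people per last name")
print(f"China: about {cn_ratio:,.0f} people per last name")
print(f"Gap:   about {density_gap:,.0f} to one")
```

The unrounded values come out somewhat below the round numbers in the text (a little under 175,000 people per name in China, and a gap a little under 9,000 to one), but the order of magnitude, and hence the argument, is unchanged.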

I suspect this problem is also found among other countries which have not been the locus of waves of immigration. In fact, I recently had a small discovery project founder on the ambiguity of scientists named “Kim” in Korea. Perhaps there are deep anthropological, cultural, or sociological reasons for these low name densities, but that is irrelevant for present purposes.

The scholarly publishing community must use full names if discovery is going to work. We are trying to show the public that publishing adds value, and discovery is a big part of that value. I realize that there are several budding attempts to provide the ultimate solution to this problem, which is numbering everyone, such as ORCID (Open Researcher and Contributor ID) and ISNI (International Standard Name Identifier). I am not optimistic about this sort of global collaborative effort, there being several million STM authors. But in any case, it would be a huge advance if the journals simply published complete names. That should be easy enough to do, unless it involves major software changes or some such.

D. E. Wojick (what’s my name?)


Discussion

25 Thoughts on "Please Use Whole Names on Scholarly Articles"

Another source of the problem is style guides, such as the Publication Manual of the American Psychological Association, that mandate the use of initials for forenames.

It does not surprise me that the problem is institutional. Even the simplest change can be difficult. For example, how often do they revise that manual? How does one propose changes, etc?

People from some parts of India, like me, don’t have a real last name. My name on papers in the Journal of Applied Physics appears as R. Murugesan. But I don’t think of this as my name. Murugesan is my dad’s first name, and I took it on as a last name to conform to immigration requirements when I went to the US. Before then, my initial was M and my name was Ravi. In the scholarly record, my initial is R and my name is Murugesan!

I’ve heard that a lot of people in Indonesia go by just one name. I wonder if there are journals that insist upon initials, and if there are Indonesians who’ve made up initials for that reason.

Even more of the ambiguity of Chinese names can be removed if Westerners also included in the byline the Chinese characters, a person’s REAL name. See for example, http://pra.aps.org/PhysRevLett.99.230001 , where the American Physical Society allows authors to put their Chinese character name in parentheses after their transliterated name. Although Westerners may not be able to read the characters, with a little practice it is easy to compare characters to see if they are the same, something that most of our editors have learned to do in choosing referees. It is important for ORCID to include this information in the bio file for each person.

Good point Gene, and congratulations on the innovation. However, in order to support discovery the search engines will need to search on the Chinese character name. Do any do that now?

But regarding ORCID, my simple understanding is that they are using a 16 digit identifier for each author. It is not clear how having Chinese characters in the bio helps. But perhaps there is more to their search scheme than I know of.

I am John Spevacek, an industrial researcher in the area of polymer science. In the Czech Republic, there is an academic researcher in polymer science, Jiri Spevacek. One year during my performance review, I tried to convince my supervisor that all the J. Spevacek papers that Jiri had published that year were mine. Can I say I like the use of initials, if only for all the laughs we can have?

Indeed John, ambiguity is often the heart of humor. One of my favorite semantic jokes goes like this. The stranger asks the old farmer “do you think it will rain?” The farmer thinks for a while then answers “It always has.” (The ambiguity is between soon and ever as the meaning of will.)

In fact Nature has a recent article on this issue, by Declan Butler, whom I admire, that begins with your sort of case. See http://www.nature.com/news/scientists-your-number-is-up-1.10740 The joke, if you want to call it that, is here:

“In 2011, Y. Wang was the world’s most prolific author of scientific publications, with 3,926 to their name — a rate of more than 10 per day. Never heard of them? That’s because they are a mixture of many different Y. Wangs, each indistinguishable in the scholarly record.”

While planning the revision of the last edition of the Publication Manual of the American Psychological Association in 2009, we struggled with the issue of how best to handle names that didn’t lend themselves to Western conventions. As Ravi notes, the confusion is confounded not only by initials v. full names but also by the accepted order of the first and last names. We ultimately advised readers to follow the form that authors most often used in their publications.
Thanks for a very sensible suggestion. We’ll be keeping this, along with the recommendations of ORCID, ISNI, and other groups, in mind for the next edition.

A compelling case, but there is one good counterargument. If full names are used, people who have obviously female first names, or first names that are perceived as, say, typically African-American, are likely to experience bias (unconscious or otherwise) in the way their papers are read and reviewed. This is less likely to happen under the current convention of using initials.

Good point, and one I had not thought of, being basically a discovery nut. My father changed his name from Wojcik to Wojick just to Americanize it a bit. The ORCID solution speaks to the bias issue. But I have to wonder if bias is so great a problem in scientific publication that anonymity trumps discovery.

Early in my career I came across a paper showing that author gender affected how a paper was perceived. It was earlier than this one but similar: http://tinyurl.com/co5o2p5. I was also aware of a bias in my own evaluations, somehow a lurking sense that male authors were more authoritative. I’m appalled at myself, but I’m still aware of it, though I resist it. So right from the outset I published with the initials D.V.M. rather than Dorothy. Fortunately, the three initials are distinctive enough to identify me uniquely on Web of Science. I have resisted requests by editors to change to my full name, though by now I am known in my field and so I suspect it makes little difference. Thanks for highlighting the problems of reliance on initials. Maybe authors should just be given the choice, rather than requiring uniformity on this.

This is a fascinating issue. We used to see a convention in the phone book, where men used names and women used initials. I would hate to see that in science. For one thing it might give men a discovery advantage.

As I think about it, ORCID is not a solution to the bias issue, unless we stop having names on papers altogether.

In Latin America, we use the father’s last name as our last name, and after that we put the mother’s last name. So I am called Vivienne Bachelet Norelli. But I get cited by Anglos as Vivienne Norelli. All wrong. The problem lies in the dominance of English when the world is Spanish, Chinese, Indian…

So just sign your papers as Vivienne Bachelet! My Latin-American colleagues use that approach, my Italian wife has both our surnames in her documents, but uses only her surname before marriage on her scientific publications, etc. You can complain all you like about the dominance of English language and Western mind-sets in science, or you can just use a practical approach to deal with realities and get on with your science.

I also suffered from name ambiguity, and until Google Scholar came along there was no way for me to determine the collective impact of my work. Although I exited academic research several years ago, the name ambiguity haunted me for several years in measuring the impact of my scholarly output. See my recent blog post on this issue.

My female postdoc has only a first and last name, which are both common western names. To differentiate herself and make herself more searchable she decided to give herself a middle initial (we work in chemistry where first names are always given in full on the paper). After two weeks of suggestions and compiling a short list, the decision was made. Her middle name is now: danger.

Cool! My mother’s maiden name was Wildman. I considered making it my middle name, so I would officially be Wildman Wojick.

I am not sure spelling out names in full is that helpful; there are enough John Smiths around to confuse anything, let alone Wei Wangs. ORCID or some other identifier is the way to go, just like the DOI for individual papers, and it avoids the Chinese character issue (I don’t know Chinese, let alone how to type it). Assuming all those 16 digits are used we get over 1000 billion combinations, which should last us a while…

Middle names would certainly help the John Smith problem. I have no data on Chinese naming, other than that cited. ORCID is indeed the ultimate solution, but it may be a long time coming. Full names are an immediate help.

Ed, that is the launch date. I am talking about when they will have, say, 80% of the world’s authors registered, especially the Chinese and similarly homogeneous name populations. That could be a decade or more, or never.
