Editor’s Note: Today’s post is by Akira Abduh. Akira is a freelance researcher who is interested in perceptions of publications and their metrics.

Stanford University’s recently updated list identifies the world’s top 2% most influential scientists, based on Scopus citation data. This initiative, led by John P.A. Ioannidis and his team, is known for its rigorous use of “standardized citation metrics.” The intention behind the ranking is to provide a more accurate and transparent measure of scientific impact and excellence.

First released in 2019, the list is updated annually, with the sixth and latest version made available on October 4, 2023. The Top 2% list is often regarded as a prestigious benchmark, reflecting a researcher’s productivity and prominence within their field. Despite its widespread acceptance, the validity of the list is not often critically examined. My analysis last year exposed several outliers, and it appears these anomalies persist in the 2023 update.

Although the list’s methodology is clearly documented, with code accessible on the website for transparency, this openness does not by itself guarantee accuracy. In a preprint posted last year, I examined several of these peculiar cases, highlighting the need for a discerning eye when considering the list’s claims. Those anomalies persist in the latest database.


Anomalies and Historical Inconsistencies

The list is intended to spotlight the top 2% of influential scientists, yet it includes 235 authors purported to have publishing careers spanning more than 80 years. For instance, William S. Marshall, a biologist at St. Francis Xavier University, is credited with a staggering 187 years of publishing, from 1834 to 2021. In reality, Marshall is a retired professor.

Similarly, Tom L. Blundell of Cambridge University is listed with a publication history beginning in 1853, yet he was born in 1942. Lord Kelvin, a genuinely historical figure, is credited with publishing from 1849 until 2011, more than a century after his death in 1907.

Francis Bonnet of Sorbonne University is erroneously listed as starting his publishing career in 1866; Dr. Bonnet is a contemporary professor specializing in ambulatory anaesthesia and surgery. Likewise, Franz E. Weber of the University of Zurich, a professor of dental medicine, is inaccurately recorded as having begun publishing in 1866.

The list further includes historical figures such as H. Poincaré, who apparently last published in 1999; Selman Waksman, credited with a 2023 publication; Niels Bohr, active in 2019; and Albert Einstein, with a publication in 2021. These entries highlight significant errors and underscore the importance of scrutinizing such databases for accuracy, especially when they claim to measure the influence and productivity of scientists.
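Anomalies of this kind are straightforward to screen for, since the underlying dataset is publicly downloadable. Below is a minimal Python sketch of such a check; the file name and the column names (authfull, firstyr, lastyr) are assumptions for illustration, not the official schema, so adjust them to match the actual release.

```python
import pandas as pd

# Load the career-long table from the public dataset. The file name and
# column names are assumptions; adjust them to match the actual release.
df = pd.read_csv("career_2023.csv")

# Apparent career span in years, from first to last indexed publication.
df["span"] = df["lastyr"] - df["firstyr"]

# Flag authors whose apparent publishing career exceeds 80 years: almost
# certainly a name-disambiguation error rather than a real career.
suspicious = df[df["span"] > 80].sort_values("span", ascending=False)

print(f"{len(suspicious)} authors with careers longer than 80 years")
print(suspicious[["authfull", "firstyr", "lastyr", "span"]].head(10))
```

A check this simple would surface the 235 implausible careers noted above, which is why such entries are hard to excuse in a ranking that emphasizes methodological rigor.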

Hyperprolific and Unproductive Researchers

The Stanford list includes some remarkable cases of publishing volume. Gregory Y.H. Lip of the University of Liverpool stands out with 3,807 papers since 1992, an average of 123 papers per year. Elisabeth Mahase is listed as a highly prolific author, credited with 212 papers annually, followed by Gareth Iacobucci with 136 papers per year. Both, however, are journalists or reporters with BMJ, producing news articles rather than research papers, yet their output is counted alongside scientific articles. This categorization has led to Mahase and Iacobucci outranking many scientific authors, including the 2023 Nobel Laureate in Physics, Pierre Agostini.

Additionally, the list includes hyperprolific authors like Viroj Wiwanitkit, averaging 130 papers annually. It also contains an author with a 97% self-citation rate and 27 authors with self-citation rates above 80%.
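The same kind of screening works for productivity and self-citation outliers. The sketch below, under the same assumptions as above, uses hypothetical columns for total papers (np), self-citations (self_cites), and total citations (total_cites) to flag authors averaging over 100 papers per year or citing themselves more than 80% of the time:

```python
import pandas as pd

df = pd.read_csv("career_2023.csv")  # same hypothetical file as above

# Average output: papers per active year of the apparent career.
years_active = (df["lastyr"] - df["firstyr"] + 1).clip(lower=1)
df["papers_per_year"] = df["np"] / years_active

# Self-citation rate: the share of an author's citations that come from
# their own papers (column names are again assumptions).
df["self_cite_rate"] = df["self_cites"] / df["total_cites"].clip(lower=1)

# Flag hyperprolific authors and heavy self-citers.
hyperprolific = df[df["papers_per_year"] > 100]
heavy_self_citers = df[df["self_cite_rate"] > 0.80]

print(len(hyperprolific), "authors averaging over 100 papers per year")
print(len(heavy_self_citers), "authors with self-citation rates above 80%")
```

Neither flag proves misconduct on its own, but entries that trip them warrant manual review before being celebrated as marks of influence.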

By contrast, the list features authors with minimal publications, like John F. Fulton of Yale, with just two papers. Irving Langmuir, credited with a publishing career spanning 91 years and only two papers, is surprisingly ranked at #606. J. Robert Oppenheimer is likewise listed with a modest two papers between 1926 and 1966.

Notably, institutions such as the Centers for Disease Control (CDC) and the World Health Organization appear as authors, with the CDC receiving a high rank of #9260. These instances underscore the need for careful examination of the criteria used in compiling such lists, as they can lead to a skewed representation of scientific contribution and influence.

Conclusion

My findings reveal that the database purporting to catalog the world’s top 2% of scientists is marred by numerous inaccuracies, including but not limited to the following:

  • It erroneously includes researchers with publication dates starting in the 19th century who appear to remain active through 2023.
  • It features authors with minimal publications and short career spans who, nonetheless, receive high rankings.
  • It includes authors with questionably extensive publication records, inflated by non-research pieces such as news and editorial articles.
  • Journalists and editors find their non-peer-reviewed articles weighted on par with scholarly peer-reviewed research within the database.
  • Organizations are listed as if they were individual authors.
  • A significant number of authors in the database have self-citation rates exceeding 50%.

I call on the readers and the creators of this top 2% list to critically examine these discrepancies. A dash of common sense is essential. Furthermore, the practice of ranking researchers in this manner can be counterproductive, potentially incentivizing them to game the system for a higher placement.

Akira Abduh

Akira Abduh is a freelance researcher who is interested in perceptions of publications and their metrics.

Discussion

12 Thoughts on "Guest Post — The Perplexing Puzzle of the Top 2% Scientists List"

And not a lot of women mentioned there either. Women scientists do not seem terribly influential, it looks like. 🙂 It would be interesting to compare the actual men/women ratio among scientists against the men/women ratio among the influential names presented in the list…. Are citations even a relevant criterion for determining who is influential, considering, as you say, the self-citations for a start? Maybe not a very relevant list after all.

Yes, clearly. There are still people called Poincaré, Lavoisier, or Smith in academia. And, for sure, there are more than two John Smiths publishing today, and the system lumps their publications together.

And, for sure, Albert Einstein is more influential than me 🙂

It is just mind-boggling how a university consistently ranked in the top five in the world could make such “mistakes.” It is not as if bias or favoritism were involved, but Einstein and Poincaré still publishing… Unless it is ghostwriting!

Nicely done! Very ingenious to look for such unusual patterns and outliers. If only Henri Poincaré, Niels Bohr, and Albert Einstein had signed up for ORCID, there would be no confusion. 😉

Knowing what I know about academic publishing, the list is irrelevant. How can someone publish so many papers in one year (212!? 136!? and more), and does it mean that these papers are relevant or influential? I appreciate Scholarly Kitchen giving space to Akira Abduh, and I am not sure why Stanford University is promoting/endorsing a study that casts doubt on the entire system of citing and ranking.

To me what this post speaks to is the poor data hygiene in the main bibliometric databases for scholarly publishing. I’ve not spent much time using Scopus (which was used in the study discussed) but the other databases are similarly riddled with anomalies and errors. To effectively use them, one must build a continuously evolving and ever-more complex set of search strings to compensate for these errors and hopefully end up with usable data. The problems start at the publisher with poor metadata coding for the paper, then they’re deposited in Crossref where the errors persist, and then the databases pick up those errors when they draw on material from Crossref.

My questions are:
1) When one finds such an error, to whom should it be reported — the publisher of the paper, the holder of the metadata record (Crossref), or the database where the error was discovered?
2) Who is ultimately responsible for correcting the scholarly record? Should the publisher be correcting and resupplying the metadata, should Crossref clean up its holdings, or should the databases be adjusting their results to eliminate known errors?

Hi David – Crossref here 🙂

If you spot an anomaly in metadata, the publisher is the best place to start. At Crossref, we don’t change the metadata provided by our members unless we’re matching relationships. If a change is required, this needs to come from the member updating their metadata records. If we get issues reported to us, we pass this on to the relevant member and ask them to make the update themselves.

Many of the scholarly databases make use of Crossref metadata, so asking the publisher to correct any issues in the metadata they’ve registered with us is probably the easiest way for the correction to then feed through the ecosystem.

Thanks Amanda. I’ll start compiling my lists for various publishers….

I understand Crossref’s position; however, it is really challenging to get publishers to update metadata, and so these errors persist endlessly. Is there an alternative where Crossref (as a member service) could validate reported errors in publisher metadata and update them or query the record owner? People are willing to correct the record to improve data hygiene, if someone will give them the entry point. Personally, I’m not sure how we get to the research nexus without a tool for correcting metadata errors.

This caught my eye. I can confirm that I have witnessed a scientist asking for colleagues to be added to a paper they were reviewing. They were politely sent away, but their final remark was to say they were in the top 2%. I looked at their list: 10 months into 2023, and they had already co-authored over 40 papers. Though strangely, they did find time to actually review the paper!

One very strong bias you seem to have missed: Ioannidis weighted the various factors in a manner that, somehow, puts him in the top 50 scientists in the world. He has been in the top 50 since he first published his list, and it’s clear to some of us that his real motivation was precisely that: to promote himself.

Correct! And Ioannidis himself authored 82 papers in 2023. He is highly prolific himself, a phenomenon he has written about.
