Artificial intelligence (AI) has been creeping into our lives, first taking on complex, but narrow, tasks like optical character recognition or games like chess and Jeopardy. In recent years, the science of AI has advanced to a point where computers are assuming more meaningful and significant roles, like personal assistants for your phone and self-driving cars.
A claim of similar game-changing potential was made recently at the Frankfurt Book Fair by Meta, a Canadian startup based in Toronto. Its product, Bibliometric Intelligence, proved far better than seasoned science editors at identifying and selecting the best manuscripts, Meta claimed in a joint press release with Aries Systems on October 17:
Large-scale trials conducted by Meta in partnership with industry demonstrated that Bibliometric Intelligence out-performed tens of thousands of human editors by a factor 2.5x at predicting article-level impact for new manuscripts, prior to publication. It also performed 2.2x better than the same group of editors at identifying “superstar articles” – those that represent the top 1% of high-impact papers, prior to publication.
If Meta’s claim is true, the traditional model of editorial selection and peer review is ripe for massive disruption. As a senior editor quipped to me by email, “Now I can fire all my editors and run everything by algorithm!”
Intrigued by Meta’s truly significant claim, I wanted to read about their results in more detail. There was no link to the study in the press release, so I contacted Meta and Aries to request a copy of their paper. My request was denied on the basis that the company was in the process of publishing the results of the study and that I could get a copy of the paper once it was published. In its place, they sent me a brochure.
For a company heavily invested in selling its services to biomedical journal publishers, this response was more significant than the claim to upend the traditional editorial and peer review system. Why? Because scientific statements are supposed to be backed by evidence, not the promise of future evidence.
Meta could have dealt with this issue by releasing a manuscript of their findings, something that Academia.edu knew to do when claiming that their commercial repository increased the citation performance of uploaded papers. Nature knew this as well when they commissioned a citation study of open access papers published in Nature Communications. While I publicly disputed both studies, these organizations understood that they could not make bold claims in this market without a public document. No document, no claim. It’s that simple.
Biomedical publishers know this rule intimately, gave it a name (the Ingelfinger Rule), and often go to great lengths to time the release of a major study to coincide with a conference presentation and press release. If Meta was eager to announce its results to coincide with the Frankfurt Book Fair, it could have deposited its paper in a preprint server like bioRxiv or arXiv. It has never been easier or cheaper to get results out quickly.
Meta may indeed be conducting ground-breaking AI research that will truly transform the way scientific papers are selected and evaluated. Nevertheless, by following a marketing-first, results-later approach, the company has signaled to the scientific publishing community that it fundamentally does not understand scientific publishing.
To me, this was a major oversight.
Discussion
49 Thoughts on "Can An Algorithm Outperform Science Editors?"
I wonder, does saying that he/she will “fire” the editors mean that they are being paid for their work? Also, publishers and their consultants are now suddenly in favour of preprints and getting results out fast and for free? And finally, if editors, as you imply, are essential for selecting reviewers, perhaps they should stop asking for suggestions from authors. I am sure they would be the first to celebrate an algorithm that they could secretly use to select reviewers. And yes, “the traditional model of editorial selection and peer review IS ripe for massive disruption”.
If a reviewer's primary job is to say which submissions are worth publishing, then this algorithm might replace reviewers rather than editors. The editors would simply run the algorithm.
Depends on what "worth publishing" means. From the publisher's perspective you assume that it means attracting citations. From the scientist's perspective, however, anything scientifically valid is worth making public. Assessment of validity is therefore the essential process, and one that cannot be replaced by algorithms but needs reviewers. There is no discussion about that. The only open question is whether the selection of reviewers needs editors or not. It remains to be seen whether this or any other algorithm is capable of selecting more appropriate reviewers for any given paper than editors do. If this is proven, then editors become obsolete. I wouldn't worry about reviewers.
I have in fact developed an algorithm that beats editors in selecting reviewers, because it finds every researcher whose work is close to that being reviewed.
I do not know what you mean by validity, but most journals also include importance as a selection criterion.
No, but I think understanding the work being reviewed is a necessary condition for being a good reviewer.
I am not talking about a related field. My procedure finds and ranks everyone whose work is very close to the work to be reviewed.
Again, while subject area knowledge is necessary, it does not automatically make one a good reviewer. There is more that goes into a good editor’s decision process of finding the right reviewer.
Again, I do not claim to find good reviewers. I find highly qualified reviewers that the editor does not know about. Even the narrowest topic has far more researchers globally than anyone can know about. That is the problem that my procedure solves.
My understanding is that journals typically maintain databases of potential reviewers listed by broad categories. My approach is different, namely to find just those potential reviewers who are most familiar with the work in question, ranked by closeness.
Again, I do not claim to find good reviewers.
One would think that the phrase, “I have in fact developed an algorithm that beats editors in selecting reviewers,” would suggest just that.
Make it finding the most qualified reviewers then, from all the published researchers in the world. That is my concept of selection. Repeatedly using a few well known reviewers is not much in the way of selection.
My guess is that editors do not go out de novo and find reviewers who are specialists in the work reported in each article, which my tool would facilitate. Rather they draw upon an established stable of “good reviewers” who are just in the same general field. That these reviewers know much about the article’s work and world is unlikely, but they are easy to find. It is an efficient system but the reviews are likely to be relatively shallow, which creates its own problems.
My approach envisions a more precise review system, but it would also be more complex and probably take more effort on the editor’s part.
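To make the idea of ranking by closeness concrete, here is a toy sketch. The submission and the candidate abstracts are invented, and the bag-of-words similarity is only a stand-in for my actual procedure:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using plain word-count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented example data: a submission and candidate reviewers' recent abstracts.
submission = "crispr gene editing off-target effects in human embryonic stem cells"
candidates = {
    "Reviewer A": "off-target effects of crispr cas9 editing in stem cells",
    "Reviewer B": "deep learning models for protein structure prediction",
    "Reviewer C": "gene editing ethics and regulation in human embryos",
}

# Rank every candidate by closeness to the submission, highest first.
scores = {name: cosine_similarity(submission, abstract) for name, abstract in candidates.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

The real work, of course, is in building the candidate pool from the global literature rather than from a journal's existing stable of reviewers.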
My experience would suggest that your guess is an over-generalization, and that practices vary widely at different journals, and that editors have an array of methodologies and tools that they use to find the right reviewers for a paper. Some do a better job of this than others, which is why some journals have a better reputation than others.
I wonder, does saying that he/she will “fire” the editors mean that they are being paid for their work?
I believe that comment was made in jest and not meant to be taken seriously. That said, many editors are indeed paid for their work, although this varies across journals and fields. Privately owned journals very regularly pay editors, and research society journals vary in their practices: some offer a stipend, others see editorship as a service to the community. Some journals are run by full-time editors, who do this as their main profession and are paid a salary. You can look at the 990 forms of many of the not-for-profits out there to see what they are paying their editors (eLife and PLOS, for example).
Mr. Pandelis, for the sake of clarity (and given your acerbic tongue), you could have mentioned that you are with Open Choice, a group that advocates I know not what. I am wondering how those tools you tout, which will eliminate reviewers, are coming along.
You do not have to be paid to get fired. No associate editor I have met or worked with ever sought money for their efforts. They found other tangibles and intangibles far more valuable than money. But then again, you obviously regard money as the prime motivating factor for what you do. Pity!
Preprints date back at least 20 years. It was the web, and the lack of any need to move paper manuscripts, that made preprints possible. Preprints are paid for either by the author, if in an OA journal, or by a subscriber, if in a subscription journal. But someone pays. You know why? Because there is no free lunch!
BTW, I have worked with some of the highest-rated IF journals and I never requested that my editors ask authors for reviewers. If a journal does use someone from a list provided by an author, I do not believe the editor would say, "I have an article by Pandelis, who suggested you review it; do you want to review it?" That sort of thing takes the blind out of peer review!
But as an advocate of OA, I am sure you think that because you read something for free, it is free. I hate to burst your bubble: it is not free; it is just that someone else paid to allow you to read it for free.
Lastly, why do you accept a salary? Wouldn’t it be more noble to disseminate your knowledge for free!
The first citation to a preprint that the group writing this review article found dates back to 1922:
Larivière, V., Sugimoto, C. R., Macaluso, B., Milojević, S., Cronin, B., and Thelwall, M. (2013). arXiv e-prints and the journal of record: An analysis of roles and relationships. arXiv:1306.3261
Dear Harvey, I would probably be able to offer a better reply if I could understand which parts of my comment touched which sensitive chords of yours to produce this irrational rant full of misunderstandings. But let me try anyway.
It is obvious you do not know anything about “Open Choice” (interesting name maybe for some future project) if you think we want to eliminate reviewers.
I don’t believe editors should be paid and it’s really hard to understand where you got this impression from.
I don’t “equate money as the prime motivating factor for what I do” so there is no need to feel “pity”. Again, I wonder where that came from!
Preprints are free deposits of author copies of manuscripts made accessible for free (nobody has to pay).
I do not suggest that editors should ask authors to recommend reviewers, but the fact that they do (at other high-impact journals you have not worked with) undermines their role as authorities on selecting reviewers.
What authors self-archive I read for free, and no one has to pay to allow access: not authors, not subscribers.
And finally, I accept my salary for conducting the research which later I disseminate for free.
Hope this helps!
As an experienced editor of a science journal, I once tried highlighting in my blog those papers in the recent issue I thought would be of most interest to our readers. One might think these would become the most cited, right? Looking retrospectively at cites, my skill level at picking highly-cited papers was little better than a monkey with a dart board! Very humbling. I suppose AI might be able to do better, but I will believe it when I see the peer-reviewed results.
We did a study of this for the last Peer Review Congress. In a year's worth of research papers, there was little correlation between the papers with the most views and the papers with the most citations.
While a very few had both, the scatter plot showed a blob at the null, then three strings out along the axes.
We’re a clinical journal so that makes some sense. The science tends to get more citations, and the clinical or epidemiology more views.
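For what it's worth, the check itself is simple. Here is a sketch with invented numbers, not our journal's data, chosen only to mimic that blob-and-strings pattern:

```python
from scipy.stats import spearmanr

# Invented per-article page views and citation counts (illustration only):
# a blob of low/low papers, a few highly viewed but rarely cited, a few
# highly cited but rarely viewed, and one paper high on both.
views     = [120, 140, 150, 180, 200, 220, 250, 260, 300, 310, 340, 7600, 8200, 8800, 9500]
citations = [  0,   1,   2,   0,   1,   2,  11,   1,  14,   0,  16,    2,    1,   18,    0]

# Spearman's rho measures how well the view ranking tracks the citation ranking.
rho, p_value = spearmanr(views, citations)
print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")
```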
It is hard to see how the trial they describe can have been done. It would require that, over the last five years or so, tens of thousands of editors actually predicted the impact of their published papers, prior to publication, using a uniform system of ranking. Has this really been done? I suspect that Meta is just interpreting proxy data, which may be questionable. What they actually did will be very interesting to see.
I first heard about this tool a couple of years ago, so it’s not like there was not time to write something up. Let’s assume this “AI” works. My understanding is that it just analyzes the citations and if previously published and highly cited papers are included it determines that this new paper will be highly cited as well. Please correct me if this is not correct.
IMHO, all this does is encourage "citation rigging" (copyright me if no one has used that term already). SOME authors will load their references up with the "top" articles, whether appropriate or not, in an effort to increase their chances of being accepted. SOME editors will start only publishing submissions with predictions of high citations. New research or niche studies might be rejected out of hand because they don't cite the "right" stuff. Ethical editors and journals will find an already murky and time-consuming peer review process made even more time-consuming.
Having said this, I do see the potential for a tool like this to be used in a positive manner. What if, instead of checking for high citations, it checked to see if the paper being cited had been retracted and alerted the managing editor or EIC? That’s something I think editors would find useful.
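A minimal sketch of that idea, assuming the journal keeps or licenses a simple list of retracted DOIs (for instance, an export from a retraction database); every DOI below is made up:

```python
# Sketch of a retraction check on a manuscript's reference list.
# Both sets of DOIs below are invented for illustration.
retracted_dois = {
    "10.1000/journal.fake.0001",
    "10.1000/journal.fake.0042",
}

manuscript_references = [
    "10.1000/journal.fake.0007",
    "10.1000/journal.fake.0042",  # this one appears on the retraction list
    "10.1000/journal.fake.0139",
]

flagged = [doi for doi in manuscript_references if doi in retracted_dois]
if flagged:
    print("Alert the managing editor: this manuscript cites retracted work:")
    for doi in flagged:
        print(f"  - {doi}")
else:
    print("No retracted references found.")
```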
I am skeptical that articles citing highly cited articles are then usually highly cited, especially because most articles do this. Still it is a workable algorithm. However they mention using a great many factors so I imagine it is much more complex. But I doubt they will reveal the algorithm, just the supposedly successful trial.
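To see why I call it workable, the naive version is trivial to write. This is pure speculation about the approach, with invented citation counts and an arbitrary threshold, certainly not Meta's actual and presumably far more complex model:

```python
# Toy sketch of a purely reference-based "impact" predictor (not Meta's method).
# The citation counts and the threshold are invented for illustration.
citation_counts = {
    "ref-001": 1200,
    "ref-002": 8,
    "ref-003": 450,
    "ref-004": 3,
}

def predicted_impact(reference_ids, threshold=200):
    """Call a manuscript 'high impact' if its references are, on average, highly cited."""
    counts = [citation_counts.get(ref, 0) for ref in reference_ids]
    mean_citations = sum(counts) / len(counts) if counts else 0.0
    label = "high impact" if mean_citations >= threshold else "ordinary"
    return label, mean_citations

label, score = predicted_impact(["ref-001", "ref-003", "ref-002"])
print(label, round(score, 1))  # note how easily this is gamed: just cite "ref-001" again
```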
What I see as a huge problem in this is that Meta, and even some commentaries on their study, seem to equate "good article" with "highly cited article". Such a point of view may benefit the scholarly publishing industry, but it will definitely be (and indeed is) detrimental to science and to the dissemination of science.
I think that the idea that highly cited articles are good articles is generally accepted.
Really? Maybe among publishers and administrators that have bought the idea of new public management. However, I assure you that most active scientists do not see it that way, while they are painfully aware that decision makers often do. There is a correlation between the scientific quality and the citation statistics, but the latter is certainly not a measure of the former. It is deeply worrying if you believe that the number of citations is an absolute measure of how good an article is.
This is an important point; indeed, papers found to be a train wreck have been highly cited precisely as examples of that train wreck. Highly cited papers also appear to be just as likely to be retracted.
My issue with predicting what might be highly cited is that peer review and curation of content should be a human task. When I worked on an oncology journal the editor explained the difficulty with Phase 1 or 2 trial papers. He said that the trick was to try to determine which were going to become phase 3 or 4 papers and which were likely to be dead in the water. This is not an easy thing to do and I don’t believe an algorithm can figure that out.
I would hate to give an editor a tool showing them that a paper is "likely" to be highly cited and have their content decisions influenced by this. You want their review to be pure. We want them to think that they are terrible predictors of what will be highly cited. Why? So they give as many papers as possible a fair chance.
That highly cited articles are good articles is a statistical rule, not a law of physics, so a good correlation is sufficient. Nor are rare exceptions such as negative reviews a counterexample. In fact, much of scientometrics is based on this rule.
Beyond that, if the Meta algorithm actually worked, which I doubt, then I would prefer it to the opinion of a couple of reviewers. Peer review is a very haphazard evaluation methodology. It basically draws a strong conclusion from a sample of just two or three reviewers. Statistical sampling theory says this is a very crude way to go.
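To put a number on that crudeness, assume, purely for illustration, that each reviewer independently reaches the "right" accept-or-reject verdict 70% of the time. A majority vote of three then gets it wrong more than a fifth of the time:

```python
from math import comb

def majority_correct(n_reviewers, p_correct):
    """Probability that a simple majority of independent reviewers reaches the right verdict."""
    return sum(comb(n_reviewers, k) * p_correct**k * (1 - p_correct)**(n_reviewers - k)
               for k in range(n_reviewers // 2 + 1, n_reviewers + 1))

# Illustrative assumption only: each reviewer is right 70% of the time.
for n in (3, 15, 45):
    print(f"{n:2d} reviewers: panel verdict correct {majority_correct(n, 0.70):.1%} of the time")
```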
I think that this deserves to be followed. I looked at the Meta (alpha) site, and while they seem to want journal editors, it looked like they encouraged authors to do the reverse and find a journal with a high impact factor that would consider their article. Thus it would seem that the training algorithm takes into consideration other criteria, such as subjects covered, etc., a more complex matrix.
This would imply, as with many AI search engines, that there is the capability to scan the article, from title and keywords to relevant words and the entire context in which it is embedded. As Pandelis suggests, why worry about reviewers? Remember that if a task can be broken into finite steps, then it is a potential task for AI to manage.
In the biomedical area, a number of companies are already using AI deep-learning search engines to do basically this type of task, and not just on text.
What makes this important for academic journals includes:
a) the ability to accept articles from a variety of areas and sort them into piles, called journals, or, better, to create a database of articles searchable by researchers without trying to fit each into an academic box.
b) the ability to effectively search, abstract, and compare across databases to determine redundancies, where authors try to stretch one piece of data into several articles or where others have lifted ideas whole cloth. This is similar to what university faculty use to detect plagiarism.
As some publishers have noted, and acted upon, researchers search for articles, which can appear in any of several journals. Hence, they don't wait for a journal issue to appear and peruse its table of contents. Early cite is one response to this. If Meta does the above, then it reinforces the idea, expressed in these comments, that journals are in need of a serious overhaul. Large, rapidly searchable databases of articles or proposed articles (preprints, perhaps) can be reviewed and even sent to "editors" or designated managers, then tagged and entered into the searchable database. It's not just journal management that needs an overhaul, but academic journal publishing in the era of Watson and its descendants.
Phil,
On Tuesday I attended the STM meeting in Frankfurt. At this event publishers were (rightly) told that if they don’t assume more technology risk they will become irrelevant. It was explicitly acknowledged that with risk comes the possibility/likelihood of failure.
In this context the spirit of your article was disappointing.
Why not applaud investors willing to risk millions of dollars to explore the potential of a new technology? Why not applaud entrepreneurs willing to devote their lives to unproven ideas? Why not applaud companies that offer their customers the opportunity to try new solutions?
OK, there is the possibility of failure. Such early stage new initiatives may not meet the high standards of academic proof. But to validate new ideas there needs to be engagement with users. Without promotion there cannot be engagement. So, promoting a new idea is not shocking.
Through a process of iterative development, Meta may or may not turn out to be a transformative technology to help (not replace) editors and reviewers in ways we cannot yet predict, but it’s an exciting possibility worth exploring.
Richard Wynne
VP Sales and Marketing
Aries Systems Corporation
Richard,
Thanks for commenting. I've spent my entire professional career working in scholarly communication, as a science librarian, graduate student, researcher, and now, as a publishing consultant. I'm trained to look critically at new products and skeptically at new discoveries. You may have much more knowledge about what Meta is doing, as you have integrated their product into your workflow. For Aries, I hope this is a successful and profitable relationship. My primary responsibility is to publishers, many of whom come to me to ask about the efficacy of new products and the validity of new claims. Meta may be doing truly ground-breaking work, but I can't see it. What I can see is that they are making fantastic claims without showing their work, a behavior that does not follow the values of science and science publishing.
If you want to applaud venture capitalists for investing in new technology for the sake of new technology, I’ll leave this role to you. I’ve been trained to think like a scientist, not a businessman.
There is a difference between being supportive of innovation and taking an unproven claim as fact. If you're going to go public with a claim, you should be ready to back it up. This goes doubly in our world of scholarly publishing, where we hold authors to such a high standard of proof.
Here's a comparison: Theranos (http://www.forbes.com/sites/matthewherper/2016/10/08/bad-blood-the-decline-and-fall-of-elizabeth-holmes-and-theranos/#739520e17400). Should we have applauded the entrepreneurs behind that company for being willing to devote their lives to unproven ideas or for trying new solutions? Or should we have asked them for factual proof that the technologies they were selling were actually functional?
Phil/David
Meta has put their product out for the public to use. That means that a journal editor, an author, or numerous others in the field can test whether their claims are substantive and useful. This is what was done in the old days, when researchers shared their findings and invited others to test their validity, along with their opinions. That's science, and how the work of others can be validated.
The academic community is conservative, often to protect against ideas that might counter or challenge an existing paradigm. The literature is full of cases where the idea is shot full of holes like Swiss cheese and then tossed in the dust bin. It's safe. That is not science.
I doubt whether any of us ask to understand how Google's algorithms, or those of other search engines, work, or whether and where there might be bias.
The question one might ask is whether the introduction of Meta, or of future products of a similar nature, challenges how scientists work, or the publishing industry's vested interest in maintaining, not peer review itself, but the current matrix in which it is bound. Phil's critique, as a consultant to the publishing industry, is focused on his client base and not on the academics who, with further development, might embrace a vehicle such as Meta as a way of improving the evaluation model.
The recent article about Theranos in HBR is worth a read: http://tinyurl.com/zube6jp
Fortunately, the Meta claims can be tested and then critiqued whereas Theranos never got to that point since it was still a venture play.
Your analogy is flawed. The Meta product is not publicly available; it is available for purchase (or at least as part of a service contract). That means that if one wants to test it, one must pay for the privilege, which is not the case when researchers share their findings.
Further, this phrase, “researchers shared their findings and suggested testing the validity for others” is exactly what is being asked here by the author of this post — share your findings, which the company has so far refused to do. It’s a bit like a scientist holding a press conference to announce their breakthrough without writing up the paper on it or releasing their data, but then making a product based on the claim available for purchase. In both cases, we should look very skeptically at any such claims until evidence is offered.
That’s how science works, not the promotion of unsupported claims.
I may not have entirely understood what this algorithm is doing, but if it’s predicting citations it must largely be ‘predicting’ which journal a paper is published in – as is often noted here, the best predictor of future citations is the publishing journal. Within a journal, the number of aggrandizing statements the reviewers and editor permitted the authors to get away with probably reflects its perceived importance. I very much doubt that the algorithm has any understanding of the current state of each field of science and how much an individual paper constitutes ‘progress’. It would be amazing if they have achieved this.
This algorithm is perhaps like using (sophisticated) facial expression recognition software to monitor the judges’ faces at a baking competition, and using that to predict the winner. The algorithm knows nothing about the cake (what constitutes scientific progress), and is entirely reliant on the expertise of the judges (the editors and reviewers) to guess the outcome. It then just reflects that information back to us, drawing from its extensive database of delighted or disgusted faces to do a better job of telling us something we sort-of knew already.
We have not seen the algorithm, but given that it is supposed to be used by individual journals, the journal per se cannot be the source of the article impact prediction. Moreover, most of the articles in even high ranking journals receive few citations, while a few receive many, so the journal is not a good predictor of impact of an article.
These exchanges have been based, primarily, on a post by Phil with few details from the company. At the present time it seems less important to know what the magic sauce is inside the software. Rather, there seems to be sufficient interest from the spectrum of readers and commenters on this blog to make a case to the provider to allow an evaluation by one or more individuals who represent potential users of the technology.
May I suggest that one of the "cooks" ask the company to allow a test/demonstration and then report back here on how this system performs.
Getting a demo won’t be helpful if their algorithm is black-boxed. I need to see the paper, which should have been made available at the time of the press release.
The PR cites an extensive trial with positive results. We need to see the trial method and data, not a demo.
No one is asking for a “demo” but rather a trial that can be conducted by a potential client, preferably a publisher rather than an individual journal.
As David notes, the "magic sauce" of a commercial product is probably proprietary, which makes sense. What matters from a user's perspective is whether or not the product delivers as promised and performs to the needs of the end user.
One can understand why a consultant would want to access the algorithm and inner workings.
Go back to the original press release that prompted this blog post. The trial you describe has allegedly already been performed. If the company behind the product is going to use it to tout the efficacy of their product, why not release that information?
Why put together a multi-year set of experiments if they’ve already been done?
Thanks, David. My experience with a variety of different "products," particularly in emerging areas, goes something like this:
a) nice demonstration, but will it do "x", "y", or "z", or maybe all of these?
b) the company goes back and demonstrates the above, only to show now that it does 1, 2, or 3
and so on
In the case of software, the company is often willing to provide access to a potential client. This does not necessarily require that the client undertake extensive testing, though that might be an option, but rather that there be sufficient interest in a trial.
The other option is to sit back and wait for others to be the first to take the initial commercial plunge and, if successful, be the early adopter with an edge. This is the standard adoption process which may or may not favor first users.
Now, if I were a consultant here, I would find a client interested in adoption, get a contract and head to the vendor, contract in hand, where there are some clearly definable validations which would move the client towards, at minimum, a validating trial. After all, there are other potential competitors to this vendor and lack of adoption can be problematic.
If I were a consultant, I’d need more convincing proof to take the step of engaging one of my clients and asking them to spare precious funds, time and effort on a new product (particularly since there are so many new products and services constantly being offered). Actual data backing up the claims made in the marketing materials for that product would go a long way toward convincing me whether it is worth bothering my clients and putting my reputation at risk.
As Humpty Dumpty says, "Which is to be master." That is the risk for a consultant. Right now this is in the hands of the company selling the product to convince a buyer. This was the purpose of the presentation: generate enough interest to get into the buyer's office. If the buyer wants a filter for all the products being pitched, then they will pay the consultant as a more efficient alternative. The question for the consultant is, "What am I trying to sell, and to whom?" And therein lies the problem with the original post and the resultant thread.
Speaking from the publisher side of the equation, not a single day goes by that I don’t receive a pitch asking my company to demo some new technology that is going to “revolutionize” some aspect of publishing. Some days I get dozens of these pitches. Very few (almost none) turn out to be worthwhile. To get any further in the door than a deleted email, the seller has to make a very convincing pitch, and that usually includes being able to back up any claims made. If you can’t do that, then you’re just one of the hordes of businesses that I’m going to ignore.
It’s no skin off my back if I don’t do a trial with company X, don’t spend the extra money, don’t spend the time, effort and opportunity cost on a product with an unconvincing case. The company selling the product is the one at risk here, the one that needs to do the work. I don’t regularly work with consultants, but when I do, I want them to maintain these same high standards and not waste my time.
And that hasn’t been done here. I look at a company that makes public claims and doesn’t release the data to back them up the same way I look at a scientist that makes public claims but doesn’t release a paper or data to show that they’re true.