I had the pleasure of speaking at the Annual STM Meeting yesterday in DC on a panel about Trends in Peer Review. My co-panelists included Annette Flanagin, Executive Managing Editor and Vice President, Editorial Operations, JAMA and The JAMA Network; John Inglis, Founder, bioRxiv; and Bernard Rous, Director of Publications Emeritus, ACM.
My topic had to do with crowdsourced review and comments. This topic has been covered in The Scholarly Kitchen here, here, and here; but I thought, what the heck, let’s do it again. This post will summarize my presentation as well as capture some of the resulting discussions.
I was asked to talk about crowdsourced peer review so my first task was to define what that actually means. Crowdsourcing in general means to solicit contributions from a large group of people, particularly in an online environment. Crowdsourced peer review has been defined as “a public review process in which any community member may contribute to the article review.”
There is an inherent contradiction in this definition. The first point indicates that the activity is wide open, with anyone able to contribute. A Kickstarter project looks for crowdsourced funding but doesn’t limit who can actually give money (with the exception that those who give must actually have money and be legally permitted to donate it online). The second point restricts the process to a “community member.”
Crowdsourced means open to all. Peer review means restricted to peers. We already have a problem with the concept.
Yet another term has emerged and that’s “post-publication peer review.” This is different from “open peer review,” which may or may not happen in real time after publication. Many journals that are practicing open peer review are doing so by exposing reviewer names, or publishing reviews without reviewer names, or publishing reviewer names and reviews alongside the paper. The actual peer review activity is still mostly being performed behind closed doors and then once accepted, all the laundry (dirty and clean) is aired.
I found a reference to post-publication peer review from 1999, when Kent Anderson introduced it at Pediatrics. He later surmised that it turned out to be nothing more than a new marketing term for eLetters. The language used to describe reviews today is just as messy.
Once I dove into these definitions and started looking at available examples, one thing became clear: Crowdsourced peer review = post publication peer review = online commenting.
Some distinction could be made based on the format of the comment (a free-for-all box of text versus a structured form of questions and considerations). Despite the variations in format, some open text boxes were as comprehensive as a structured review, and some formatted comments were very superficial.
Traditionally, journals have included a way for scholars to address or discuss published papers. This is, after all, a core mission of the journals. Letters to the Editor or Discussions and Closures are one way to respond but there is usually a fairly short time for submitting comments and there can be a long lag time to see these commentaries in the journal. On the plus side, these commentaries are usually reviewed and therefore should contain new, valuable information.
Online journals provided a new way to get the discussion going in real time. Some journals tried online comments and declared it a failure for many reasons (low engagement, low quality of comments, moderation required, etc.). Despite mediocre results, journals and third-party databases and platforms continue to try and engage readers in rich commenting online. I will explore a few below.
BioMed Central launched in 2002 with capability for online comments. An analysis done by Euan Adie for Nature’s blog found that after a year, only 2% of the papers in BioMed Central had comments. Of those, 24% were from the author; 22% were journal club type comments; 17% were direct criticisms; and 17% were additional links or citations.
BioMed Central has since moved to a new platform and did carry over comments in some format, though I was not able to find any. Commenting does not exist now, but apparently something new is in the works. In preparation for this presentation, BioMed Central confirmed that comments on the older platform suffered “low usage and poor usability.”
Later, in 2008, PLOS announced commenting. Adie did a similar analysis for PLOS One and found that commenting was more frequent, with 18% of the papers having comments. He found that about 40% of the comments came from authors; 17% read like journal club interpretations; 13% were direct criticisms; and 11% were seeking clarification.
So here we had two open access behemoths trying out commenting with varying degrees of success. Both are journal platforms that provided a format for discussion alongside the paper. In 2012, F1000 Research launched and pushed this concept to a new level.
With F1000 Research, papers are published online and then opened for comments AND reviews. Reviewers are solicited and once a paper receives two thumbs up (or green check marks) it is indexed as being part of the journal. Until then, it’s in a weird limbo that’s not quite a preprint and not quite a journal article. I tried to see how many papers had just reader comments but was unable to filter the search. Also, staff at F1000 told me that they don’t have that information. I took a look at a dozen or so of the top viewed papers and found no comments. Engagement appears low.
Despite lackluster success on journal platforms, large databases see some value in adding comments functionality. PubMed Commons launched in 2013, allowing comments on all papers indexed in PubMed. To date, about 6,400 of the 27 million articles in the database have comments. Commenting is restricted to those who have a paper indexed in PubMed — not exactly open crowdsourcing, but a huge audience nonetheless.
ScienceOpen is another database that’s trying to take on Web of Science and Scopus. They have accumulated an impressive 28 million records, mostly from PubMed, ArXiv, and Crossref. Anyone can register and comment on or review a paper. There is a distinction in that reviews are a typical structured review and comments are just free form. Of the 28 million records (3 million of which are open access), only 11 papers have reviews (excluding papers published in their own journal, for which they post reviews). I am not clear on the number of papers with comments, as it’s one of the only things their impressive search interface does not allow me to filter on.
The rogue commenting site that can’t be ignored is PubPeer. This is not a database of papers but rather a database of papers that people want to comment on. While comments are moderated, it is the only site that flat out allows comments to be anonymous. It’s hard to tell how many people have come to comment on how many papers. This is partly because they are pulling in all the PubMed Commons papers.
Another complication for PubPeer is that a PhD student wrote a program called Statcheck that scans psychology papers looking for statistical errors. The intent is to scan 700,000 papers. As of September of last year, 50,000 of these reports and their paper records were dumped into PubPeer. The comments/reports state that there were no errors if none were found. For some critics, this kind of large-scale inclusion of comments that don’t add anything to the literature is a waste of time and dilutes the usefulness of the database.
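Statcheck itself is an R package that extracts reported test statistics and recomputes their p-values to check for inconsistencies. As a rough illustration of that core idea only, here is a simplified, standard-library-only Python sketch; the function names, the z-test-only scope, and the tolerance are my own simplifications, not Statcheck’s actual implementation:

```python
import math
import re

def two_tailed_p(z):
    """Two-tailed p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def check_report(text, tolerance=0.005):
    """Scan text for reported 'z = ..., p = ...' pairs and return a list of
    (z, reported_p, recomputed_p) tuples where the recomputed p-value
    disagrees with the reported one beyond the tolerance."""
    errors = []
    # Match APA-ish strings like "z = 1.96, p = .05"
    for m in re.finditer(r"z\s*=\s*(-?\d+\.?\d*),\s*p\s*=\s*(\.\d+)", text):
        z, reported_p = float(m.group(1)), float(m.group(2))
        actual_p = two_tailed_p(z)
        if abs(actual_p - reported_p) > tolerance:
            errors.append((z, reported_p, round(actual_p, 4)))
    return errors
```

Run over a consistent report such as "z = 1.96, p = .05" this returns an empty list; an inconsistent report such as "z = 1.00, p = .01" is flagged, since the recomputed two-tailed p is roughly 0.32. Scaled up to hundreds of thousands of papers, even this trivial check produces the flood of "no errors found" reports the critics above object to.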
Researchers and the general public have many other ways to discuss scholarly content. Blogs, discipline-specific news sites, and mass media coverage of published literature may or may not add value to the reader experience.
The basic question remains: why don’t more people comment on these widely available platforms? The short answer is time, but here are some other thoughts:
- Some readers are skimming papers looking for a “bit” of information. This is not conducive to writing a comprehensive comment.
- Many comments are left “hanging.” The authors aren’t looking for comments on every platform and they aren’t necessarily interested in responding to comments after they have moved on to other projects.
- Trolls — not worth the time arguing with one comment that is out of left field.
- No reward or motivation to comment.
- Very public when not anonymous. Early career researchers or those seeking tenure may not want to engage in criticism this way, though they may be very open to asking for clarifications.
- Women and non-native English speakers, who already face biases, may be even more reluctant to participate, severely limiting the discussions to a singular group.
- Anonymous commenting can easily fall into personal ax grinding.
- Non-anonymous commenting can easily fall into personal ax grinding.
The next question is whether anyone reads the comments. This is harder to ascertain, as I have no usage stats. That said, journals that cancelled comments for “lack of engagement” and the overall lack of comments on one-offs seem to indicate that people don’t read them.
With reviews of products being one of the most redeeming qualities of the internet, why would readers not find random reviews of articles helpful? This question came up recently — why not have Amazon-style reviews for journal articles? I can think of a few reasons why comments/reviews are not helpful to readers:
- Science tells us that we are swayed by comments and reviews. Your attitude as a reader changes once you read the comments. If a product or a vacation spot has 150 reviews, you want to know how many were 5 star and how many were 1 star. You read the 5s and the 1s and figure you have all you need because who has time to read 150 reviews. If a paper has 3 comments, is that helpful? The dataset is too small.
- Comments are only as valuable as the exact moment you read them. What I mean is: when you read the paper today, it has two glowing comments. If you go back in a month, it may have accumulated several critical comments. How often are you going to go back?
- Conflicts of interest (COI) are not disclosed. PLOS’s was the only policy I saw that required commenters to make a COI statement before commenting.
- Identity of commenters and moderation are weak. Most of the journal systems require registration but also make it clear that they don’t validate that you are who you say you are. Further, many do not moderate and depend on community policing to flag comments.
- Time — there it is again. What would you use Amazon-like comments for? Finding references to similar work? Great. Weeding out bad papers whose topics are relevant to your work? Well, that’s not so easy. Do no comments connote that a paper is bad? Do excessive comments? Do you have time to read through excessive comments to watch 2-3 people argue back and forth about something? It may be faster to just read the paper.
What exactly is the goal? Why explore offering comments anyway? For one, it’s part of the fiber of scholarly work. It is meant to be debated and discussed. Opening forums for discussion may encourage collaboration. For publishers, it may make a journal article page “sticky.”
What’s missing from commenting sites specifically on megajournals, database sites, and third party sites is community. While journals build a community, they also protect themselves from outside comments. Discussions happen in closer settings — at conferences, at workshops, through formal journal activities, etc. Societies are building these online communities using tools like Higher Logic or AAAS’ Trellis platforms. Journals could brand spaces on these platforms or look for other tools to build that community around content. Another promising new product is Remarq, which allows editors to pose discussion questions to solicit comments on a specific aspect of a paper.
It feels like there is a lot of re-inventing the wheel. Whether it’s called comments, or crowdsourced review, or post-publication peer review, we have yet to see a successful example of quality comments and high engagement.
What do you think? Should we keep trying to force the issue? Or should we let it go and look for something different? Do you have any examples of commenting on scholarly content that works?
30 Thoughts on "Should We Stop with the Commenting Already?"
Great post, and a great presentation yesterday at the STM meeting in DC.
There are venues where comments work. When I was trying post-publication peer review (P3R) at Pediatrics back in the late-1990s, with mixed success, BMJ was piloting Rapid Responses, with major success. Theirs is still going strong. The general conclusion was that British culture is more about letters than American culture, but I’m not sure it’s that simple. However, there is an example of commenting that has worked for a long period of time.
When I was at NEJM, the editors and publishers mounted many interactive initiatives, most of which worked very well. From an online Images Quiz, which went from a few dozen interactions when it wasn’t very fun, to tens of thousands per week when it was made fun, we learned a lot of lessons. These motivated the editors to launch Clinical Decisions, a feature that is going strong after I think more than a decade. It is editorially-led, and features polling, guided discussion, and comments. By letting the editors and relevant authors frame the engagement, readers/users can start with a poll, and move into full engagement with the content in a way that is more satisfying and interesting than being stared at by an open comments box. This approach also makes trolling less likely, because the discussion is bounded.
NEJM’s Clinical Decisions actually informed a lot of what RedLink is now doing with Remarq, which you mention. It’s not surprising to me that most “comments-only” approaches struggle or fail. You need multiple points of interaction to draw readers in with different options for engagement, and you need to give authors and editors tools to frame engagement. Remarq offers polling, private annotations, public comments, author updates, article-sharing, article-following, and editorial updates. This allows for a lot of options around creating engagement for editors and authors, and for engaging with content for users and readers.
Commenting is part of interaction and engagement. I think offering more options and more ways for editors and authors to frame and stimulate engagement can really work.
Seems to me a main purpose of inviting comments is to have a lighter, faster alternative to the declining Letters to the Editor form in journals. As Angela described, commenting at the individual-article level on the journal platform doesn’t seem broadly successful. Glam journals like Nature or Science seem to attract high numbers of comments, especially on their news features, but the quality of the comments is often in stark contrast to the overall quality of the journal. Lots of trolls.
Traditionally, a Letter to the Editor was a mainstay for readers to point out substantive criticisms and alternative points of view. The authors are given an opportunity to reply, with both letters receiving external review. While by their nature they may be highly critical, the discourse is expected to be directed at the work, not the workers; a polite tone is expected and is enforced through the editors’ moderation. Conflicts of interest are expected disclosures, just as with an original article. Letters to the Editor may be in decline, with many journal editors declining to publish them (as they don’t help the journal impact factor), demanding payments, requiring unnecessarily onerous processes, or not even responding. See for example here, here, or here.
Hopefully some of the newer aggregators such as Remarq or PubMed Commons will stick, for I think there is value in something short of a tedious Letter to the Editor, yet more open than PubPeer.
Ironic to be commenting on a blog questioning the value of commenting.
1) Do we know the types of articles that receive comments? Broader interest versus narrow and technically specific?
2) Do we know the readership: persons who read in the specialized field, or persons who find the articles through scans across journals (horizontal rather than narrowly focused on articles in the journal)?
3) Is there any analysis of the comments with respect just to the article or towards a broader context?
A bit, though the data is old. This is an analysis published on the types of comments made at BMC and PLOS One early on. I don’t have newer data.
You have hit on a lot of the challenges facing post-publication peer review, and I believe your conclusion that community plays a central role is absolutely correct. At ScienceOpen we address the issue of identity and moderation by requiring all participants to log in with their ORCID and peer-review or comment with their real identity. We aim to create a civil and transparent communication environment. We distinguish between casual communication and questions about a paper and a more serious “peer review” format, to make clear that we hold peer review to higher standards. To peer review an article on ScienceOpen requires 5 published papers, and our form includes a field to declare conflicts of interest. We have now begun to address the creation of communities with post-publication curated collections across journals. But there is certainly still much work to do!
Thank you for those clarifications. Just to add, I can register to use the site without providing an ORCID but I cannot add content such as a review or comment without one?
Yes, that is exactly how the registration process on ScienceOpen works. You do not even need to register at all to use any of the search and discovery functionalities on the platform. But to add content such as comments or reviews requires an ORCID, for transparency.
If you are comfortable citing “some critics” without sources, then you should also be OK with anonymous comments. Same thing. The phrase “some critics worry” in unattributed sentences in everyday journalism is a well-known ploy to inject reporter bias.
Thank you for an excellent summary and discussion. There seems to be a fundamental tension between those who wish to make science more open and those who wish to make it more meaningful. Setting the right barriers to entry seems key. Nevertheless, they need to be specified, through policy, for each venue. F1000 may deliberately set a very low barrier for participation, while a top-tier medical journal may set a very high barrier. Given such language ambiguity (which is likely intentional), using words like peer review and comments interchangeably is not helping the overall cause.
fwiw, the commenting I see on Scholarly Kitchen articles is about the most on-point of any publication I regularly read. Comments here are frequent, insightful, and relevant. Whether that is due to the pre-existence of an engaged community, or the size of the readership, or the format/tone of the articles themselves, or something else, is worth thinking about.
Finally, a thought: if comment functionality were removed from your online journal content, would your authors miss it? Would your readers complain? Would your editors miss it?
Sites that allow readers to comment about a piece on their own social media accounts (e.g., ‘share/comment about this article on facebook/twitter, etc.’) seem most appropriate. This allows social media platforms to host comments, leaving the scholarly publication to be purely devoted to truly peer-reviewed content only.
I think mentioning channels (in this case, social media) is an important angle not much addressed in the original article. I don’t find comments generally helpful (or, given the post, much used), probably because they seem a poor way to interact with the paper, given the format and context. Put it on a social media channel, however, and the kind and amount of interaction/engagement is filtered by the channel, so the expectations are different. Following this approach could “free up” publishers (or orgs) to focus solely on more substantive ways to facilitate engagement, like Remarq and ScienceOpen seem to be going towards.
I would say that comments in bioRxiv are an example of comments that work.
Perhaps Richard Sever could comment (:)) on that. My impression is that only a relatively small portion of the posts get comments, but the comments are pretty good quality.
I would suggest two examples in bioRxiv where the comments appeared (to my less-than-scientific eyes) to be important in raising quality and/or extending communication. One of the examples was last year’s post about the cell-phone radiation study in rats. That got huge media coverage, and is one of the highest-altmetric-score items in scholarly publishing. But you can see in the comment thread that people worked together to calibrate the study and raise specific issues. You could almost see that within four days, the wild swings in opinion about this paper were resolved by the community.
The second example is a recent one about a particular piece of equipment and how some results contamination might affect huge numbers of researchers. The post included enough detail (because it was a paper!) that people could interpret and share interpretations. (Again, I’m not qualified to judge the paper itself.)
John Inglis reported yesterday that about 10% of the papers in bioRxiv have received comments (that seems relatively successful). The scenario you point out is an interesting case study. The “concern” (in quotes because I am not in a position to judge whether the concern is valid) is that preprints in the biomedical realm put studies not subjected to peer review in the hands of everyone. So here an unreviewed article was given significant mass media coverage, prompting loads of eyeballs on the actual paper from “peers” willing to ask questions, along with authors already engaged in defending their work to the news outlets.
For those platforms and journals doing commenting, it would be nice to see how those papers fare with usage and media coverage. The average paper is not promoted in any meaningful way.
Good point — if altmetrics are important to you, then a preprint that gets lots of attention isn’t going to be met with the same reception when it is finally published. It would be interesting to see some attempts to reconcile altmetric scores between preprints and their published version just as there is a system for DOIs.
We’ve seen some really tremendous commenting/PPPR for a few high interest papers over the years. The “arsenic life” paper is another great example. The problem though is that you have a few high profile papers that draw tons of attention and lots of commentary and review, and then you have the remaining 99.9% of papers that don’t. There’s a selection for glam and flashy that seems to take place.
It’s an incentives issue, of course, as is much of the other stuff in scholcomm that ‘doesn’t work’. In other words, cease spamming review invitations and see how many apply to review voluntarily without any pressure applied.
I’m not sure you have the causality in the right order here. In general, without a concerted effort to drive the peer review process, it doesn’t happen. As an example, go count the number of papers sitting for years on F1000 Research with incomplete or no review at all. Note how few papers on biorxiv get any comments at all, let alone a thorough review. So without all that “spam”, very little would get reviewed.
And from a practical point of view, if I don’t email you to ask you to review a submitted paper, how do you know that it exists so you can volunteer?
My point exactly. The same applies to commenting, as in, if you don’t point me to an article and specifically ask for a comment, what are the odds I’ll (1) read it in the first place, and (2) comment on it.
But let’s suppose I do somehow stumble upon it and read it though, what then?
Well, if it’s an okay paper without any glaring mistakes or omissions what am I even supposed to say? ‘Good job, author’? A useless comment all in all, unless I’m in the business of backpatting or sucking up to peers. Not a good look either way.
Angela’s post somewhat reads to me as if it supposes that each and every article is worth commenting on, and unless each and every article is indeed commented on, then the very concept of commenting is worthless and can be safely abandoned. In reality, it’s always going to be either a controversial or a flawed paper that gets commented on (assuming no solicitation for comments by the publisher or author).
That’s their utility. If you want higher level of engagement in the comment section you gotta put in work.
I think we’re in agreement and that I misinterpreted your original comment. The reason Angela’s point is valid is that we regularly hear suggestions that we should do away with the ordered and managed process of pre-publication peer review and replace it with post-publication peer review (just post everything on the internet and let the crowd figure it out). As you note, other than a few prestige papers, this would largely mean an unreviewed literature.
Yeah, we largely agree on this one. However, seeing as I value dissensus more than consensus, I must counter still: were we to apply the same kind of order and management to PPPR, it would likely be at least as viable as pre-publication review, with the added benefit of avoiding all the revise/resubmit back-and-forth.
Not quite. My point is that online commenting in all forms has not been particularly successful in terms of quantity and quality. Engagement and discussion happens in different ways that the individual research communities have deemed worthwhile. Yet, publishers, database platforms, and third party platforms still plunk basic commenting functionality onto journal articles. Add that to this desire to rename everything (crowdsourced peer review) and it gets no better. Either engagement in a public forum is not desirable to the research community at all, or we are giving them the wrong tools to do it in. For example, do we have statistics about how many articles are discussed using shared annotations in reference management sites? In this case, smaller groups of researchers may be having very rich discussions about research. This may or may not include the authors. It is not public. If that’s the preference, then we should facilitate these activities instead of public commenting.
If readers are already having rich discussions in private, and prefer it so, is there a real need for publishers to rein in those discussions then? I mean I see how a publisher would benefit from it, but that way those same readers would surely lose a bit of that privacy that they supposedly cherish.
Either way, unless the commenting functionality is a significant resource drain for the implementer, there’s no harm in making it available. Marketing it as some new and revolutionary tool/concept is annoying, but then again it’s an understandable attempt to try and drive user engagement seeing as people like new and shiny things.
I agree, David – there are many papers where the invited Peer Reviewers are the first (and for some the only) colleagues who give a paper a truly thorough read and make the effort to provide comments that are rational, constructive, and with the goal of improving the paper. (Allow me my ideal world for a while on that…)
Writing a paper is work – but it contributes to one’s own professional benefit.
Writing a review is work as well – but it is a way of contributing to the community as a whole.
Generalized PPPR? It might be reputational – because the people whose papers you review could well appreciate your input and seek you out in future.
When it snows, I shovel my driveway because I need to get out.
I shovel the sidewalk because it is my obligation to my neighbors and they mostly shovel their sidewalks too.
I don’t usually drive across town, pick a house and shovel THEIR driveway. Even if it’s a nice house…
I think perhaps we go too quickly to the (mis?)use case of commenting as a substitute for peer review. In part we might focus on this now because some new publishing platforms are “publish, then review.” But there are a number of other use cases.
Several commenting use cases of interest might rely on non-public commenting. Some of the use cases are ones discussed with Hypothes.is:
1. Personal annotations
2. As a way to collaborate with a small team on articles, etc.
3. Grad students journal clubs
4. As an overlay for journals
5. Authors annotating their own work
6. Fact checking (e.g. ClimateFeedback use case)
7. Classroom uses
When HighWire conducted interviews with researchers at Stanford, we heard from people that they wanted *expert* commentary. That is, they wanted the bar to who could comment to be set above “members of the public”. There are some systems for achieving this (PubMed Commons, iirc) — and systems that require ORCID to write comments are an approach to that as well.
The most successful commentary service in STM journals is BMJ’s, as far as I know. That is, measured by the volume of comments, it is hugely successful. They publish more comments than they do articles. They do have guidelines and they do moderate for them (the guidelines have gotten a little more strict with their success; they now filter out boorishness, repetition, and self-promotion, in addition to the libelous and obscene). 🙂 I don’t know of any serious study about why their service is so successful (again, judged by volume), but I suspect at least part of it is that BMJ is a journal that has long had a very active Letters to the Editor section in print, before we put “eLetters” online around 1998/99. So there was an active community predating the online journal.
Good comments on this post, Angela!
More filters are better than fewer filters. That’s why it’s always best to think of these things as additive rather than substitutive.
John, you’ve touched on what would seem to be a major factor behind the success of the BMJ Rapid Response framework when you talk about a well-established culture of ‘Letters to the Editor’. But that in itself is a symptom rather than a cause, I would suggest.
With the BMJ you have the involvement of a (relatively) huge constituency of practitioners, not academics. As doctors, they are highly trained and educated, highly motivated (we hope) and dealing with matters of the utmost importance to their patients and their own professional careers. It sometimes really is a matter of life and death.
You also have, among patients and their advocacy bodies, some highly motivated non-professionals who have developed significant levels of knowledge and understanding.
In short, it is a somewhat different world.
That still leaves open the question of why the BMJ venture appears to be more successful than any of the other big five medical journals. I’d guess that it comes down to the vision which underpins the commitment they brought to implementing their system – for which BMJ is to be applauded.
This article nails the point that I have arrived at, which is that whole-article commenting may indeed be the wrong framework, and the market is ripe for trying annotation as a way to address its failings:
- Annotations are more targeted, which requires less mental effort to be useful.
- Annotations are less likely to be controversial, since you’d be making a remark on a specific part of the paper, which promotes factual statements.
- All of the above means that annotations could be present in greater abundance, especially if you could find comments made on an article but not on your site because it used the open annotation standard.
It’s certainly true that not all articles are worth passing judgment on, but there are many more specific things that could be said about specific parts of them.
My last point is that community development isn’t something that I have seen many publishers doing well. It’s strange, because editors know their communities so well, but I haven’t seen a lot of effort put into translating that into engagement at the journal site. Of course, if you’re using the open annotation standard, engagement doesn’t have to be on your site, created de novo, you can leverage community where it already exists, and a brief perusal of the 23andme or ACOR forums show there’s a lot out there!
I wonder whether it just may be too early to call. Fully accepting PPPR and commenting, even annotation, would be a major shift in scholarly communication, at the aggregate as well as the individual level. Culture and technology still have to adapt.
To start with the first: having your thoughts on a piece of work online, in perpetuity, instantly and globally available, is incomparable to sharing the same thoughts at the coffee machine with your fellow graduate student. This is about reputations that take a long time to build but are easily scattered. There are many important subtleties in the process of online commenting in academia that we probably do not yet fully understand. There is probably a lot of uncertainty, fear, and anxiety, especially for first-time commenters and when being the first to comment on a piece. When asking a question you run the risk of seeming foolish; when making a critical remark you may damage the reputation of those in power who decide on grants and tenure. There is little consideration for making mistakes and showing doubt. We have to understand why all those on-topic and smart thoughts shared at the coffee machine don’t make it to the academic platforms Angela mentions. I suspect it has to do with the lack of an open online collaborative culture that accepts doubt and failure as part of the process, but I cannot put my finger on it exactly.
It may also be early days in terms of technology. Comments and reviews are often not citable, not credited, not indexed and searchable, not interoperable. I cannot easily retrace all the papers/posts I commented on, let alone add those comments to my ORCID. We do have laudable initiatives working on a universal commenting layer, like Hypothes.is and The Pundit, but they are still relatively unknown, also face interoperability issues, and may need some form of (community) moderation before being widely accepted in academia.
Finally, perhaps commenting on a specific paper is not the most efficient way to have scholarly discussions. Perhaps we should encourage online discussion around specific (unsolved) questions and problems and learn from the way that is done in the many science sites at StackExchange (https://stackexchange.com/sites#science). Most sites there have (tens of) thousands of questions, with 75-95% of them receiving answers. Of course, those answers can cite and comment on recent papers.
So, in conclusion, I would suggest not to ditch the idea, and also not to wait and see, but to experiment more, build on what does work and support brave young researchers who do engage in commenting, annotating and discussion.