I like data. I also like standards for data. I stand behind efforts to use standards to maximize the usefulness of data.
Publishers have taken it upon themselves to ensure that funding information is not just communicated in the acknowledgments of a paper but that there are standardized controls of this information built into an article’s metadata. While I question the need for publishers to make it easier for funding agencies to keep track of their funded works, this framework will be immensely helpful in assuring compliance with public access mandates.
When the OSTP memo was released, I set off on a fact-finding mission to see how many papers the American Society of Civil Engineers (ASCE) publishes that result from US Government funding. The path to getting this information was ridiculously difficult.
I received an Excel file with the acknowledgements sections of over 900 recently published papers across all ASCE journal titles. I then read every acknowledgment, parsed out the declared funders, looked up those that were abbreviated so I knew what they were, and also noted the country/state of the funder. After this was done, I grouped the papers by MAJOR funder, meaning that the Federal Highway Administration was now noted under U.S. Department of Transportation, etc. This required that I look up a lot of the funders to see which U.S. agency oversees that group.
This process was long, tedious, and fraught with omissions and errors, but it was the best I had at the time.
In order to efficiently link research dollars to research papers, FundRef was born. After a pilot of about a year, FundRef officially went live in May 2013 and depends on a taxonomy of funding agencies provided by Elsevier.
I was tickled pink when our online submission system announced that it was now integrated with FundRef. We took a look at it and were a little disappointed in the search/drop-down options. Non-preferred terms and common abbreviations were an issue. An author entering DOE, for example, may not see U.S. Department of Energy on the drop-down list of 10 terms being shown. Typing in "dept of" anything also returns nothing. The author has to start typing in the entire non-abbreviated term to get the correct agency.
Okay, so it’s early. The drop down list is sure to improve. All in all, the major agencies that we see most frequently noted on papers are represented, including those outside the U.S.
We turned FundRef on and made it a requirement for all authors upon submission. After a few months, I got my hands on the first report of data using the FundRef plugin and taxonomy.
What a let-down. What I had in front of me was 3,000 rows of non-standardized, unhelpful information. Just within the first 50 papers, I saw 6 different ways to identify the National Natural Science Foundation (NNSF) in China. The bad news here is that my previous review of the acknowledgements showed that authors funded by NNSF were pretty standard in how they declared funding.
We have papers funded by “usm” and “SHRP2”. This is how the funders were identified in the FundRef box. This is because authors are invited to add whatever they like and ignore the drop down list, which they are apparently doing, particularly if they don’t see the agency they are looking for in their first try.
I understand the need for the widget to allow for authors to include funding agencies not on the list already but I question whether there is a better interface for collecting this kind of information. Might there be an option where authors can say that they have not found their agency on the list and then input the information in a separate field for further inquiry? To my knowledge, the only way to add agencies is to send an email and request an agency be listed.
I know we are not the only publishers reporting trouble with reconciling what the authors enter in the FundRef box with what they have typed in the acknowledgement section. This has come up at CHORUS implementation meetings.
Looking at the data provided by CrossRef on the FundRef information page, 57% of the DOIs deposited with funding information have funders that are not in the extensive taxonomy of nearly 8,800 funding agencies and as such, these will not show up in the FundRef search.
FundRef is still in its early stages, for sure. This initiative improves dramatically when publishers actually start signing on and depositing information. CrossRef talks about FundRef in terms of DOIs for funding agencies, except we aren’t there yet. Here is what I would like to see:
- Each and every grant would have a DOI-like identifier. Each agency would be assigned a prefix, just as publishers are with the DOI, and other information, including a grant number, would follow.
- Each agency would then deposit to FundRef the grant number with the name and email of the Principal Investigator or grantee.
- Journal submission systems would collect the grant information as prescribed and have a way to check it against the FundRef database, just like we can validate references today. Any anomalies could be queried prior to publication.
- The grant “DOI” would be deposited along with the metadata for each journal article to CrossRef at time of publication.
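The validation step in the workflow above could be sketched roughly as follows. This is purely illustrative: the prefixes, grant numbers, and PI emails are invented, and no such agency-deposited grant registry exists today (the 10.13039 prefix is borrowed from the real Funder Registry only to make the shape of the identifier concrete).

```python
# Hypothetical sketch of checking a submitted grant "DOI" against an
# agency-deposited registry, as proposed above. All registry contents,
# grant numbers, and emails are invented for illustration.

GRANT_REGISTRY = {
    # agency prefix -> {grant number -> Principal Investigator email}
    "10.13039/100000015": {"DE-SC0012345": "j.smith@example.edu"},   # assumed DOE entry
    "10.13039/100000001": {"CMMI-1234567": "a.jones@example.edu"},   # assumed NSF entry
}

def validate_grant(grant_id):
    """Split a grant identifier into agency prefix and grant number,
    then look it up. Returns the PI email if the agency has deposited
    that grant, or None so the anomaly can be queried pre-publication."""
    prefix, _, number = grant_id.partition("/GRANT/")
    agency = GRANT_REGISTRY.get(prefix)
    if agency is None:
        return None
    return agency.get(number)
```

A submission system could run this the same way references are validated today: anything returning None gets flagged as an author query before the article is published.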
I know what you are going to say…funding agencies won’t do this. Hogwash! The whole point of this is for funding agencies to show what they got for the money they spent. Publishers are not going to be able to resolve this alone. FundRef is a good start but the funders need to claim a portion of this process if they expect to see the benefit.
The part that really makes me nervous is the dependence on the FundRef tools working in order to participate in CHORUS. Publishers who participate in CHORUS to comply with forthcoming and existing public access policies for federally funded works will need to have FundRef data deposited with their DOI metadata in order for CHORUS to properly identify the funder. For my department, this leaves us with no other option than to use the acknowledgements and have copyeditors/taggers add the correct Funder ID to the XML of the articles. This process will surely lead to more author queries and more corrections in the proofing stage.
All of this is, of course, possible. But boy it would be nice if it were automated in some way.
Building the infrastructure for connecting research grants to published outcomes is not easy and initiatives such as FundRef are critical. If there is a way to make sure that the process improves—for authors, agencies and publishers—we all need to be helping it along.
Public access mandates are coming our way and identifying funding agencies will be critical. Before all of this becomes the law of the land, perhaps we need to get some of the kinks worked out.
37 Thoughts on "What Has FundRef Done for Me Lately?"
I am sure that Angela has previously put her criticisms to CrossRef. They are very dependent on feedback from the field. They have a good track record of responding. On that assumption what has been their response and what are they planning to do?
DOIs for grant contract numbers are a great idea for the US Public Access program. Standardization facilitates both article collection and agency performance evaluation. The primary obstacles are administrative, especially the fact that each agency funding office often has its own numbering system, and these numbers do not presently identify the funding program. Also, the contracting offices may have little interest in public access, so the decisions to standardize contract numbers will have to be made at a relatively high level.
Note too that many funding agencies have multiple funding offices, which in turn have multiple funding programs. It is this Byzantine structure that makes federal funders so hard to identify properly. Standardizing this structure will not be easy.
It can be done but it will take time and effort so the publishers need to gear up a long term push. This article is a good start. The place to start federally is probably the National Science and Technology Council (NSTC) which includes all the funding agencies: http://www.whitehouse.gov/administration/eop/ostp/nstc.
However, the funding agencies have an alternative, which is to simply require that the authors send in copies of their articles, the way NIH presently does. Given that so far, and after almost two years, only DOE has announced a FundRef-like Public Access program plan, it may be too soon to make this push beyond DOE. I would start with them, as a kind of pilot effort.
I realize the agencies can go the route of just having authors deposit stuff in a repository, but this has not proven fruitful in the past. PubMed Central has been successful largely because the big publishers deposit papers for the authors. This is not exactly an uncomplicated process. It’s a great service to offer the authors but it shifts responsibility to the publisher for making sure that the record is delivered and posted accurately and on time. These same big publishers are the ones funding CHORUS. I have no idea why they would continue to deposit papers to PMC once an alternative is available. The CHORUS integration will be much more streamlined, incorporate all agencies, and keep traffic on the publisher site.
I agree that CHORUS is an elegant solution, but all the agencies may not go for it and it does have some shortcomings. For one thing, it is nowhere near complete as far as having all the publishers who publish federally relevant papers as members, and it may never be that complete. So the agencies still have to collect all the articles that CHORUS does not link to, which means two separate ingest processes. DOE PAGES actually has three distinct ingest processes, which creates significant configuration control challenges. The fact that some CHORUS members are posting accepted manuscripts instead of VoRs creates additional complications. There may even be a way for some agencies to collect nothing, for example by implementing a third-party repository requirement.
But as for PMC I do not see them giving up their present physical copy collection system, which is deeply entrenched administratively. That the publishers might try to force PMC to switch to CHORUS, by refusing to continue to supply the relevant articles, is certainly a bold concept. If all or most of the agencies besides NIH opt for CHORUS then there might well be a case for it.
So we really have to wait to see what the various federal agencies opt for.
First, let’s make this very clear: Neither Angela nor anyone from CHORUS is suggesting in any way that PubMed Central needs to change or go away. In fact, it is required by law to exist, and thus an act of Congress is necessary for it to go away. There has been a great deal of fearmongering claiming that CHORUS is some sort of nefarious plot to destroy PMC, and this is absolutely untrue.
I have not heard of any publishers planning to cease deposit in PMC, with the exception of the British Medical Journal, whose editor declared at a recent meeting that they were considering it given that PMC is increasingly seen as a competitor in the market.
My comment addressed the possibility of PMC using CHORUS to link to articles, rather than physically collecting them as it now does. Hopefully this is not ruled out by law. If it is then the law is ridiculous and needs to be changed, which is certainly feasible.
See the Consolidated Appropriations Act of 2008 (H.R. 2764) which requires that investigators funded by NIH submit (or have submitted for them) to the National Library of Medicine’s (NLM) PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication in a manner consistent with copyright law.
This law does not rule out PMC using CHORUS and posting links instead of articles, while keeping the collected articles as a dark archive. The purpose of this law was probably to give NIH statutory authority for its Public Access program, something the other agencies may in fact lack, not to hardwire a technical approach to public access. CHORUS may now make this physical delivery approach obsolete. If all the other big agencies post links instead of articles then PMC will be under great pressure to do likewise. In fact PMC could probably simply interpret this law such that delivering links was deemed sufficient to satisfy it. Executive agencies have this sort of discretion.
Note however that public access links need not come from CHORUS. For example, DOE PAGES is collecting links to articles in repositories. Links could also come directly from publishers, the way PMC articles presently do. The point is that the US Public Access program design is extremely fluid at this time and will be for several years at least. But to return to Angela’s original point, it is imperative that the funder identification data be collected both accurately and efficiently. This data is the key to a link based system.
Given that the act specifically calls for the deposited version to be made publicly available, this would seem to preclude PMC serving as a dark archive.
The agencies have what the Courts call broad discretion in areas like this, so I think you are reading it too narrowly. If the head of NIH or NLM decided that PMC should post links instead of articles then I am sure that would be done. All NIH need do is interpret the link to be delivery of the article. We have a somewhat analogous case with NSF and funded project reports. DOE makes theirs public but NSF does not, and they can be quite lengthy and useful. The COMPETES act specifically directed NSF to make its project reports public, but instead they created a brief summary report and made that public.
PMC will not and should not go away. I don’t see how the publishers can force anything! Where the articles go as part of a mandate is not up to the publishers. My point is that publishers make it very easy for authors with NIH funding to comply with the mandates because the publisher does the deposit for the authors, not in all cases but in many, many cases. This is where CHORUS is helpful: the authors need only identify their funding accurately, and the rest is magic to them. Unless the publishers somehow screw this up.
I never suggested that PMC might go away, Angela, just that the publishers might try to get PMC to adopt CHORUS, which I think a good idea. Apparently I misunderstood this statement of yours: “I have no idea why they would continue to deposit papers to PMC once an alternative is available.” So perhaps you can clarify what you meant.
I’d like to clarify a few points about FundRef and address some of your concerns.
The issue you describe seems to be more to do with the user interface of your submission system, and is not inherent to FundRef. When CrossRef launched FundRef we provided a widget (http://labs.crossref.org/fundref-widget/) that could be used to collect data from authors. You will see that it does provide the functionality you describe above: autocompleting funder names and recognising abbreviations (try typing “doe” into the search box). Some submission system vendors have chosen not to use this widget and are providing variations on this functionality. Those implementations that are less flexible are clearly collecting less accurate data from authors. CrossRef very much encourages any submission system to make use of the widget and also to provide very clear instructions for authors on the importance of accurate data entry.
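The abbreviation-aware autocomplete behaviour described here (typing "doe" and surfacing the Department of Energy) can be sketched with a simple alias lookup. This is a minimal illustration, not the actual widget's implementation; the registry entries and alias lists below are a tiny invented sample, not the real FundRef Registry.

```python
# Minimal sketch of alias-aware funder autocomplete, in the spirit of the
# widget behaviour described above. Entries and aliases are invented.

FUNDERS = [
    {"name": "U.S. Department of Energy", "aliases": ["DOE", "Dept of Energy"]},
    {"name": "U.S. Department of Transportation", "aliases": ["DOT"]},
    {"name": "National Natural Science Foundation of China", "aliases": ["NNSF", "NSFC"]},
]

def suggest(query, limit=10):
    """Return funder names whose canonical name or any alias contains
    the query, case-insensitively, so abbreviations still match."""
    q = query.strip().lower()
    hits = []
    for funder in FUNDERS:
        terms = [funder["name"]] + funder["aliases"]
        if any(q in term.lower() for term in terms):
            hits.append(funder["name"])
    return hits[:limit]
```

With a structure like this, "doe" and "dept of" both resolve to the Department of Energy even though neither string appears in the canonical name, which is exactly the gap Angela describes in the less flexible submission-system implementations.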
The CrossRef widget also includes the option for an author to enter a funder name if they cannot find their funding source in the FundRef registry. This is in fact the primary route for adding new funders to the registry; when funder names are deposited without IDs, CrossRef adds them to a list of potential new funders and after validation these names are added to the Registry.
To speak to your suggestion that the system would work better if funders deposited grant numbers and publishers then checked against this data: this and other options were discussed at length ahead of the pilot, not just with CrossRef member publishers, but also with a number of funding agencies. (Agency involvement has been key to FundRef from the very start and the FundRef Advisory Group includes representatives from the NSF, Wellcome, NASA, and the DOE). It quickly became apparent that the agencies were not in a position to do this. Each agency has a different system for assigning and tracking grant numbers, none are CrossRef members or have the means (or indeed incentive) to conform to a common schema and start depositing this data. Publishers, on the other hand, have been managing, editing and depositing metadata with CrossRef for many years. Adding the collection, normalisation and deposit of funding data to this widespread industry practice made much more sense.
As for the assertion that “The whole point of this is for funding agencies to show what they got for the money they spent”, the FundRef advisory committee would probably disagree. Publishers and funders realised early on that a system for collection of funder information was, in fact, mutually beneficial. Funders already have robust mechanisms for tracking how their funding is spent. They get these in funding reports. What they don’t get is a clear picture of the longer-term ramifications of that funding: the publications and research objects or by-products that are subsequently produced.
And yes, funders will benefit from being able to measure compliance with mandates, but so too will publishers. Publishers, in fact, see quite a few benefits from the system. They want to help authors comply with mandates, but also, knowing where the money is coming from that is funding the research underlying one’s publications is important business intelligence.
We should note that a number of publishers are depositing 100% accurate funding data with CrossRef. I am in the process of undertaking some analysis and talking to participating publishers to find out why and how these publishers are collecting (or extracting) data that is accurate and standardised, while others are facing difficulties. CrossRef intends to provide a series of guidelines and best-practice documents and webinars to share our findings and offer help to our members and their vendors. As you say, building the infrastructure for connecting research grants to published outcomes is not easy and initiatives such as FundRef are critical – that’s why CrossRef responded to industry demand for such a service.
I would of course be more than happy to discuss any of this with you directly, and to see if there is anything we can do to help before you start depositing the funding data you have collected so far.
I think ACS has done the most research on this funder data quality problem, beginning with the CHORUS pilot. As a result they have opted for a collection process based on manually mining the acknowledgements rather than depending on the FundRef interface. Their process is described here: http://www.chorusaccess.org/wp-content/uploads/OBrien-ACS-CHORUS-workshop-v3.pptx. Last I looked ACS was the only publisher feeding data to DOE PAGES. I have been tracking this data quality issue for some time and it is not trivial.
I believe this is inaccurate. ACS uses ScholarOne for manuscript submissions and collects funder information upfront. They have internal processes to check the quality of the metadata collected at various stages and then ask the author to check the information as part of the galley proof process. APS and Elsevier, on the other hand, do not attempt to collect funder information upfront at all, but instead extract it from the acknowledgments directly and also ask the authors to check during the proof process.
I am not sure what to make of your comment that ACS was the only publisher feeding data to DOE PAGES. PAGES collects data about published papers using the metadata in CrossRef and then by going directly to publisher web sites and services based on the full text URLs in the metadata. There are certainly other publishers depositing the requisite metadata with CrossRef (FundRef, licensing information, and full text URLs for the version of record or accepted manuscripts).
Mark, As of two weeks ago almost all of the articles found in DOE PAGES since it went live on Oct 1 seemed to be from ACS. There may have been one or two from AIP or APS, I forget which, plus six from the labs. I have no idea who is depositing what where but after almost four months nothing is showing up in PAGES. Perhaps the problem is at DOE’s end.
As for your first para, you seem to be saying what I said, but in more detail. Where is the inaccuracy?
Thanks for the detailed response Kirsty. I still have some issues though.
I suppose it could be the case that our submission system is not using the widget but during our testing phase, we plugged the same agencies into the FundRef search, the only validation available to us, and the results were the same.
As far as adding to the taxonomy, I would like to see more about how the widget works. On our system, there is only the one field so authors are just typing in whatever they want. There is not a separate field for collecting agencies that are not appearing. Do you have any information about usage and whether authors are finding what they are looking for via the widget without the publisher having to parse the data from the manuscript? In other words, for those publishers with 100% accurate data, is the data accurate because the publisher is doing it manually? Are these publishers that have nearly all funding from a few sources? I look forward to your report.
I agree that publishers should deposit funding data with the CrossRef metadata. My concern is the collection of this information in the first place. There is no validation that the author actually has the funding or that the funding claimed actually applies to the paper. This may not matter for the service, but it matters to publishers that are now being expected to make the paper available for free based on the funding. Further, the correction of papers that have the wrong information is unclear. Will metadata need to be redeposited or will an erratum suffice?
Anyhow, I am not arguing that FundRef is a waste of time. On the contrary, I am arguing that it is not working out so great for publishers…yet. You say that publishers benefit from collecting funding information but we already have that information and do not need FundRef to find this out. The benefit for publishers is that we can offer a service to authors who are struggling with complying with mandates that we don’t even know about yet.
I just don’t think we are ready for prime time and prime time can start any minute now. If we, as a community, can help to make it more functional for what is actually needed, then let’s talk about doing that and see where we get.
“Do you have any information about usage and whether authors are finding what they are looking for via the widget without the publisher having to parse the data from the manuscript? In other words, for those publishers with 100% accurate data, is the data accurate because the publisher is doing it manually? Are these publishers that have nearly all funding from a few sources? I look forward to your report.”
This is what I’m currently looking into. We know that some publishers (David mentions ACS above) are not asking authors but are instead extracting the data from manuscripts, but even amongst these publishers the quality of data varies. Others are asking authors but doing some in-house validation ahead of deposit. By understanding more about the different processes and their results we should be able to better define some best practices.
“There is no validation that the author actually has the funding or that the funding claimed actually applies to the paper.”
I’m not sure that this is the issue that FundRef is trying to address at this time – the same can be said for whatever appears in the acknowledgements within the manuscript. Getting the funders named in the manuscript to match the funders in the deposited metadata is the first step. We have discussed grant validation within the advisory group but agreed that this isn’t in scope for FundRef yet. We still need to get the basics running more smoothly for everyone involved, and we are very open to suggestions.
Out of interest, could you send me some examples of funders you couldn’t find in FundRef Search? The search isn’t really meant as a data validation tool, but it does reference the registry, so I’d like to work out whether this is down to UI issues or genuine gaps in the list of funders.
I can pull some information for you. It may take me a few weeks. I am also happy to send you the list of funders already collected and reported out. I am not sure that is helpful.
I can certainly see where verification of funding is not in the FundRef purview. That said, errors in grant numbers, which we for one were not actually collecting prior to using the FundRef tools, may be extensive. If information on grant numbers were already in a database, per my "dream world" scenario, a validation of that information would be useful.
Kirsty, does the CHORUS search feature use the FundRef taxonomy? See http://search.chorusaccess.org/. My impression is that it does. If so then readers can use it to get a feel for what we are talking about, as far as finding funders is concerned. It is also a useful tool for looking at what CHORUS provides.
One other important issue with funder data that is directly related to the taxonomy is hierarchical aggregation. Funding agencies have hierarchical structure which the taxonomy attempts to capture. For example, DOE is listed as a funder, but so is their Office of Science (SC) as well as that Office’s Basic Energy Sciences (BES) program. The latter provides over a billion dollars a year in funding, but is just one of SC’s major funding programs.
The problem is that if some articles are identified simply as funded by DOE, some by SC, and some by BES, then we have no way of knowing all of the articles that are funded by BES, or by SC for that matter. This may not be important for article collection by DOE PAGES, but it makes program evaluation difficult, perhaps even impossible. Has there been any thought to requiring that all funders be identified at the most detailed available level? That might solve the hierarchy problem. I expect that program evaluation will emerge as a major use of funder data, because the programs themselves compete fiercely for funding.
CHORUS Search is built from FundRef Search (http://search.crossref.org/fundref) and yes it does use the FundRef Registry. Within the hierarchy all results roll upwards, so a search on DOE will include all of the papers funded by SC and BES. A search for BES will return only papers where the funding has been cited at that level. We have been guided by the agencies themselves as to what level of granularity should be present – some would prefer to see articles citing the most granular level available, while others are happy for authors to simply cite the top-level agency. Of course, whether authors identify correctly with the program, office, or agency is a different matter…
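The upward roll-up described here (a search on DOE including papers cited at SC or BES) amounts to walking each cited funder's ancestor chain. A small sketch, with an invented hierarchy and invented paper DOIs, shows the idea; this is not how FundRef Search is actually implemented internally.

```python
# Sketch of hierarchical roll-up: papers cited at a lower level (BES)
# should surface in a search on any ancestor (SC, DOE). The hierarchy
# and DOIs below are invented for illustration.

PARENT = {          # child funder -> parent funder
    "BES": "SC",
    "SC": "DOE",
    "DOE": None,
}

PAPERS = {          # funder cited on the article -> article DOIs
    "DOE": ["10.xxxx/a"],
    "SC":  ["10.xxxx/b"],
    "BES": ["10.xxxx/c", "10.xxxx/d"],
}

def papers_for(funder):
    """Return all papers cited at this funder or anywhere below it,
    by walking each cited funder's chain of ancestors."""
    results = []
    for cited, dois in PAPERS.items():
        node = cited
        while node is not None:
            if node == funder:
                results.extend(dois)
                break
            node = PARENT.get(node)
    return sorted(results)
```

Note the asymmetry David raises: a query at DOE sees everything, but a query at BES sees only what authors cited at the BES level, so program-level evaluation depends entirely on how granularly authors identify their funding.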
Thanks Kirsty, my conjecture is that the agency offices that are building the public access systems, such as DOE’s OSTI, are indifferent to the program evaluation role that detailed funder data is likely to come to play. In the long run the granularity will probably have to be standardized across the government, perhaps at the most granular level. The programs should want this so that all of their articles get counted, with more being better. It is almost certain that the metrics research community will start doing impact analyses of the offices and programs, as they presently do for other institutions.
You say that publishers benefit from collecting funding information but we already have that information and do not need FundRef to find this out.
We have it, but it’s not an easy process to pull that information out of the paper, compile it and put it to use. As FundRef continues to improve, it offers us a chance to automate the process, saving time and effort over pulling that information out manually or requiring authors to use particular phrasings that can be searched for automatically.
One other point worth mentioning is that FundRef goes way beyond just CHORUS and access to papers. In my opinion, it’s a key factor for success in any data availability policies that may emerge from funders. Deposition of data looks like it’s going to be widely distributed, in a large number and variety of repositories. Linking datasets to funder information will vastly improve compliance and tracking.
We could ask for the information in the submission questions and get about the same level of consistency as I got using the FundRef tool. I agree that it will get better. I want to know how we can get there…quickly.
The other issue is an “early adopter” problem. Initiatives like this and ORCID require widespread adoption. But when we put it to authors before the user experience is streamlined, they get turned off. It’s a chicken-and-egg problem.
I agree. However, having a taxonomy of many thousands of funders creates a significant selection burden on the users. This may be where some of the quality problems are coming from and why the funder data sometimes does not match the acknowledgements.
Another selection and quality problem is that authors may acknowledge funding in their article using the name of a research program rather than the name of the organization funding that program. Since the FundRef taxonomy is of organizations, a program name would not be in the list. This needs to be explained in the instructions to the authors so that they don’t enter the program name as a “new” funder name and do select the appropriate organization name. A future solution could be to include program names in the registry associated with the funder name. But that could be a major data collection and management challenge.
Indeed Evan, authors might well focus on transient programs like this one: http://www.darpa.mil/Our_Work/I2O/Programs/Big_Mechanism.aspx (which wants computers to read journal articles) because the program is the immediate funding source. Including these transients would be a huge challenge.
As Evan points out, when we start talking about comprehensively cataloguing and normalizing information around funders, their programs and specific grants and awards – think data hierarchies, common ID structures w/ enforced uniqueness, etc. – this becomes “a major data collection and management challenge.”
To my mind this is then a problem in need of an entrepreneur. I expect that a comprehensive database along these lines would have viable commercial value, and creating such databases is a well-established business model. Even better, whereas privately held commercial entities have little or no incentive to divulge information about themselves (a hurdle for Hoovers and its ilk), if funding/program/grant information is not publicly available, or is not made available in a timely manner, there’s always FOIA.
So this seems like an opportunity for a commercial interest the likes of a Ringgold. The implications of that, though, are not entirely clear. Would manuscript submission systems and platform providers now bear additional costs to license the data (costs that would then, at least in part, also be borne by publishers)? Where would CrossRef/FundRef figure in such a system? Or could CrossRef/FundRef be the entrepreneur in this scenario, making the data freely available for its members and their use – thus allowing for better data integrity at every step of the process – but perhaps commercially licensing the database to non-members in order to help offset the cost of building and maintaining it?
I don’t even know if that last bit is possible, whether it would fall afoul of CrossRef’s bylaws, but I’m trying to play this little thought experiment out a bit further and I welcome other commenters’ help with that, even if it’s to completely invalidate what I’ve thrown out here!
Sounds good to me Matt. And yes, funding/program/grant information is all available if you know where to look for it in the labyrinth of Federal funding. Extracting it will not be simple and I would be happy to assist such an effort. In fact I envision a new transparency in the mega-billion dollar Federal research enterprise emerging from all this.
This data could also be of great assistance to researchers. When you have a new idea for a potentially important research project the first question is often “who is funding this sort of thing?” and that is often a hard question to answer. I have a couple of those questions on my desk right now.
Thanks, David. Agreed that it would be of great assistance to researchers, and I expect that there are a variety of groups outside of those we’ve already identified that would find such data useful. I suppose there are good reasons such a solution does not already exist — likely owing, as you say, to the difficulty of finding/extracting and then normalizing so much data — but it does seem at its heart a very solvable problem, in need of the right combination of resources and will/interest.
Ironically the Feds are going the other way. DOE OSTI (my old home) used to maintain a database of Federal projects, but it was terminated to free up money for the development of their public access system, which is unfunded. See http://www.osti.gov/RETIRED/fedrnd.html. The fact that OSTP designated Public Access as unfunded was unwise.
Interesting thread. It sparked a (possibly left-field, not fully formed) thought re citability. For context, good practice in the data world is to include a data availability statement in the article, and include a regular citation to the data set within that statement. The dataset is discoverable, the data provider benefits from a formal citation, there are bidirectional links, all using regular publishing infrastructure.
Theoretically, if grants were citable items, could this solve the problem?
(Incidentally, I’m a big supporter of FundRef and what CrossRef is doing.)
Robert, your question/suggestion about grants being citable items would seemingly go a long way toward solving that particular piece of the problem. As Kirsty notes in her initial post, though, the funding agencies use different systems for assigning and tracking grant numbers and seemingly have neither the means nor the incentive to conform to a common system. I would argue that there is a clear incentive in so far as it improves data integrity and transparency for those collecting the data and, therefore, for those using the data – e.g. the funders, general public, etc. – as well. Whether that’s compelling enough to justify funding proper means to establish a common system is much harder to say…
Apologies – Richard, not Robert!
Matt, I am not sure what “citable item” means but Angela’s proposal that grant numbers be standardized and include the proper funder metadata sounds like it might be part of it. See also my comment: http://scholarlykitchen.sspnet.org/2015/01/28/what-has-fundref-done-for-me-lately/#comment-150448.
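To illustrate what a standardized grant number carrying funder metadata might look like, here is a sketch with an invented identifier format of the form `funderID:year:serial`. Both the format and the parser are hypothetical assumptions for illustration; no agency or FundRef uses this scheme.

```python
# Hypothetical sketch: a standardized grant identifier that embeds the
# funder's registry ID, e.g. "10.13039/100000001:2014:DGE1144152".
# The "funderID:year:serial" format is invented purely for illustration.

import re

GRANT_ID = re.compile(
    r"^(?P<funder>10\.13039/\d+):(?P<year>\d{4}):(?P<serial>\w+)$"
)

def parse_grant_id(grant_id: str):
    """Split a (hypothetical) standardized grant ID into its parts,
    or return None if it does not match the format."""
    m = GRANT_ID.match(grant_id)
    return m.groupdict() if m else None

print(parse_grant_id("10.13039/100000001:2014:DGE1144152"))
```

With the funder ID embedded this way, the funder metadata could be recovered from the grant number alone, which is the piece Angela’s proposal would supply.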
The funders in this case are not terrifically excited about either access or data. The US Public Access mandate is not coming from the funding agencies but from the President’s OSTP, so it is something of a top-down push for the agencies. This may be why, after two years, we still have no agency programs announced, except for DOE, which hopes to sell the other agencies a public access service. The agencies’ job is to get money to researchers, not to collect manuscripts. The contracting offices in particular, who own the existing contract numbering systems, have little interest in and no money for such a government-wide effort. It may happen, but it will be a long slog, and at this point no one is even pushing for it.
Kirsty Meddings from CrossRef has just published a fuller FundRef progress report on the CrossRef blog – http://www.crossref.org/crweblog/2015/02/fundref_progress.html
Thanks Ed, very interesting. Kirsty reports a 57.5% error rate, so clearly Angela’s experience is not unusual. And this does not include failure to list all funders. The proposed remedies seem to be pretty labor intensive, so I am wondering if some automated funder identification might help. For example, semantic extraction of funder names from the acknowledgements, based on the taxonomy. Attaching the correct IDs would then be trivial.
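The extraction idea can be sketched very simply: match each taxonomy entry, along with its alternate labels and common abbreviations, against the acknowledgements text. The taxonomy snippet and label lists below are illustrative assumptions; a real implementation would load the full funder registry and handle far messier variants.

```python
# Rough sketch of taxonomy-based funder extraction from an acknowledgements
# section. The taxonomy entries and alternate labels are illustrative only.

import re

TAXONOMY = {
    "U.S. Department of Energy": ["department of energy", "doe"],
    "National Science Foundation": ["national science foundation", "nsf"],
    "U.S. Department of Transportation": [
        "department of transportation",
        "federal highway administration",
    ],
}

def extract_funders(acknowledgement: str):
    """Return canonical funder names whose labels appear in the text."""
    text = acknowledgement.lower()
    found = []
    for canonical, labels in TAXONOMY.items():
        for label in labels:
            # Match labels as whole words so that, e.g., "doe" does not
            # fire on "does".
            if re.search(r"\b" + re.escape(label) + r"\b", text):
                found.append(canonical)
                break  # one label hit is enough for this funder
    return found

ack = "This work was supported by the NSF and the U.S. Department of Energy."
print(extract_funders(ack))
```

Even a crude matcher like this could pre-populate the funder picklist for the author or editor to confirm, which is where the real labor savings would come from.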
Beyond reading the acknowledgements, if the system had funded-project information such as award numbers or descriptions, it could do a lot more. Think of this as a funder data service. It would save everyone a lot of effort — authors, publishers, and CrossRef — while providing more data.