At a press conference on Friday last week, the U.S. Federal Bureau of Investigation (FBI) unsealed indictments of nine Iranian citizens. This sentence is an odd way to start a Scholarly Kitchen post, admittedly. What makes this case interesting to the scholarly community is what these men were indicted for: the bulk theft of intellectual property from academic institutions in a brazen scheme to gather and redistribute scholarly content. The indictments outline a multi-year effort launched in approximately 2013, by the Mabna Institute, a company based in Tehran, to assist Iranian universities and scientific and research organizations in stealing access to non-Iranian scientific resources. The indictment press release describes the alleged efforts whereby:

“…the Mabna Institute, through the activities of the defendants, targeted more than 100,000 accounts of professors around the world. They successfully compromised approximately 8,000 professor email accounts across 144 U.S.-based universities, and 176 universities located in 21 foreign countries.”

In addition, the scheme sought to capture credentials and materials from 47 U.S.-based and foreign private sector companies, the U.S. Department of Labor, the Federal Energy Regulatory Commission, the State of Hawaii, the State of Indiana, the United Nations, and the United Nations Children’s Fund. The indictment alleges a complex and architected effort targeting all domains of research, including science and technology, engineering, social sciences, medicine, and other professional fields. The defendants allegedly conducted reconnaissance of targets to determine individuals’ research interests and where they had published articles. Based on that background information, posing as colleagues from other institutions, the team sent phishing e-mails to their targets. Once compromised credentials were collected they were then used to access and copy materials, including scholarly journals, theses and dissertations, and electronic books for further distribution. Credentials were allegedly then also resold for others to access the compromised institution’s systems.

The scale of this effort was tremendous as outlined in a statement Friday by U.S. Deputy Attorney General Rod Rosenstein, “These nine Iranian nationals allegedly stole more than 31 terabytes of documents and data.” This amount of stolen data is roughly equivalent to the disk space necessary to hold a digitized version of the print collection of the Library of Congress (if LC were to do so). In addition, the FBI alleged that the scheme had ties to the Iranian government and Iranian universities. Rosenstein continued, “For many of these intrusions, the defendants acted at the behest of the Iranian government and, specifically, the Iranian Revolutionary Guard Corps.” This allegation is quite significant and, if true, would considerably raise the stakes in the cat-and-mouse game of using credentials to illicitly capture and republish scholarly content.

Although Sci-Hub is not named in this case, there have been claims and questions about whether it has engaged in similar phishing and credential theft activities. Alexandra Elbakyan, founder of Sci-Hub, denies being involved in these sorts of fraudulent activities. The case brought by the FBI is the clearest legal indication that these types of attacks on the academy for the purpose of acquiring intellectual property are happening regularly and at a massive scale, some visible and some less so.

As it happened, just hours before this announcement was made, I engaged in a Twitter discussion with some open access advocates who continue to believe that Sci-Hub and related services are functioning on the basis of “donated credentials.” In an email interview published by Mike Taylor, a paleontologist at the University of Bristol, Elbakyan is quoted as saying that the use of phished credentials to add content to the Sci-Hub system “is possible, because Sci-Hub acquires passwords from many different sources,” not just “donations.” I contended via Twitter that the scale and regularity with which compromised credentials are identified by publishers and libraries indicate that this activity is far more systemic and pervasive than a few hundred credential “donations.” And now, it is formally alleged that there is an active, aggressive, government-sponsored effort to scoop up login information.

Could there be a link between these Iranians and Sci-Hub? Again, such a formalized link isn’t clear, but the indictment states that one of the sites run by the defendants, Gigapaper.ir, “sold a service to customers within Iran whereby purchasing customers could use compromised university professor accounts to directly access the online library systems of particular U.S.-based and foreign universities.” From my perspective, to presume a connection between Sci-Hub and the Mabna Institute would be pure speculation, but one only needs to glance at the available usage levels of Sci-Hub to note that Iran-based access is a significant source of the service’s usage. (These usage levels are based on the data that was provided to John Bohannon for his 2016 article in Science.) There is no reason to imagine that the Iranians would be the sole source of compromised credentials used by Sci-Hub, but if one considers the multi-year scale of the Iranian effort described in the indictment, with allegedly significant financial support from the Iranian government, it seems odd that a “poor Ph.D. student” like Elbakyan would be able to replicate a similar enterprise without similar resources.

Reading closely Deputy A.G. Rosenstein’s statement that, “this case is important because it will disrupt the defendants’ hacking operations and deter similar crimes,” it would seem odd to target two relatively small redistribution sites in Iran and pass over the prime offender in the marketplace. On the other hand, it might have been easier to prove the direct connections outlined in this indictment than it might be with others. Of course, we don’t know what else the FBI might be working on, and whether further sealed indictments might exist or be forthcoming. Perhaps the FBI is more aggressively pursuing the source of the compromised credentials and not the distribution of the content once it has been acquired. As Elbakyan stated in the Science article, “I cannot confirm the exact source of the credentials, but can confirm that I did not send any phishing emails myself.”  If the Iranians had developed the infrastructure to gather these credentials, why wouldn’t Sci-Hub just use what was available, rather than build it themselves?

While I am certain that some community members are foolish enough to “donate” credentials, it is now clear the scale and scope of the breaches of academic systems to illegally aggregate intellectual property go well beyond a few hundred zealots or simpletons willing to blithely throw their campus’ online security to the winds. The intellectual property being sought is worth hundreds of millions — if not billions — of dollars. And the intellectual property stolen is not just papers and books, but it might be someone’s next paper, scientific discovery, or corporate research. Also importantly, the theft of these credentials is significant because they provide access not only to library resources, but also administrative systems, email systems, and other valuable research resources containing private information.

One of the earliest developers of the Internet, Vint Cerf, has commented that he considers one of the biggest missteps of the formulation of the Internet to be not designing or building security more deeply into the infrastructure at the time. It was a non-trivial addition to an already complicated endeavor, so it’s not particularly surprising that it wasn’t addressed. In a 2015 Washington Post article, David Clark, an MIT computer scientist and another internet pioneer, was quoted, “It’s not that we didn’t think about security. We knew that there were untrustworthy people out there, and we thought we could exclude them.” This point is nothing new, especially for Clark and those that have been deeply involved in web technologies since the development of network computer systems. Clark was interviewed for an article some 9 years earlier, in 2006, about the problems of internet security, where he reflected even further back on a 1992 presentation that covered the topic of the lack of embedded security in Internet protocols. What is interesting and troubling is how some of those “untrustworthy people” now actually have government support for their nefarious activities.

The relatively lax information security surrounding access to subscribed resources is one of the reasons behind the push toward the Resource Access in the 21st Century project #RA21, led by NISO and the STM Association. The publishing community has done woefully little over the past 25 years to innovate and improve the access control systems in place to provide users with easy access to subscribed content, particularly as they have become more mobile. Would RA21 help prevent the types of phishing schemes that are at the core of this case? Possibly not, but its solutions, when adopted, would certainly limit the potential damage from a single compromised credential by being able to target the source of that compromised credential more quickly and more precisely. Also, tying content access to the credentials that patrons have and use regularly for access to a variety of other systems makes a lot of sense to raise awareness of the need for security, and to make it more routine for users to authenticate to get access to materials.  This process need not be cumbersome, as anyone who authenticates daily for access to Facebook, Twitter, Google, or Amazon can attest. What RA21 seeks to achieve is to make the individual login experience for subscribed content similar to those that we all use daily without thinking twice about it, ideally in a more privacy-protecting environment than those other services.

Postscript: In further reporting on this story by Paul Baskin in the Chronicle of Higher Education, it was reported that there is very muted, if any, response from formal channels in the university administration community, such as the Association of American Universities. This is astoundingly troubling. One cannot overlook the political landscape and the geopolitical implications of any federal actions against Iran or Iranian citizens, especially from this administration seeking to tank the nuclear deal with Iran, but this seems to me beside the point. The targets and sources of this focused attack on the information systems of universities being Iranian is less important than the fact of the attack, the purpose and the goal of using compromised university systems. The nature of these attacks means that universities must significantly improve their information security protocols, because the attempts to break into their systems are not simply the work of some kid in his basement, or of foolish professors offering up their credentials for a vision of providing free access to publisher content. This is a serious attack seeking to gain access to valuable resources. Universities have a contractual responsibility to maintain the security of intellectual property sold to their patrons. It is clear that universities, libraries and publishers need to step up their information security game in light of the fact that their adversaries are far more sophisticated than had been originally thought.
Correction: John Bohannon’s 2016 article, “Who’s downloading pirated papers? Everyone” appeared in Science, not in the Chronicle of Higher Education as originally posted.  The article was corrected to fix the error.
Todd A Carpenter

Todd Carpenter is Executive Director of the National Information Standards Organization (NISO). He also serves in leadership roles at a variety of organizations, including the ISO Technical Subcommittee on Identification & Description (ISO TC46/SC9), the Linked Content Coalition, and the Foundation of the Baltimore County Public Library.

Discussion

18 Thoughts on "FBI Indicts Nine Iranians in a Massive Scheme to Target Academic Credentials and Steal Content"

Todd, thanks for pulling this all together. I had heard that services in Iran were selling Iranian universities “subscriptions” to Sci-Hub. The university gets the corpus of literature for around $20k. The reseller would dress it up with a spiffy web site overlay. All the money stays in Iran. So, your premise is totally valid and also lines up with what we have seen of usage of Sci-Hub in Iran.

These resellers certainly have a financial motivation to access content illegally.

Do you know if Iranian universities were able to gain legal access to those subscriptions?

Yes, they could subscribe. If memory serves, several years ago, the universities ordered subscriptions and then the government took all the money.

Do you have any source confirming that?

I ask because 2013 — when Mabna Institute allegedly started ops — was also the year when many Iranian researchers were forbidden publishing in EU- and US-based publishers’ journals due to economic sanctions.

I’m fairly certain those sanctions prohibited publishers from doing business with Iranian organizations in any other manner, too, so I find it hard to believe subs were allowed.

Wasn’t Bohannon’s story on SciHub for Science, not the Chronicle of Higher Ed?

Stewart and Michelle, thank you both for correcting the error. Yes, Bohannon’s article was in Science not in the Chronicle. I’ve corrected the error in the text and noted a correction in the article itself.

Thanks Todd, great post, important comments about STM’s RA21 initiative too. I think the citation is: Bohannon, J. (2016, April 28). Who’s downloading pirated papers? Everyone. Science, 352(6285), 508-512. doi:10.1126/science.352.6285.508
Corrected 13 May 2016.

Before we get too excited about the “potential” of RA21 approaches here, perhaps it would be good to find out if the compromised credentials that were phished were used with IP+proxy or … in a Shib/SAML/OpenAthens set up like RA21 advocates? The assumption that the compromised credentials were used with an IP+proxy is not substantiated by any report I’ve seen on this. Please direct me though if I’ve missed something?

You are correct; it is not with an IP address, it is in a “Shib/SAML/OpenAthens set up.”

This is an outstanding post, Todd. Now, would it be too much to ask SPARC to offer their point of view?

Thanks Todd, great article and glad this is getting more publicity. As you know, PSI has been gathering a huge amount of information on Sci-Hub and Iranian sites over the last year. We have been working with the STM, many of our publisher partners, libraries, consortia and members of the Standards Industry. The theft of credentials is at the heart of most of what we have been investigating, and PSI has evidence from more than 1000 intrusions into library systems via a variety of remote access proxy servers. Whether it be Shibboleth credentials or other library credentials, any system that uses usernames and passwords is vulnerable. RA21 will not solve the security issue by making the University Credentials a single sign on; it could in fact make it worse, as stealing one set of credentials that gives access to everything could be more harmful. Even libraries and publishers that have used dual factor authentication are not safe; Sci-Hub, for example, found a way around CAPTCHA and have also hacked into personal accounts in order to replace mobile phone numbers with their own. As a result, SMS messages with the secondary authentication went to hackers and not the real owner of the credentials. Also, when using dual factor authentication, access becomes more difficult and potentially drives more users to the likes of Sci-Hub.

We have seen instances where Sci-Hub bombards Universities with Dictionary Attacks and Phishing Attempts for days, before using the stolen credentials to access the library and remove as many as 50,000 PDFs. We have also had reports from University IT departments that see stolen credentials used to plant viruses on computers which have to be completely wiped to remove the viruses. Why anyone would go to the Sci-Hub site and download information, knowing it is stolen and also knowing how it got there in the first place is beyond me. Why Universities do not block their patrons from going to these sites is also disturbing when newspapers are full of articles about just how much of the UK and other countries research is being stolen on a daily basis; https://www.thetimes.co.uk/article/university-secrets-are-stolen-by-cybergangs-oxford-warwick-and-university-college-london-r0zsmf56z. Once credentials are stolen, they are passed around and probably sold and we see them on many sites including several Iranian sites. The same credentials could also be used for obtaining other information that could lead to identity theft. I have evidence that most credentials are stolen not donated. From the research PSI has gathered and shared with many publishers, Publisher Associations and authorities, the evidence shows that Sci-Hub is indeed involved in phishing and stealing credentials and that it is far too large and organised to be what Alexandra would have us all believe it is.

We also have experience that backs up your view, Todd, that there is a muted response from formal channels in the University and administration community. Even after sharing all the evidence with JANET and the UK authorities I cannot even get them to be involved in the project we are running to help Universities protect themselves. I was advised by the National Cyber Security Centre that they were unable to help us because “We have had no reports from Universities of any breach” and “Unfortunately evidence of intrusion to satisfy a court of law has to come from the device intruded upon. We will reach out, through JANET, to the two specific institutes you have provided to see if they wish to make formal complaints themselves. Unfortunately we can only act of what the actual victims tell us and although I understand your frustrations we have to have the victims come forward themselves. If they have had data breaches or attempts to do so they should report this to Action Fraud and law enforcement will investigate.” I also went to JISC with the evidence and asked them to help me spread the new project to all their members so they can try and block Sci-Hub from attacking their HEIs and was told “I have discussed your proposal with a number of senior colleagues in JISC and all things considered we think we are not in a position to support taking this forward on behalf of UK HEIs.” We are engaged with many consortia who are taking this seriously, like VIVA, CAUL/CEIRC and SUPC, but it still remains that only a handful of the affected Consortia and libraries are engaging with the project. PSI now has a live system for publishers to share information and block attacks. The system, IP-Intrusion.org, enables us to share the means to block attacks directly with Universities. IEEE have been instrumental in helping us set this up, so a special mention must go to them. If anyone wants to know more about this, please contact me on andrew@publishersolutionsint.com

Andrew, has PSI published any reports of the analysis you have done? I’m intrigued that you don’t see RA21 as the solution given how often I read people saying it is!

Hi Lisa, let’s arrange a call and I will share the evidence PSI has collected with you.
For RA21, I think they are doing a great job looking at “remote access”. This is where the problem lies and not “on campus”. The reason the Russians and Iranians are not hacking IP address validation via on campus access is because they actually have to be there. I have been advised that over 80% of usage comes from on campus access and I am not quite sure why RA21 is so adamant that it has to change IP address validation by making all access via username and password. Clearly I have a vested interest here, as IP addresses are fundamental for theIPregistry.org, but IP address validation is working effectively, and we have circa 60,000 libraries in the IP Registry successfully using IP validation. I would like to see greater security being a big part of any new solution but not at the expense of ease of access. Dual factor and CAPTCHA both add barriers to access and that is part of the problem that needs to be solved. When hackers can steal passwords and go into users’ profiles and change their cell phone numbers etc., they are already finding a way around dual factor, as Sci-Hub has with CAPTCHA. The stealing of credentials is huge business, and credentials are a main target for hackers, as the following article demonstrates https://hbr.org/2017/12/you-cant-secure-100-of-your-data-100-of-the-time. Therefore, solutions without usernames and passwords would be optimal; alternatively, in addition to usernames and passwords there should be another seamless verification method. In the latter case, monitoring stolen credentials and avoiding their theft in the first place are going to be crucial.
I have offered to work with RA21 and link theIPregistry.org to the pilots so they can work in parallel with IP validation until a better solution is found, but it seems, as Todd reiterated in his reply, that RA21 just want to do away with IPs and so are not willing to work with us. Our IP-Intrusion project could also help with security issues, and we are investigating with a major remote access provider how we link the two together.

Someone asked if this forum could get a comment from SPARC.
The note below was sent to them via their web contact form.
One wonders if the group has many ‘enablers’ or not….
– Bill
_____________________________________________________________
Could anyone at SPARC make some comment or response
to today’s post on “The Scholarly Kitchen” (link at bottom)?

This involved Sci-Hub and today’s news of an FBI indictment of
Iranian universities in stealing log-in credentials in numerous USA
academic libraries.

Putting these complexities in context might be very
helpful for everyone.

The headline includes the words “FBI Indicts…”
The FBI cannot indict anyone. Only a grand jury can do that.

When taxpayers who fund research are locked out of accessing its results by dizzyingly high subscription fees, I can understand why they think the people who charge the fees are the real thieves.

1) Not all research is funded.
2) Not all funded research is funded by taxpayers.
3) Not all funded research includes funds to pay for publication of that research.
4) It is unlikely that the taxpayers of Iran paid for the research that was stolen here.
5) It continues to confuse that so many are so focused on access to the stories written about research results rather than the research results themselves, which researchers and universities, with the encouragement of funding agencies, lock up behind patent paywalls that are vastly more expensive than journal subscriptions:
https://scholarlykitchen.sspnet.org/2013/08/06/is-access-to-the-research-paper-the-same-thing-as-access-to-the-research-results/

Thanks for approving my comment! It’s an important debate, and David, your points are valid against my over-generalisation there. I’m uncertain of the exact proportion of taxpayer funds for research and editorial boards, but I’ll warrant it’s high. 60-90%? Please enlighten me.

The cost of electronic publication is ridiculously cheap, I don’t see a strong argument on that front.

Surely *some* of the papers that were reproduced were funded by the Iranian taxpayer? Either way, it’s a shaky argument. If they did fund the research, then they’ve a right to a copy. If not, then they are clearly in most need of the scholarship while their GDP indicates they are amongst those who can least afford it. I believe that where inequality exists, the greatest benefit should go to the least advantaged.

Thanks also for the link to your post about STEM results being kept behind patent paywalls; well argued, and I agree an issue that also requires addressing.
