We may live in the age of privacy nihilism, but recognizing that reality does not mean agreeing to conduct one's own work on its terms. This post is for publishers, academic and research librarians, and others who conduct research on user behavior in library information systems and who, whether for personal or professional ethical reasons or because of institutional policies, want to do so in ways that prioritize privacy.
Situating Myself and Academic Librarianship
A bit of my own background is probably useful to contextualize this discussion. My own attention to this topic of privacy and user data came into focus when I led the launch of the Value of Academic Libraries Initiative as President of the Association of College and Research Libraries (ACRL) in 2010-2011. Grounded in The Value of Academic Libraries: A Comprehensive Research Review and Report, my work that year and since then has been heavily focused on advocating for the profession to move to evidence-based claims for library value and for the collection and analysis of individual user data in order to do so. This work has been heavily criticized for its focus on collecting user data and, at times, for facilitating the neoliberal transformation of higher education.
Given that, I have also had to confront hard questions about how gathering and analyzing user data aligns with the values of my profession, specifically the value of privacy as expressed in the ALA Code of Ethics statement that: “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.” These questions have not had easy or straightforward answers, particularly as the value of privacy can be in tension with another principle in the ALA Code of Ethics: “We provide the highest level of service to all library users.” I’m grateful to Andrew Asher who joined me in a series of public presentations exploring these issues (e.g., CNI Fall 2014).
Serving as a member of the NISO Privacy Principles working group and training librarians in the ACRL Assessment in Action program and the ACRL Standards for Libraries in Higher Education roadshow have provided continued opportunities to reflect deeply about the challenges librarians are facing in this realm. Recently, I was part of the national convening for Library Values & Privacy in Our National Digital Strategies.
All of this is to say that I have spent enormous amounts of time and energy engaging with the library and publishing communities around these topics. Librarians — myself included — have been and continue to be deeply engaged with a struggle to reconcile theory and practice, particularly as their values often put them at odds with both their own institutions as well as dominant commercial players upon which they rely to provide information services to library users.
So, can you prioritize privacy in user research? Simply put — yes. Will it be cost-free to your project or your organization to do so? Simply put — no. We have to accept some limitations on our user research and its potential applications in order to prioritize privacy. We also have to accept some limitations on privacy in order to conduct user research.
The questions at hand are how the two will be negotiated against each other, what promises are made to research participants, and how we can ensure those promises are kept throughout the stages of data collection, analysis, reporting, preservation, etc.
Before we go further, I want to untangle three terms that I find are often confused and conflated in discussions of privacy and user data: privacy, confidentiality, and anonymity. I think it is noteworthy that the ALA Code of Ethics uses both of the terms — privacy and confidentiality — in its statement. This is already a tip-off that the two are not interchangeable. ALA provides an explainer that defines the two terms: “In a library, the right to privacy is the right to open inquiry without having the subject of one’s interest examined or scrutinized by others. Confidentiality exists when a library is in possession of personally identifiable information about users and keeps that information private on their behalf. Confidentiality is a library’s responsibility.”
As one way of paraphrasing this, confidentiality is a mechanism for a library to have and use data while protecting privacy.
One might wonder why the mechanism is not anonymity? After all, wouldn’t that be the most privacy-protecting approach of all if a user is not and cannot be identified? Indeed, it would be. To collect no data about users at any time would be the most privacy-protecting approach. It is also not possible to manage a library effectively, which is a community good, if you do not collect any user data. For example, tracking who has which book checked out currently is fundamental to stewarding the collection. Monitoring how many hours a user has reserved media equipment in a given week is fundamental to stewarding access to limited resources. For rare books/special collections, best practices are to create a permanent record of who uses what and for what purpose. Given all this, ALA advises that “librarians should limit the degree to which personally identifiable information is monitored, collected, disclosed, and distributed.”
From this, we can derive a first principle of Limitation – collect only what is necessary. Without a doubt we can debate what is necessary but we have at least moved out of the realm of assuming that one should just collect data regardless. I’ll make an additional note that this is an affirmative statement of contemporary judgement — it isn’t sufficient that such data could be judged necessary in the future but rather that it is judged to be so now.
A second principle that we can derive from the above discussion is Protection — prevent examination or scrutiny by others. A library (or other organization) that has data that it has determined necessary to collect and use should not share that data with others who would examine or scrutinize it. This means that the data is secured and managed by the library and not transferred to others. As a side note, it may be tempting to think that the data can be anonymized before sharing and thus not create issues for privacy or confidentiality but one should be very careful about assuming this. Re-identification (or de-anonymization) of data has turned out to be easier and more successful than many might think it could be.
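The re-identification risk mentioned above can be made concrete with a small sketch. The datasets, names, and fields below are entirely hypothetical; the point is only to show the mechanism: an "anonymized" usage log that retains quasi-identifiers (here, department, status, and campus) can be joined against a public directory that shares those fields, uniquely identifying individuals.

```python
# Hypothetical illustration of re-identification via quasi-identifiers.
# Names were removed from the usage log, but department, status, and
# campus remain -- and those same fields appear in a public directory.

anonymized_log = [
    {"dept": "History", "status": "faculty", "campus": "North", "resource": "Rare Maps"},
    {"dept": "Physics", "status": "grad", "campus": "North", "resource": "Database X"},
]

public_directory = [
    {"name": "A. Smith", "dept": "History", "status": "faculty", "campus": "North"},
    {"name": "B. Jones", "dept": "Physics", "status": "grad", "campus": "North"},
    {"name": "C. Lee", "dept": "Physics", "status": "grad", "campus": "South"},
]

QUASI_IDENTIFIERS = ("dept", "status", "campus")

def reidentify(log, directory):
    """Return (name, resource) pairs for log records that match
    exactly one directory entry on the quasi-identifiers."""
    hits = []
    for record in log:
        key = tuple(record[f] for f in QUASI_IDENTIFIERS)
        matches = [p for p in directory
                   if tuple(p[f] for f in QUASI_IDENTIFIERS) == key]
        if len(matches) == 1:  # a unique match identifies the individual
            hits.append((matches[0]["name"], record["resource"]))
    return hits

print(reidentify(anonymized_log, public_directory))
# → [('A. Smith', 'Rare Maps'), ('B. Jones', 'Database X')]
```

Both "anonymized" records are re-identified because their quasi-identifier combinations are unique in the directory; this is why simply stripping names before sharing data is not a reliable privacy protection.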
A corollary to the principle of Protection is the principle of User Control — an individual may choose to share or to not share data about themselves. This decision belongs to the user and not to the library, and for the user to exercise full agency in this decision requires fully informed consent with regard to what is collected and how it will be used and managed. A library may ask permission to collect or share data but must be careful to ask in a way that does not pressure the individual to agree. In spite of this principle, we must recognize that it is common for data to be collected as a condition of using a third-party tool or service in libraries; in such cases transparent disclosure about data practices is critical, though we should recognize that the individual’s choice is constrained.
Applying Principles to User Research
I’ve situated this discussion in the ethics of librarianship because, regardless of who is conducting the research per se, research in library information systems is research in libraries, and the ALA Code of Ethics applies not just to librarians (individuals) but to libraries (institutions). However, as we move to focus on the application of these principles to user research, I think we can also benefit from drawing upon the principles of Respect for Persons, Beneficence, and Justice, which are the underlying principles of human subjects research review in the United States as codified in the Common Rule. These principles are most commonly encountered in Institutional Review Board (IRB) processes, which require their application to informed consent, assessment of risks and benefits, and selection of subjects.
Each of these principles is explained in the Belmont Report. Respect for persons means that individuals should be treated as autonomous agents and that persons with diminished autonomy are entitled to protection. Beneficence means treating persons in an ethical manner by respecting their decisions and protecting them from harm and by making efforts to secure their well-being, including an evaluation of risk against benefit. Justice means that benefits and burdens of research are distributed fairly.
In addition to applying these two sets of principles generally, I highlight the following when coaching others on privacy and data in the context of user research:
- Presume you need permission from users in order to collect and use data about their information behaviors and to share it with others. Insist on disclosure of data practices in cases in which data is collected as a condition of using a service or tool.
- Carefully specify the user data that will be collected and how it will be managed securely throughout the processes of collection, storage, analysis, reporting, and preservation so that the practices are sufficiently detailed as to be followed by anyone who might have access to the data. It is not enough that you as the creator of the dataset can understand the procedures. They must be documented for others. (Note: Most IRB processes place heavy emphasis on how user participation is solicited and how data is collected and stored. Other stages in the process typically receive little to no examination. User research is particularly vulnerable to privacy breaches because reporting detailed analysis can result in sharing findings for small n groups such that identification of individuals is possible from the analysis output.)
- Don’t confuse anonymity, confidentiality, and privacy. In particular, be very careful that you do not promise research subjects anonymity when your data practices only support confidentiality and be certain to communicate any limitations on the promise (e.g., a court order to disclose). Confidentiality is a mechanism for prioritizing privacy. But, confidentiality is not anonymity.
- If you ever find yourself saying “but, if I tell them that, then users won’t agree to participate in my research,” take that as your conscience waving a red flag at you. It is probably a sign that you need to tell your users exactly whatever you are thinking you would rather not.
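One common mitigation for the small-n reporting risk noted in the list above is to suppress any reported group that falls below a minimum size before publishing aggregates. This is a minimal sketch, with a hypothetical threshold of 5; the function name and data are illustrative, not a standard:

```python
from collections import Counter

MIN_CELL_SIZE = 5  # hypothetical reporting threshold; set per your data policy

def safe_counts(categories, min_n=MIN_CELL_SIZE):
    """Aggregate category counts, suppressing any group smaller than
    min_n so that small groups cannot be traced to individuals."""
    counts = Counter(categories)
    return {group: (n if n >= min_n else "<suppressed>")
            for group, n in counts.items()}

# Illustrative survey data: two Classics respondents would be easy to
# identify if their exact count were published alongside other details.
survey_majors = ["History"] * 12 + ["Physics"] * 7 + ["Classics"] * 2
print(safe_counts(survey_majors))
# → {'History': 12, 'Physics': 7, 'Classics': '<suppressed>'}
```

A suppression rule like this belongs in the documented data-handling procedures described above, so that anyone reporting from the dataset applies it consistently.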
A Final Note
I’m not naïve. In reality, these are strategies for prioritizing privacy; they do not guarantee privacy. The privacy nihilism article cited in the first paragraph felt very real to me when I read it. Many library users are simultaneously logged into Google and Facebook services when they are using library resources. Data breaches are all too common and even the most careful individual can make a mistake. But, for all that, I’m not willing to give up on my ethics just yet.
I also don’t want to suggest that I’m offering the final and definitive take on these issues. These discussions are complex and evolving, just as the technologies and environments in which user data tracking occurs are. I’m looking forward to diving in more deeply to questions about privacy and web analytics at the upcoming National Web Privacy Forum and also invite anyone who is interested to join me as a participant in the Digital Library Federation’s Technologies of Surveillance Working Group.
7 Thoughts on "Privacy in User Research: Can You?"
In the EU, the General Data Protection Regulation builds into law all of what Lisa rightly recommends, and more (e.g., the right of people to inspect what is held about them, their right to insist on rectification of inaccurate data held about them, their right to sue for breaches of the law, and their right under some circumstances to have data about them erased). A pity the USA hasn’t adopted a similar law.
GDPR does indeed cover much of this. It remains to be seen though if there will be adequate monitoring or enforcement. It seems mixed at best. I’d also note that, unless I’ve missed it, GDPR does not insist on beneficence or justice as an underlying consideration re what one is allowed to ask people to consent to. I haven’t seen that anything is prohibited – just that the user must consent. I’m not sure how it protects people of diminished autonomy either.
Regardless, I find value in articulating ethical principles even if it turns out they are already incorporated (in part) in some laws (side note – especially since the world is bigger than the US and EU). At a minimum, such principles can help to evaluate a law. They can also keep us ethical if the law changes/isn’t enforced.
GDPR must be followed by many, many non-EU entities, including many US and other publishers with European customers/users if I understand correctly. So, many of its benefits (and costs) are applied well beyond EU citizens.
Certainly that is what GDPR asserts. Whether businesses are being monitored or GDPR is being/can be enforced outside the EU is still an open question. It is definitely an open question if GDPR compliance is being monitored with respect to on-the-ground UX research by, for example, academic librarians in the EU.
I’m not anti-GDPR – I’d love to see it be effective … I just think ethics matter whether they are enshrined in legal requirements or not.
Agreed. I was responding to Charles’s “pity” — threading wasn’t working for me.
Specifically with regard to data use and privacy for user research: the Market Research Society in the UK has a code of conduct (https://www.mrs.org.uk/standards/code_of_conduct) which all research practitioners in the UK should sign up to. ESOMAR (the representative body for the market research profession globally) and the American Market Research Society have similar codes of conduct. These concentrate on respondent rights and treatment and are a good basis for how you should recruit and manage respondents and report your findings for both quantitative and qualitative projects. These organisations are really good sources of information on how to manage respondent privacy, anonymity, and confidentiality as well, so if you are ever stuck they are worth contacting.