Back in 1963, Charles Schulz drew a prescient character, who was introduced in the strip below:

Back then, with computers edging into mainstream life, numbers were replacing names as the main way we identified ourselves. Instead of fighting this, 5’s parents gave in. His sisters were 3 and 4, “Nice feminine names,” as Charlie Brown once noted.

Fast-forward 50 years, and now any one of us can be quickly sequenced into our basic genetic code, which can then be stored as numbers in a database somewhere. Or a few cells can be preserved for later exploration, again reducing us to data. With vast networks, storage facilities, and computing power now available, we and the data that can be derived from us can mix and mingle in ways we’ve only started to contemplate in the era being dubbed “Big Data.”

Informed consent dates from the Nuremberg Trials, when Nazi atrocities led Western medicine and law to establish rules protecting patients from abuse by authorities. Yet problems continued, with UK citizens finding out in 2004 that physicians had been storing tissues, many from children and infants, without first obtaining patient or parental consent. In another case in the US, blood gathered for one purpose was also studied for another, one the participants never agreed to give blood for.

Samples are difficult to handle and transport. But data? They are very easy to handle and transport, but can be even more revealing and damaging, especially in the era of “Big Data,” when separate data-gathering initiatives can accidentally knit together, revealing secrets nobody agreed to share.

A recent article in Nature News covered emerging concerns around Big Data, informed consent, and medical testing. While the problems are abundant and worrisome, the answers are not at all clear.

The problems start with how the data are gathered. Are they opt-in? Or are they opt-out? The BioVu database at Vanderbilt Medical Center forces patients to opt out of inclusion in the biomedical database being assembled there. Some worry this takes advantage of people when they are ill and potentially desperate, when they might worry that not complying could put them at a disadvantage.

Then, how the data are stored — who has access to what, and what controls are in place to prevent intentional or unintentional misappropriation — becomes a real issue. The BioVu example has plenty of safeguards — genetic and patient data are stored in separate databases; patient records are scrambled and discarded at random; and sample collection dates are sometimes altered. Even all these precautions only generate a level of protection that makes it “difficult” to match records to particular patients.
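To make those safeguards concrete, here is a minimal sketch in Python of the three ideas mentioned above: separate stores for clinical and genetic data, randomly discarded records, and altered sample dates. This is not BioVu's actual implementation. The key, the field names, and the functions `pseudonym`, `shift_date`, and `deidentify` are all hypothetical, invented purely for illustration.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta

# Hypothetical secret held by a data steward; an assumption, not a real system's key.
SECRET_KEY = b"hypothetical-secret-held-by-a-data-steward"


def pseudonym(patient_id: str) -> str:
    """Derive a research ID with a keyed one-way hash of the patient ID."""
    digest = hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def shift_date(d: date, research_id: str, max_days: int = 30) -> date:
    """Shift a date by a per-patient offset between -max_days and +max_days."""
    offset = int(research_id, 16) % (2 * max_days + 1) - max_days
    return d + timedelta(days=offset)


def deidentify(records, drop_fraction=0.1, seed=0):
    """Split records into two stores keyed only by research ID.

    A random fraction of records is discarded, and sample dates are shifted.
    """
    rng = random.Random(seed)
    clinical_store, genetic_store = {}, {}
    for rec in records:
        if rng.random() < drop_fraction:
            continue  # discard this record at random
        rid = pseudonym(rec["patient_id"])
        clinical_store[rid] = {
            "diagnosis": rec["diagnosis"],
            "sample_date": shift_date(rec["sample_date"], rid),
        }
        genetic_store[rid] = {"genotype": rec["genotype"]}
    return clinical_store, genetic_store


# Made-up example records, not real patient data.
records = [
    {"patient_id": "MRN-001", "diagnosis": "asthma",
     "genotype": "AA", "sample_date": date(2013, 3, 1)},
    {"patient_id": "MRN-002", "diagnosis": "diabetes",
     "genotype": "AG", "sample_date": date(2013, 4, 15)},
]
clinical, genetic = deidentify(records, drop_fraction=0.0)
```

Even in a toy like this, anyone holding the key can regenerate the research IDs, and a rare diagnosis paired with an approximate date may still point to one person. That is why such measures make re-identification "difficult" rather than impossible.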

At a medical meeting I attended last week, a patient with a remarkable story talked about being invited to a party hosted by his physician, who also conducted research in the same area. Soon, the patient began to feel eyes turning toward him and to hear whispers around the room. Suddenly, someone burst out with, “Hey, everybody, this is Patient 219!” Applause followed, but confidentiality was broken. The patient had no opportunity to consent to having his identity revealed in public, even though spouses and dates, not just researchers, were likely in the room.

There are two sad aspects to the eager drive to “Big Data” and the answer we seem to be arriving at for dealing with the consequences. First, there is the unbridled and almost reckless enthusiasm to push people into databases, consequences be damned. Second, the answer to these consequences seems to be an abdication — make everything “transparent” to the patient, and let them control their data.

So, as “Big Data” emerges from “Big Brother,” we end up working for them both?

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


2 Thoughts on "Informed Consent and Big Data — With Great Power Comes Great Responsibility"

The issue is not big data so much as the end of privacy. Physicians and health care providers have managed sensitive information for centuries. That the medical record is now encoded in gigabytes of digital data does not qualitatively change the need to maintain confidentiality. What is changing is the loss of privacy for both patients and providers: the ability to operate free from the view of others and to be free from unwelcome intrusion. Our cell phones track our every move, literally; browsers and search engines record what we see; employers routinely track keystrokes; banks monitor and sell consumer purchase data; drug companies track physician prescriptions; and increasingly the computing power is there to assemble these disparate observations into a coherent view of the person. The real problem is not big data in the medical record. The real problem is the increasing accuracy and intrusiveness of all of the inferred medical records floating around cyberspace. Does Google know which of its many users have diabetes, heart disease or cancer? If you believe the answer is “no, not with any certainty,” you are only fooling yourself. Is Google subject to HIPAA? No. Could the medical information inferred by Google be used in ways that are harmful to patients? Absolutely. The big question is not “Is this happening now?” but “Do we and our patients have any way to know if this is happening now? Or if it is, to stop it?”

Patient 219 is an entirely different matter. When a physician invites a patient to a social gathering, many lines are being crossed. The physician-patient relationship is not a relationship among equals. There is no way that the patient could give uncoerced and informed consent to the consequences of being present at a social event.

These are compelling issues and important concerns, but I have sat at too many tables where they have been discussed to believe that informed consent (as it is understood in biomedical research) is anything like an adequate solution for the large-scale internet social systems from which “big data” is harvested. People do not read end-user license agreements; if they are not made to sit alone in a room and consult with a researcher on the tangible negatives of an experiment in which they are about to participate, they will not take sufficient pause to opt out of Twitter’s data collection.

The best we can do, as far as I’m concerned, is to increase awareness of how this data is used. One of the better examples of this I’ve seen is the dating site OKCupid’s (sadly now defunct) data analysis blog, where they pass on to their community insights such as “those of you who claim to like the taste of beer are more likely to be open-minded sexually.” While I recognize that the relative salaciousness of these findings automatically elevates them beyond Twitter’s terms of use for many readers, they do provide an excellent example of how this data is actually studied (in massive aggregate) and allow users to reflect on whether they want to be included in these particular information economies.
