Sometimes it is important to read a book despite suspecting that you will disagree with it. This was the case with “Our bodies, our data”, by Adam Tanner. First of all, by way of disclosure, I have worked for years developing and using data resources involving fully de-identified information extracted from electronic health records. I see the responsible, legally compliant use of this data as an important and unique research asset and am very concerned that a push to require patient assent for their de-identified data to be included in these resources could impair potentially life-saving, health-enhancing and policy-improving research.
Second, I am a strong believer in and proponent of patient privacy. I believe that you can have both appropriate research using large volumes of medical data and privacy.
The subtitle for this book is “how companies make billions selling our medical records”. The premise is that there are businesses selling medical data without the approval of patients. The tone is sinister, with chapter titles like “Data bonanzas for pharmacies and middle men” and “The Covert alliance”. The data of concern includes prescriptions, medical claims and data derived from electronic health records. Tanner provides an interesting and informative history of companies like IMS. He describes how IMS assembles data about prescriptions and then provides that data to pharmaceutical detailers, who use the information in their sales conversations with physicians.
Where I disagree with this book is the use of inflammatory labels such as “the trafficking of data” and a persistent tone of suspicion. In my experience, there are many positive and beneficial uses of appropriately de-identified medical data. Research using this data has yielded important findings that would not be possible with traditional, expensive, randomized clinical trials. For example, I recently contributed to a paper in the Journal of the American College of Cardiology in which we examined serum magnesium levels for patients who had experienced a heart attack. In an era in which biomedical research funding is consistently in the crosshairs, this type of research plays an important role in generating societal value from data generated by the healthcare process. Furthermore, compared with other epidemiological methods which involve surveying patients and providers, the retrospective analysis of de-identified healthcare data provides important information about healthcare as it is practiced in the “real world”, not how we want it to be. This can provide important insights into health disparities, patterns of care that lead to the best outcomes and other information that may not always be reported accurately in surveys. Any requirement that patients provide individual approval for the inclusion of their data in these resources would add a significant bias to the data and would impair the ability to accurately recognize disparities.
The book acknowledges valid concerns around sensitive health data, especially mental health reports. It was the risk of mental health data being exposed that triggered Deborah Peel, a psychologist and vocal privacy advocate, to launch her crusade against most sharing of clinical information. Tanner portrays Peel as being at the extreme end of the privacy spectrum and quotes David McCallie, a friend and mentor of mine at Cerner. McCallie acknowledges his respect for Peel but also questions her objectivity.
The book also addresses concerns about whether or not de-identified data is truly untraceable. Indeed, one type of data, genomic information, can be re-identified and the risk of this should be acknowledged in consent forms.
I encourage you to read this book and form your own opinions. Keep the following considerations in mind:
- What is the cost of impeding research that poses no individual risk? Most public health and health policy research uses surveys of small cohorts. The responses to surveys are not always accurate. “Real world” data provides unique and important insights.
- Which poses a more significant risk – the theft of identified data through system intrusions or the appropriate release of de-identified data? It is important that authors be rigorous about differentiating the two rather than blurring them. The release of identified medical data poses a major risk but is an entirely separate issue.
- The most sensitive information, for example, mental health reports, is not typically included in research data sets because text documents cannot be reliably de-identified. Likewise, genome analyses are not distributed in these data sets. It is important that authors understand the scope of what is and is not included.
- How many of the valid concerns articulated in “Our bodies, our data” can be addressed through more effective and visible enforcement of HIPAA, conflict of interest policies and GINA?