Health Data Trafficking

Shady man selling medical records, as imagined by Co-Pilot/DALL-E

I’m writing this newsletter on Martin Luther King Jr. Day 2025, which also happens to be the day of President Trump’s second inauguration. On one hand, writing is a useful distraction from today’s political news. On the other hand, my topic (data privacy) has huge political overtones in today’s AI era. (Over half of the "famous people" on the Wall Street Journal’s list of notable people attending today's inauguration were Big Tech CEOs, most of whom are greedily consuming as much of your private data as they can get their hands on.)

The specific trigger for today’s topic was a recent podcast interview in which two academic colleagues of mine interviewed a sales rep from Clarivate, one of the many data brokers that buy and sell Americans' personal health data. Now, Clarivate doesn’t call itself a broker on its website. Instead it positions itself as an analytics company: “We connect people and organizations to the intelligence they can trust to transform their perspective, their work and our world.” Other health data brokers use similar marketing language. Truveta says that they “Accelerate adoption of new therapies and advance patient care.” And IMS Health, the granddaddy of health data brokers (founded by none other than Arthur Sackler of the infamous Sackler family) says that they’re “Powering healthcare with collected intelligence.” (Note that IMS Health has since been rebranded as IQVIA following their merger with the Quintiles clinical trial company.)

Historically, health data brokers have been happy to stay out of the spotlight. (How many Americans have ever heard of IQVIA/IMS Health, even though it’s bought and sold data on pretty much anyone using the US healthcare system over the past half century?) So I’m thrilled that Mike Astion and Geoff Baird invited an industry representative onto their podcast to shed some light on the industry’s practices. Mike and Geoff are both professors of pathology at the University of Washington; Geoff happens to be the department chair. Credit also to Bridget Wegner, Clarivate’s director of clinical laboratory data partnerships, for her willingness to be interviewed.

So how do healthcare data brokers operate? Who are their main customers and what are they using our data for? And how do they manage privacy risks? Here’s what I gathered from the interview:

Essentially, Clarivate purchases de-identified medical records from hospitals, then cross-links those records with claims data and other data sets in order to create bespoke analytics to sell to pharmaceutical and other life sciences companies.

The Clarivate rep tried hard to create the impression that the purpose of all this is life-saving research. But when pressed, she conceded that the main use case is actually to support pharma sales reps. For example, suppose you’re a pharmaceutical company with a high-priced therapy for a particular rare disease. You already have an extensive database of US physicians which you purchased from the American Medical Association, as well as detailed prescribing data on all those physicians which you purchased from IQVIA, who in turn purchased it from pharmacies. But what you really want to know is which doctors have patients who are potentially eligible for your drug, but aren’t receiving it, because those are the docs you want your sales reps to call on (“detail”). And that's what Clarivate provides.

What about privacy protections for the patients whose data is being sold? The Clarivate rep indicated that her company complies scrupulously with existing privacy laws, and I don’t doubt her. But that isn’t the same as respecting the ethical dimensions of privacy.

  1. They do not have any mechanisms for patient consent, either opt-in or opt-out. In other words, patients don’t give consent for their data to be sold to pharma, and there’s also no way for patients to request after the fact that their data be deleted. Now, part of the problem here is that US hospitals and health systems haven’t created systems on their end to support informed consent for data sharing. Geoff Baird acknowledges this in the podcast with respect to his own institution, and this paper on hospital data sharing practices that I co-authored last year with a number of data privacy experts indicates that the lack of informed consent at the University of Washington is the norm, not the exception.
  2. The reason why consent is not legally required under US HIPAA law is that all of the data is de-identified by removing certain data fields. Now, de-identification sounds like something that would protect privacy. The Clarivate rep certainly played on this perception. She used the word “de-identified” repeatedly throughout the interview in order to reassure listeners that their business model respects privacy. But does de-identification really solve the problem? The rep also described how de-identified data sets are combined with other data, including claims data, in order to track patients across their entire clinical journeys, and to link them to identified physicians (e.g. for followup by pharma reps). In other words, they are re-identifying the data in a practical sense, even if Clarivate and their customers never actually see the names, birthdates, or social security numbers of the patients.
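To make the re-identification point concrete, here is a minimal, purely illustrative Python sketch of how two “de-identified” data sets can be joined on quasi-identifiers (3-digit ZIP, birth year, sex) to reconstruct a patient journey and surface the identified prescribing physician. All names, fields, and values are invented for this example; this is not Clarivate’s actual pipeline, just the generic linkage technique the interview describes.

```python
# Hypothetical example: linking "de-identified" records across data sets.
# Neither data set contains names, SSNs, or full birthdates, yet joining
# on shared quasi-identifiers reconstructs an individual's clinical journey.

# De-identified hospital records (all data invented).
hospital_records = [
    {"token": "a91f", "zip3": "981", "birth_year": 1972, "sex": "F",
     "diagnosis": "rare_disease_X"},
    {"token": "b22c", "zip3": "606", "birth_year": 1985, "sex": "M",
     "diagnosis": "hypertension"},
]

# Claims data from a different source, "de-identified" the same way,
# but carrying the fully identified prescribing physician.
claims = [
    {"zip3": "981", "birth_year": 1972, "sex": "F",
     "drug": "generic_standard_care", "physician": "Dr. Alice Smith"},
    {"zip3": "606", "birth_year": 1985, "sex": "M",
     "drug": "lisinopril", "physician": "Dr. Bob Jones"},
]

def link(records, claims_rows):
    """Join two 'de-identified' data sets on shared quasi-identifiers."""
    key = lambda r: (r["zip3"], r["birth_year"], r["sex"])
    claims_by_key = {key(c): c for c in claims_rows}
    linked = []
    for r in records:
        match = claims_by_key.get(key(r))
        if match:
            linked.append({**r, **match})  # one merged patient "journey"
    return linked

journeys = link(hospital_records, claims)

# An analyst can now flag physicians whose patients carry the target
# diagnosis but are not yet on the high-priced therapy.
targets = [j["physician"] for j in journeys
           if j["diagnosis"] == "rare_disease_X"
           and j["drug"] != "brand_therapy_Y"]
print(targets)  # → ['Dr. Alice Smith']
```

Note that no prohibited HIPAA identifiers appear anywhere in the merged rows; the “re-identification” happens functionally, at the level of targeting decisions, which is exactly the gap between legal compliance and ethical privacy.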

How about the justice principle in medical ethics? Geoff Baird raises this toward the end of the interview when he asks whether patients should receive a cut of the profits from commercial data sharing. The Clarivate rep doesn’t offer much of a response. Obviously, if there’s no mechanism for tracking individual patient consent, it follows that there’s also no mechanism for tracking individual patients in order to compensate them. But what’s also missing here is any consideration of community-level compensation, e.g. sharing the profits with a charity that could reflect the collective interests and values of patients whose data is being exploited. Maybe brokers could use half of their profits to fund care for the underinsured, or scientific research, or maybe even research into health data privacy?

Privacy is a key element of what it means to be an autonomous human. Big Tech’s attitude, famously articulated by Sun Microsystems’ CEO Scott McNealy back in 1999 as “You have zero privacy anyway. Get over it.”, wasn’t defensible 25 years ago, and it’s even less defensible now that we see how easily companies like Meta and Google are able to track our every move. Trafficking in highly sensitive data such as health data is even worse. While we’re waiting for legislative solutions (and I suspect we’ll be waiting a long time in the US; the Europeans at least have GDPR), we ought to put pressure on hospitals, clinics, and health systems to enact strong and comprehensive patient privacy controls. Data brokers are bound by whatever contractual restrictions their data providers place on them. It’s the hospitals, clinics, and health systems who are the primary stewards of personal health data, and it’s past time for them to step up to this responsibility. As Paul DeMuro and Carolyn Petersen laid out in this 2019 article, their data relationship with patients should be as health data fiduciaries, and not just as transactional providers of medical services.

Subscribe to Hippocratic Capitalism
