Why big data for healthcare is dangerous and wrong

admin

June 8, 2012

The Mckinsey Global Institute recently published a report entitled – Big data: The next frontier for innovation, competition, and productivity .

The Mckinsey Global Institute report on big data is no more than a lengthy essay in fallacies, inflated hyperbole, faulty assumptions, lacking in evidence for its claims and ignoring the two most important stakeholders of healthcare – namely doctors and patients.
They just gloss over the security and privacy implications of putting up a big big target with a sign that says “Here is a lot of patient healthcare data – please come and steal me“.

System efficiency does not improve patient health

In health care, big data can boost efficiency by reducing systemwide costs linked to undertreatment and overtreatment and by reducing errors and duplication in treatment. These levers will also improve the quality of care and patient outcomes.
To calculate the impact of big-data-enabled levers on productivity, we assumed that the majority of the quantifiable impact would be on reducing inputs.
We held outputs constant—i.e., assuming the same level of health care quality. We know that this assumption will underestimate the impact as many of our big-data-enabled levers are likely to improve the quality of health by, for instance, ensuring that new drugs come to the market faster…
They don’t know that.
The MGI report does not offer any correlation between reduction in systemwide costs and improving the quality of care of the individual patient.
The report deals with the macroeconomics of the pharmaceutical and healthcare organization industries.
In order to illustrate why systemwide costs are not an important factor in the last mile of healthcare delivery, let’s consider the ratio of system overhead to primary care teams in Kaiser-Permanente – one of the largest US HMOs. At KP, (according to their 2010 annual report) – out of 167,000 employees, there were 16,000 doctors, and 47,000 nurses.
Primary care teams account for only 20 percent of KP head-count. Arguably, big-data analytics might enable KP management to deploy services in more effective way but do virtually nothing for the 20 percent headcount that actually encounter patients on a day to day basis.

Let’s not improve health, let’s make it cheaper to keep a lot of people sick

Note the sentence – “assuming the same level of health care quality”. In other words, we don’t want to improve health, we want to reduce the costs of treating obese people who eat junk food and ride in cars instead of walking instead of fixing the root causes. Indeed MGI states later in the their report:
Some actions that can help stem the rising costs of US health care while improving its quality don’t necessarily require big data. These include, for example, tackling major underlying issues such as the high incidence and costs of lifestyle and behavior-induced disease.

Lets talk pie in the sky about big data and ignore costs and ROI

…the use of large datasets has the potential to play a major role in more effective and cost-saving care initiatives, the emergence of better products and services, and the creation of new business models in health care and its associated industries.
Being a consulting firm, MGI stays firmly seated on the fence and only commits itself to fluffy generalities about the potential to save costs with big data. The terms ROI or return on investment is not mentioned even once because it would ruin their argumentation. As a colleague in the IT division of the Hadassah Medical Organization in Jerusalem told me yesterday, “Hadassah management has no idea of how much storing all that vital sign from smart phones will cost. As a matter of fact, we don’t even have the infrastructure to store big data”.
It’s safe to wave a lot of high-falutin rhetoric around about $300BN value-creation (whatever that means), when you don’t have to justify a return on investment or ask grass-level stakeholders if the research is crap.
MGI does not explain how that potential might be realized. It sidesteps a discussion of the costs of storing and analyzing big data, never asks if big data helps doctors make better decisions and it glosses over low-cost alternatives related to educating Americans on eating healthy food and walking instead of driving.

The absurdity of automated analysis

..we included savings from reducing overtreatment (and undertreatment) in cases where analysis of clinical data contained in electronic medical records was able to determine optimal medical care.
MGI makes an absurd assumption that automated analysis of clinical data contained in electronic medical records can determine optimal medical care.
This reminds me of a desert island joke.
A physicist and economist were washed up on a desert island. They have a nice supply of canned goods but no can-opener. To no avail, the physicist experiments with throwing the cans from a high place in the hope that they will break open (they don’t). The economist tells his friend “Why waste your time looking for a practical solution, let’s just assume that we have a can-opener!”.
The MGI report just assumes that we have a big data can-opener and that big data can be analyzed to optimize medical care (by the way, they do not even attempt to offer any quantitive indicators for optimization – like reducing the number of women that come down with lymphema after treatment for breast cancer – and lymphedema is a pandemic in Westerm countries, affecting about 140 million people worldwide.

In Western countries, secondary lymphedema is most commonly due to cancer treatment.Between 38 and 89% of breast cancer patients suffer from lymphedema due to axillary lymph node dissection and/or radiation.See :
^ Brorson, M.D., H.; K. Ohlin, M.D., G. Olsson, M.D., B. Svensson, M.D., H. Svensson, M.D. (2008). “Controlled Compression and Liposuction Treatment for Lower Extremity Lymphedema”. Lymphology 41: 52-63.

^ Brorson, M.D., H.; K. Ohlin, M.D., G. Olsson, M.D., B. Svensson, M.D., H. Svensson, M.D. (2008). “Controlled Compression and Liposuction Treatment for Lower Extremity Lymphedema”. Lymphology 41: 52-63.

^ Brorson, M.D., H.; K. Ohlin, M.D., G. Olsson, M.D., B. Svensson, M.D., H. Svensson, M.D. (2008). “Controlled Compression and Liposuction Treatment for Lower Extremity Lymphedema”. Lymphology 41: 52-63.

^ Kissin, MW; G. Guerci della Rovere, D Easton et al (1986). “Risk of lymphoedema following the treatemnt of breast cancer.”. Br. J. Surg. 73: 580-584.

^ Segerstrom, K; P. Bjerle, S. Graffman, et al (1992). “Factors that influence the incidence of brachial oedema after treatment of breast cancer”. Scand. J. Plast. Reconstr. Surg. Hand Surg. 26: 223-227.

More is not better

We found very significant potential to create value in developed markets by applying big data levers in health care. CER (Comparative effectiveness research ) and CDS (Clinical decision support) were identified as key levers and can be valued based on different implementations and timelines
Examples include joining different data pools as we might see at financial services companies that want to combine online financial transaction data, the behavior of customers in branches, data from partners such as insurance companies, and retail purchase history. Also, many levers require a tremendous scale of data (e.g., merging patient records across multiple providers), which can put unique demands upon technology infrastructures. To provide a framework under which to develop and manage the many interlocking technology components necessary to successfully execute big data levers, each organization will need to craft and execute a robust enterprise data strategy.
The American Recovery and Reinvestment Act of 2009 provided some $20 billion to health providers and their support sectors to invest in electronic record systems and health information exchanges to create the scale of clinical data needed for many of the health care big data levers to work.

Why McKinsey is dead wrong about the efficacy of analyzing big EHR data

The notion that more data is better (the approach taken by Google Health and Microsoft and endorsed by the Obama administration and blindly adopted by MGI in their report.
EHR is based on textual data, and is not organized around patient clinical issue.

Meaningful machine analysis of EHR is impossible

Current EHR systems store large volumes of data about diseases and symptoms in unstructured text, codified using systems like SNOMED-CT¹. Codification is intended to enable machine-readability and analysis of records and serve as a standard for system interoperability.

Even if the data was perfectly codified, it is impossible to achieve meaningful machine diagnosis of medical interview data that was uncertain to begin with and not collected and validated using evidence-based methods.

More data is less valuable for a basic reason

A fundamental observation about utility functions is that their shape is typically concave: Increments of magnitude yield successively smaller increments of subjective value.²
In prospect theory³, concavity is attributed to the notion of diminishing sensitivity, according to which the more units of a stimulus one is exposed to, the less one is sensitive to additional units.

Under conditions of uncertainty in a medical diagnosis process, as long as it is relevant, less information enables taking a better and faster decision, since less data processing is required by the human brain.

Unstructured EHR data is not organized around patient issue

When a doctor examines and treats a patient, he thinks in terms of “issues”, and the result of that thinking manifests itself in planning, tests, therapies, and follow-up.
In current EHR systems, when a doctor records the encounter, he records planning, tests, therapies, and follow-up, but not under a main “issue” entity; since there is no place for it.
The next doctor that sees the patient needs to read about the planning, tests, therapies, and follow-up and then mentally reverse-engineer the process to arrive at which issue is ongoing. Again, he manages the patient according to that issue, and records everything as unstructured text unrelated to issue itself.
Other actors such as national registers, extraction of epidemiological data, and all the others, all go through the same process. They all have their own methods of churning through planning, tests, therapies, and follow-up, to reverse-engineer the data in order to arrive at what the issue is, only to discard it again.
The “reverse-engineering” problem is the root cause for a series of additional problems:

Lack of overview of the patient
No connection to clinical guidelines, no indication of which guidelines to follow or which have been followed
No connection between prescriptions and diseases, except circumstantial
No ability to detect and warn for contraindications
No archiving or demoting of less important and solved problems
Lack of overview of status of the patient, only a series of historical observations
In most systems, no search capabilities of any kind
An excess of textual data that cannot possibly be read by every doctor at every encounter
Confidentiality borders are very hard to define
Very rigid and closed interfaces, making extension with custom functionality very difficult

Summary

MGI states that their work is independent and has not been commissioned or sponsored in any way by any business, government, or other institution. True, but MGI does have consulting gigs with IBM and HP that have vested interests in selling technology and services for big data.

The analogies used in the MGI report and their tacit assumptions probably work for retail in understanding sales trends of hemlines and high heels but they have very little to do with improving health, increasing patient trust and reducing doctor stress.

The study does not cite a single interview with a primary care physician or even a CEO of a healthcare organization that might support or validate their theories about big data value for healthcare. This is shoddy research, no matter how well packaged.
The MGI study makes cynical use of “framing” in order to influence the readers’ perception of the importance of their research. By citing a large number like $300BN readers assume that impact of big data is well, big. They don’t pay attention to the other stuff – like “well it’s only a potential savings” or “we never considered if primary care teams might benefit from big data (they don’t).

At the end of the day, $300BN in value from big data healthcare is no more than a round number. What we need is less data and more meaningful relationships with our primary care teams.

1ttp://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html

2 Current Directions in Psychological Science, Vol 14, No. 5 http://faculty.chicagobooth.edu/christopher.hsee/vita/Papers/WhenIsMoreBetter.pdf

3 http://www.princeton.edu/~kahneman/docs/Publications/prospect_theory.pdf