“Imagine you walk into a physician’s office with some strange things happening to you,” says Ricardo Pietrobon, M.D. “Unknown to you and your doctor at that point you have a rare disease that will kill you if it goes without a diagnosis and appropriate treatment. At that moment,” he continues, “your life and wellbeing rely almost entirely on the information the physician might have inside her brain: her previous experience with similar patients and what she might have read in the scientific literature. If any of those fail her and she makes a bad decision, you are now either dead or facing a miserable life until your last day.”
But it doesn’t have to be that way, Dr. Pietrobon told the CIMS Fall 2012 Conference. As Associate Professor of Surgery and Associate Vice Chair for Research Processes and Innovation at Duke University Medical Center, he and colleagues have been running experiments in applying “Big Clinical Data” to support medical decisions. Here he summarizes his presentation to the CIMS Conference (http://goo.gl/8n3Az).
Welcome to the age of decision support systems aided by big clinical data. By combining 1) the huge amount of data collected by hospitals in electronic health records, 2) massive numbers of scientific articles available from the international research community, 3) the personal experience of your doctor, and 4) your own choices, it is now becoming possible to have the best of each of these worlds in a system that will assist you and your doctor in reaching the best possible course of action.
Our group at Duke, in conjunction with CIMS, is attempting to tackle small parts of this massive problem using methods borrowed from what is now known as Big Clinical Data (1). Although there are probably as many definitions of what Big Clinical Data means as there are people attempting to define it, my current definition is that it represents a set of activities related to 1) data storage, linkage, and organization, 2) modeling and analysis, that is, extracting useful information from a vast amount of numbers and text, and 3) making that information ready to be used in a valuable manner.
Here are three examples of the work we are doing in each of these areas, with a preview of peer-reviewed publications that will be coming out later in 2013.
The first example relates to what our academic group calls “data enrichment.” Traditionally, different patient databases could not be aggregated unless both of them were talking about the same individual. For example, if Joe’s information about his surgery was in one data set and his cost data in another, I would be able to bring the two together. But if information about Joe’s high blood pressure had never been collected anywhere, researchers would be out of luck in knowing whether he had high blood pressure. With data enrichment techniques, our group is now creating a very large cloud of immediately accessible clinical data sets that allows biomedical researchers to make the best possible guess about, for example, whether Joe had high blood pressure.
The way this works is fairly intuitive: We first look up databases containing information on patients who match Joe on a number of characteristics, say his other diseases, age, gender, geographic location, and race/ethnicity. Based on these common, matching characteristics, we define what we call “Joe’s twins,” people who are identical to Joe on every one of them. The premise is that if these people are similar to Joe in all of these characteristics, they are also going to be similar in having or not having high blood pressure, ultimately allowing us to determine our best guess about Joe’s high blood pressure.
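The twins idea above can be sketched in a few lines. This is a minimal illustration only: the pooled records, field names, and matching keys are invented for the example, and a production system would match across far more variables and many more databases.

```python
from collections import Counter

# Hypothetical pooled records from several data sets; the schema is
# illustrative, not an actual clinical one.
ENRICHMENT_POOL = [
    {"age_band": "50-59", "gender": "M", "region": "NC", "diabetic": True,  "hypertension": True},
    {"age_band": "50-59", "gender": "M", "region": "NC", "diabetic": True,  "hypertension": True},
    {"age_band": "50-59", "gender": "M", "region": "NC", "diabetic": True,  "hypertension": False},
    {"age_band": "20-29", "gender": "F", "region": "CA", "diabetic": False, "hypertension": False},
]

MATCH_KEYS = ("age_band", "gender", "region", "diabetic")

def guess_hypertension(patient):
    """Find the patient's 'twins' (exact matches on MATCH_KEYS) and return
    the majority hypertension status among them, or None if no twins exist."""
    twins = [r for r in ENRICHMENT_POOL
             if all(r[k] == patient[k] for k in MATCH_KEYS)]
    if not twins:
        return None
    votes = Counter(r["hypertension"] for r in twins)
    return votes.most_common(1)[0][0]

# Joe's high blood pressure was never recorded, but his twins suggest a guess.
joe = {"age_band": "50-59", "gender": "M", "region": "NC", "diabetic": True}
print(guess_hypertension(joe))  # -> True (2 of Joe's 3 twins are hypertensive)
```

Real data-enrichment pipelines would typically use statistical imputation or probabilistic matching rather than exact matching, but the majority-vote-of-twins logic captures the intuition described above.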
While the technique itself is no rocket science, what is new is the infrastructure that puts a multitude of databases at researchers’ fingertips, allowing them to make those guesses, something that was not possible before.
Collecting Care Quality Data
The second example relates to monitoring healthcare quality using cost measures. The concept is straightforward: Most hospitals and clinics are simply not equipped to collect data on the quality of care provided to patients. Doing so usually requires dedicated personnel, databases, and a significant amount of time that healthcare professionals could otherwise spend with patients. In contrast, every single hospital or clinic collects cost data.
The novel concept is that cost data can be used as a thermometer for quality of care. For example, when a group of patients in a given hospital start having a higher post-operative complication rate or requiring unexpected readmissions to the hospital, the costs for these patients will increase significantly.
Looking now from a detection perspective, every time an administrator sees a greater degree of variability, a red flag should go up indicating the need for a deeper investigation into its causes. If the variability turns out to be related to healthcare quality, an opportunity is created to address those causes. In our group, this type of analysis is conducted with a host of open source tools, the central one being the statistical language R (2).
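The detection step above amounts to flagging unusual cost variability. The group’s own analyses are done in R; as a language-neutral sketch, here is a simple z-score screen in Python. The cost figures and the two-standard-deviation threshold are invented for illustration, not taken from Duke’s work.

```python
import statistics

def flag_cost_outliers(monthly_costs, z_threshold=2.0):
    """Return the indices of months whose average per-patient cost deviates
    from the overall mean by more than z_threshold standard deviations.
    A real quality-monitoring screen would use risk-adjusted costs and a
    proper control-chart method; this only shows the red-flag idea."""
    mean = statistics.mean(monthly_costs)
    sd = statistics.stdev(monthly_costs)
    return [i for i, c in enumerate(monthly_costs)
            if sd > 0 and abs(c - mean) / sd > z_threshold]

# Stable costs, then a spike (e.g. a run of complications and readmissions)
costs = [10200, 9800, 10050, 9900, 10100, 9950, 10000, 14500]
print(flag_cost_outliers(costs))  # -> [7]
```

An administrator would treat each flagged index as a prompt for deeper investigation, not as proof of a quality problem, since costs can also spike for benign reasons such as case-mix changes.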
Data-Driven Decision Support
The third example relates to data-driven decision support at the point of care. Duke currently hosts anonymized data from close to 40 electronic health records around the country. In a project led by Dr. Kenneth Gersing at Duke and Dr. Ketan Mane at the Renaissance Computing Institute (RENCI), we now use this massive data set to predict which medications will lead to the best results for individual patients. For example, imagine a patient with depression who might not be doing well. She comes to the office, and this predictive system analyzes her past, compares her to a massive data set, and then identifies the medication most likely to work best for her.
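The prediction step described above can be sketched as a similarity-based lookup: find patients like this one, then rank medications by how often similar patients improved on each. To be clear, this is a toy illustration of the general idea, not the Duke/RENCI system; the records, field names, and medication names are all invented.

```python
# Invented records pooled across electronic health records.
RECORDS = [
    {"age_band": "30-39", "prior_episodes": 2, "medication": "drug_a", "improved": True},
    {"age_band": "30-39", "prior_episodes": 2, "medication": "drug_a", "improved": True},
    {"age_band": "30-39", "prior_episodes": 2, "medication": "drug_b", "improved": False},
    {"age_band": "30-39", "prior_episodes": 2, "medication": "drug_b", "improved": True},
    {"age_band": "60-69", "prior_episodes": 0, "medication": "drug_b", "improved": True},
]

def best_medication(patient):
    """Among records for patients similar to this one, return the medication
    with the highest observed improvement rate, or None if no similar
    patients are found."""
    similar = [r for r in RECORDS
               if r["age_band"] == patient["age_band"]
               and r["prior_episodes"] == patient["prior_episodes"]]
    rates = {}
    for r in similar:
        rates.setdefault(r["medication"], []).append(r["improved"])
    if not rates:
        return None
    return max(rates, key=lambda m: sum(rates[m]) / len(rates[m]))

patient = {"age_band": "30-39", "prior_episodes": 2}
print(best_medication(patient))  # -> 'drug_a' (improved 2/2 vs 1/2 for drug_b)
```

A real system would weigh many more clinical variables, adjust for confounding, and report uncertainty alongside its suggestion, which is precisely why the output should be treated as support for the clinician rather than a decision in itself.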
Emphasis on Support
There is really no going back when embarking upon the Big Clinical Data path to decision support. That being said, it is important to emphasize the support concept. In other words, data does not drive decisions and is no replacement for common sense and experience.
While the pharma industry is already actively engaging with Big Clinical Data, healthcare is just beginning to contemplate its use. In order to get there, two steps are essential. First, data should be made interoperable, ensuring that different data sets use similar standards so that they can be combined. Second, massive data sets must actually be assembled, since it is only at that scale that the full power of data discovery is realized.
Finally, it is important to recognize that simply analyzing data is not enough. One has to be willing to listen to what the data says and then — the most important point — act upon it.
Ricardo Pietrobon, MD, PhD, MBA
Duke University Medical Center