There is growing interest among data custodians in monetizing health data. The business drivers are varied, but in all cases the data is seen as a valuable asset that can generate revenue for the organization. In fact, PwC estimated in 2009 that the market value for health data (mainly claims-based data) purchased by life science companies and others was over $6 billion, while the market for data supporting outcomes studies was more than $900 million.
This data monetization trend is relevant to privacy officers for two reasons:
- Privacy officers are increasingly being brought into business conversations about data monetization. It is important for them to understand that landscape so they can provide constructive input that enables monetization to happen responsibly.
- Data monetization raises a significant number of privacy issues, which we will address below.
If your company has health data and it is looking for ways to monetize it, here are a few lessons learned about data monetization that can help you on your journey. This is not everything that you will need to know, but the insights may be useful to consider along the way:
(1) Understand the value of your data: Not all data are created equal.
The monetary value of a data set will vary. Some data custodians assume significant valuations and are surprised by what the market will bear. The valuation will depend on the type of data set. Some data set types are readily available in the market, and therefore their monetary value is not as high as you might expect. The workhorse in the health data market is claims data, as that has been available electronically for quite a few years. Another common type is pharmacy data. General pharmacy data has been available for quite a few years, and there is an abundance of data sources. However, certain pharmacy data sets may be more valuable to specific data consumers; for example, a pharmacy data set that provides very meaningful and timely competitive intelligence to a drug manufacturer is worth a lot to that manufacturer. In general, clinical data sets, especially disease-specific clinical data sets, are harder to come by and will therefore likely have higher valuations.
(2) Integrated data are more valuable than single domain data.
Data integrated from multiple domains are more valuable than data from a single domain. For example, pharmacy data combined with claims and clinical data would be much more valuable than a data set from any one of those domains individually. Integrated data provide a more complete picture of a patient’s journey in terms of breadth and depth, and allow for more sophisticated analytics. The business units understand this and will therefore want to monetize integrated data sets from their own units or across units. Of course, as more domains are integrated, the privacy risks also increase.
(3) Analytics are more valuable than data.
You need to think of data as raw materials. Refining these raw materials into more sophisticated end-products has much more value than the raw materials. This means that providing analytics or information products will be more valuable than monetizing the raw data. The type of analytics will of course depend on the target audience. These can range from simple benchmarking services to describing or predicting industry trends to more specific patient group or product analyses. Providing online capabilities for the end-users to perform their own analytics also can be very valuable because this allows the analysts to run their own customized queries on the underlying database.
(4) There are benefits to using public data.
More and more data is being made available publicly. An integration of only public data sets will in some cases be valuable by itself. Publicly available data can still be difficult to work with, for example because it needs to be formatted or coded to be useful. Being able to pull multiple data sets together can provide useful analytic capability to some users. Also, integrating public data with private data can enhance the value of the private data. For example, census data containing socioeconomic information about individuals can be linked with a clinical file at the ZIP code level to produce a larger, more complex data set. Adding public data does not necessarily increase the privacy risks, provided that the individual public data sets have been de-identified properly.
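The ZIP-level linkage mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records, the field names (`patient_id`, `zip`, `diagnosis`, `median_income`) and all values are made up, and the public census side is assumed to hold one aggregate row per ZIP code.

```python
# Hypothetical, made-up clinical rows for illustration only.
clinical_records = [
    {"patient_id": "P1", "zip": "10001", "diagnosis": "diabetes"},
    {"patient_id": "P2", "zip": "10002", "diagnosis": "asthma"},
    {"patient_id": "P3", "zip": "99999", "diagnosis": "diabetes"},
]

# Assumed public data: one aggregate row per ZIP code (values invented).
census_by_zip = {
    "10001": {"median_income": 65000},
    "10002": {"median_income": 48000},
}


def link_by_zip(clinical, census):
    """Left-join clinical rows to ZIP-level census attributes.

    Rows whose ZIP code has no census entry are kept, with the
    census fields set to None, so no clinical record is dropped.
    """
    census_fields = sorted({f for row in census.values() for f in row})
    linked = []
    for record in clinical:
        extra = census.get(record["zip"], {})
        merged = dict(record)  # copy so the input rows are not modified
        for field in census_fields:
            merged[field] = extra.get(field)
        linked.append(merged)
    return linked


linked_records = link_by_zip(clinical_records, census_by_zip)
for row in linked_records:
    print(row)
```

A left join is used here so that clinical records without a matching ZIP code are retained; whether that is the right choice depends on the analysis the linked file is meant to support.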
(5) Selling data is not the only way to monetize it.
Sometimes data can be bartered for services or other benefits. For example, we have seen hospitals provide their data sets in exchange for free benchmarking services. The data acquirer then has the option to monetize that data set in another way after providing the benchmarking service. But the hospital does not receive a direct financial gain from sharing the data. Therefore, “monetization” does not always require the exchange of money.
(6) Most analytics are quite simple.
Most analytics that are performed on data sets are quite simple, such as summary or descriptive statistics, cross-tabulations and basic bivariate relationships. The key point is that an analytics engine that sits on top of a data set needs to provide only simple statistics to meet the needs of the vast majority of data users. A smaller percentage of data users will want to apply more sophisticated data models to the underlying data.
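To give a sense of how little is involved, here is a rough sketch of two of the analytics named above, descriptive statistics and a cross-tabulation, using only the Python standard library. The claims rows, the field names (`age_group`, `drug_class`, `cost`) and the helper functions `describe` and `crosstab` are all hypothetical and exist only for this example.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical, made-up claims rows for illustration only.
claims = [
    {"age_group": "18-44", "drug_class": "statin", "cost": 120.0},
    {"age_group": "45-64", "drug_class": "statin", "cost": 150.0},
    {"age_group": "45-64", "drug_class": "insulin", "cost": 310.0},
    {"age_group": "65+", "drug_class": "insulin", "cost": 290.0},
    {"age_group": "65+", "drug_class": "statin", "cost": 130.0},
]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def crosstab(rows, row_key, col_key):
    """Count rows for each (row_key, col_key) combination."""
    return Counter((r[row_key], r[col_key]) for r in rows)


cost_summary = describe([c["cost"] for c in claims])
counts = crosstab(claims, "age_group", "drug_class")

print(cost_summary)
for (age_group, drug_class), n in sorted(counts.items()):
    print(age_group, drug_class, n)
```

The sketch runs directly on illustrative rows; in line with lesson (7), a real analytics engine would sit on top of data that has already been de-identified.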
(7) Defensible de-identification is critical.
Proper de-identification techniques need to be used to protect patient privacy. If raw data files are being shared, then they must be de-identified. If data is being analyzed through an online analytics engine, then the underlying data needs to be de-identified. Suppressing cells on the fly, adding noise on the fly or other on-the-fly masking is not going to be sufficient to protect privacy without imposing unacceptable constraints on the data users. Defensible de-identification is a necessary condition for monetizing data responsibly.
(8) Transparency is important.
The public is concerned about how their health data is being shared, and they tend not to be fond of others profiting from their personal data. Being transparent about how their data is being used and explaining the benefits and services that the data custodian can provide as a result helps to maintain patient trust. A hospital monetizing its data would be prudent to notify patients that their data is being shared (for example, to support clinical trials that test new drugs) and that this exchange also provides needed funds to support the services provided by the hospital. The data custodian may even impose certain conditions on data uses when sharing data.
(9) Consider restrictions on data use.
It is important to understand what the data uses will be. At the end of the day, the data pertains to your patients or customers, and it is important that these uses would not be surprising to them, stigmatizing or discriminatory. To make these determinations, a privacy ethics council should review the data acquirers' intended uses, and restrictions may need to be written into the agreements and contracts.
(10) Make data available for research.
Providing societal value back from the use of health data is also a good way to maintain patient trust. For example, a data buyer may provide access to consolidated data to qualified researchers for free or at a reduced rate in order to ensure that valuable work, which is seen as having a direct benefit to society, is performed with the data. This could even be one of the conditions imposed by the data custodian for providing the data.
Health data are very valuable and can provide many societal and commercial benefits. But cleaning the data, documenting it, de-identifying it and sharing it require resources that need to be financed in some way. Monetization of data can be an effective way to cover the associated costs of making health data available. But this needs to be done responsibly to protect individual privacy and ensure the maturation of the data market.