The avalanche of COVID-19 applications developed to manage the pandemic has sparked debate over the balance between public interest and the basic human right to privacy. COVID-19 apps, such as digital contact-tracing apps and statistical data analysis tools that help identify patterns that could lead to cures or prevention measures, have understandably led to mixed emotions of excitement and concern. There is excitement that experts could process the data to identify ways to curb the virus, and concern that the apps could be used for surveillance or reveal sensitive personal data.
The development of the apps with an acceptable level of data protection is probably the second most important invention related to COVID-19, the first being vaccines. Unprecedented events have occurred at different levels in the development of these apps. For example, at the technical level, Google and Apple joined forces to develop a solution instead of competing against each other.
At the political level, some countries have used contact-tracing apps for monitoring those who break confinement laws, while some have expressed serious concerns over privacy invasion. This map shows where the countries across the world were in terms of contact tracing as of October 2020.
While technical challenges are easier to overcome, the key challenge lies in ensuring information protection and subsequently assuring society that the apps are indeed tracing the virus and not the people.
There is an underlying cryptographic concept called “homomorphic encryption” (HE) that has been around since the 1970s. If implemented in its full form, HE could simplify and shorten the debates around public interest versus privacy. The aim of this article is to explain HE with relatable, day-to-day examples. After all, background knowledge of cryptographic schemes can empower data subjects to make informed decisions when entrusting apps with their personal data.
What is homomorphic encryption?
Encrypting data is similar to placing it in a box and locking it so that it can only be opened (i.e., decrypted) by those that hold the key to the box. Encrypted data is also referred to as ciphertext.
Using the box analogy, encrypting data at rest is when the box is locked and placed in a storage facility, while encryption in transit is moving the box from one facility to another in an armored truck. Modern cryptography has done a great job in protecting data at rest and in transit using different encryption algorithms with varying levels of strength (similar to how the strength of each box and padlock may differ).
Encrypting data in use, also referred to as encrypting data during computation, represents the ability to use the data without having to take it out of the box. Imagine how much easier it would be, with this feature, to perform an important calculation needed to assess the risks of COVID-19 without revealing the personal data of individuals.
HE in its current partial form is achievable and has been in use for decades. By partial form, we mean that some calculations can be performed on encrypted data while others cannot. To elaborate, a partially homomorphic scheme supports either addition or multiplication on ciphertexts, but not both. For more complex equations that mix the two, keeping the data encrypted end to end while in use is not possible. This means we may need to decrypt the data at known points of the calculation.
While privacy risks from decrypting and re-encrypting data could be addressed through compensating mitigations, these measures will not be as privacy-preserving as fully homomorphic encryption (FHE), which allows complex calculations to occur end to end on encrypted data. FHE was barely a dream until Craig Gentry made it a reality in 2009. While FHE is not in practical use today, the research and developer communities are making progress toward a practical solution.
Below are some simple use cases of HE in partial and full form:
- Additive HE. A university is performing statistical analysis on COVID-19 infections and needs to know the number of patients who died from the virus across the country. The university requests all hospitals to encrypt the number of COVID-19 deaths in their hospital. The university retains the key for decryption.
Each hospital encrypts its number, and then all hospitals jointly perform the addition of the encrypted numbers to create one ciphertext, which is the encryption of the sum of all the values. They send the result to the university, which in turn decrypts it to get the total number of COVID-19 deaths, without learning the number of fatalities per hospital and without one hospital learning the number of another.
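The additive case above can be sketched with a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. This is only an illustration under stated assumptions: the article does not name a scheme, the primes are far too small to be secure, the per-hospital counts are made up, and the function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`) are hypothetical.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    """Toy Paillier key generation. The tiny primes are an assumption for
    readability; real deployments use primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts yields an encryption of the sum."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

# The university generates the keys and publishes only n.
n, priv = keygen()

# Each hospital encrypts its own (made-up) count with the public key.
hospital_counts = [12, 7, 30]
ciphertexts = [encrypt(n, count) for count in hospital_counts]

# The hospitals combine ciphertexts without ever seeing each other's numbers.
aggregate = ciphertexts[0]
for c in ciphertexts[1:]:
    aggregate = add_encrypted(n, aggregate, c)

# Only the university, holding the private key, can read the total.
total = decrypt(n, priv, aggregate)
print(total)
```

Note that the sum must stay below `n` for decryption to return the true total, which is one reason real parameters are much larger.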
- Multiplicative HE. A commune wants to calculate the risk score of its region to plan for the needed medical resources. To do so, the commune needs to multiply different risk values together. Risk values can come from different sources (e.g., medical history of residents from hospitals, location data from mobile providers, sick leaves from companies). The commune asks each source to encrypt its risk values. The commune retains the key for decryption.
All sources jointly perform the multiplication of their encrypted risk values to create a single ciphertext. This encrypted product is sent to the commune to decrypt. The process allows each source to know only its own risk value and the commune to learn only the overall risk score, without learning the individual risk values.
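The multiplicative case can be sketched in a similar way with a toy version of ElGamal encryption, whose ciphertexts can be multiplied component by component. Again, this is an assumption-laden illustration rather than what any real app does: the modulus and base are chosen for readability, the risk values are made-up integers (fractional scores would first need an integer encoding), and the function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime; toy-sized modulus, not a secure choice
G = 3          # base picked for illustration only

def keygen():
    x = secrets.randbelow(P - 2) + 1  # private key in [1, P-2]
    return pow(G, x, P), x            # (public key h, private key x)

def encrypt(h, m):
    """Encrypt an integer 1 <= m < P under the public key h."""
    r = secrets.randbelow(P - 2) + 1  # fresh randomness per ciphertext
    return pow(G, r, P), (m * pow(h, r, P)) % P

def multiply_encrypted(a, b):
    """Component-wise product encrypts the product of the plaintexts."""
    return (a[0] * b[0]) % P, (a[1] * b[1]) % P

def decrypt(x, c):
    c1, c2 = c
    shared = pow(c1, x, P)
    return (c2 * pow(shared, P - 2, P)) % P  # divide by shared (Fermat inverse)

# The commune generates the keys and publishes only h.
h, x = keygen()

# Each source encrypts its own (made-up) integer risk value.
risk_values = [3, 4, 5]
ciphertexts = [encrypt(h, v) for v in risk_values]

# The sources combine ciphertexts without revealing their inputs.
combined = ciphertexts[0]
for c in ciphertexts[1:]:
    combined = multiply_encrypted(combined, c)

# Only the commune can decrypt the overall risk score.
risk_score = decrypt(x, combined)
print(risk_score)
```

The same scheme cannot add two encrypted values, which is exactly the "addition or multiplication, but not both" limitation of partial HE.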
- Fully HE. The university wants to know the percentage of COVID-19 patients who died after being admitted to the hospital. The university asks all hospitals to encrypt the number of COVID-19 deaths and the number of COVID-19 patients admitted. The university retains the key for decryption.
The hospitals will perform the additions of the encrypted values to get the encryption of the total deaths and total patients. These encrypted values would then need to be divided to be able to calculate the percentage. FHE can enable the end-to-end calculation to take place without having to decrypt the data in between. Eventually, the university will be able to decrypt the overall percentage.
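A toy FHE scheme is beyond the scope of a short example, but the contrast with partial HE can be sketched. The snippet below is not FHE: it assumes a toy additive scheme (Paillier with insecure, tiny primes), made-up hospital figures, and hypothetical helper names. The two sums are computed on encrypted data, but the division has to happen after decryption, so the university learns both totals and not just the percentage. FHE would remove that intermediate decryption step.

```python
import math
import secrets

# Toy Paillier parameters (an assumption for illustration; not secure).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private
MU = pow(LAM, -1, N)                               # private

def enc(m):
    """Encrypt an integer 0 <= m < N with the public modulus N."""
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2

def dec(c):
    """Decrypt with the private values LAM and MU (held by the university)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def encrypted_sum(ciphertexts):
    """Homomorphic addition: multiply ciphertexts modulo N^2."""
    total = enc(0)
    for c in ciphertexts:
        total = total * c % N2
    return total

# Made-up (deaths, admissions) figures, one pair per hospital.
reports = [(12, 200), (7, 150), (30, 400)]

# Hospitals aggregate both quantities while everything stays encrypted.
enc_deaths = encrypted_sum(enc(d) for d, _ in reports)
enc_admitted = encrypted_sum(enc(a) for _, a in reports)

# Known decryption point: an additive scheme cannot divide ciphertexts,
# so the university decrypts both totals and divides in the clear.
total_deaths = dec(enc_deaths)
total_admitted = dec(enc_admitted)
percentage = 100 * total_deaths / total_admitted
print(f"{percentage:.2f}%")
```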
FHE is a very powerful tool for privacy-preserving outsourced computation. If FHE could be used, the design of a COVID-19 tracing or statistics app could be much simpler, triggering fewer debates. This form of encryption is, however, not used practically today due to the computational power required.
Nevertheless, major IT companies, such as IBM, have released toolkits for developers to build programs based on FHE, and startups, such as Zama, are producing groundbreaking results in the field. Hopefully, it will not be too long a wait until a feasible and scalable solution is available. Until then, we have partial HE, which can at least partly help achieve the goal of protecting data in use. To us, this is a glass half full rather than half empty.
Photo by Markus Spiske on Unsplash