Pseudonymisation: the benefits and getting regulation right
Pseudonymisation bridges the gap between personal and anonymous data: it is personal data under EU law, but data that is at least difficult to link to a particular individual. The use of this type of data has, according to proponents, multiple benefits, but in the context of the draft EU Data Protection Regulation, the regulatory landscape for pseudonymisation is not yet fully decided. Victoria Hordern, Legal Director at Field Fisher Waterhouse LLP, describes the debate around regulating pseudonymisation, and argues that to disproportionately burden businesses seeking to make proper use of this kind of data could be counterproductive.
Using a pseudonym has come a long way since Currer, Ellis and Acton Bell, although to some extent the principle hasn't changed. The Brontë sisters didn't want to be immediately identified as Charlotte, Emily and Anne when they published their first book of poems, just as organisations using pseudonymising technology today don't want to identify the individuals within their data. Such organisations are not so much interested in actual individuals (their names etc.) as in human behaviour and patterns. Organisations involved in medical and social research, for instance, typically collect pseudonymised data in order to pioneer new techniques to prevent diseases and understand behaviour, not to assess the circumstances of particular individuals. The fraud prevention and online advertising industries also rely significantly on pseudonymous data to carry out their data processing activities and provide services to clients.
But what exactly is pseudonymisation? It's certainly not a new legal concept in the EU, as it's already recognised under German law, where pseudonymisation (or aliasing) means replacing the individual's name and other identifying features with another identifier in order to make it impossible or extremely difficult to identify the individual. But although the concept already exists under German law, it has provoked heated debate as part of the discussions around the draft EU Data Protection Regulation.
Most people think of pseudonymous data as a third category of data. You have (i) personal data, (ii) anonymous data and then (iii) pseudonymous data. Personal data is what it is (and we're not going to get stuck into that debate here) and anonymous data cannot be personal data. But pseudonymous data bridges both concepts. It is personal data (due to the broad definition of personal data under EU law) and yet it's not possible (or at least extremely difficult) to link the data to the particular individual to whom it relates. Consequently, there's a strong argument that the risk of a privacy-intrusive action impacting the individual is considerably lower.
Using pseudonymous data is commonplace for online businesses, which remove data that can single out a person, such as an IP address, and replace it with a machine-generated identifier. However, technology being what it is, even this activity does not guarantee non-identifiability. Increasingly, the power of technology allows disparate datasets which contain no identifiers to be linked up so that identities can be revealed. It all depends on how granular the data gets and how turbo-charged the algorithms are that make the connections. And yet it's possible to mitigate some of the risk of this occurring by putting in place safeguards, regular checks and processes so that the likelihood of identification across a big pool of data is reduced.
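For the technically minded, the replacement step described above can be sketched with a keyed hash: the secret key plays the role of the "additional information" that is kept separately, so that without it the original identifier cannot feasibly be recovered or re-derived, while the same input always maps to the same pseudonym, preserving the data's value for pattern analysis. This is purely an illustrative sketch, not a technique prescribed by the draft Regulation; the function name, key handling and truncation length are all assumptions.

```python
import hmac
import hashlib

def pseudonymise_ip(ip_address: str, secret_key: bytes) -> str:
    """Replace an IP address with a keyed pseudonym (HMAC-SHA256).

    The secret key is the 'additional information' that must be stored
    separately, under its own technical and organisational safeguards.
    Without the key, reversing or re-creating the mapping is
    computationally infeasible; with it, the mapping is deterministic,
    so behavioural patterns in the data survive pseudonymisation.
    """
    digest = hmac.new(secret_key, ip_address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability (assumption)

# Two records from the same IP receive the same pseudonym, so trends
# can still be analysed even though the raw address has been removed.
key = b"held-in-a-separate-secure-system"  # hypothetical key storage
p1 = pseudonymise_ip("203.0.113.42", key)
p2 = pseudonymise_ip("203.0.113.42", key)
p3 = pseudonymise_ip("198.51.100.7", key)
```

Note that a plain, unkeyed hash would not achieve the same separation: anyone could re-hash candidate IP addresses and rebuild the mapping, which is precisely the re-identification risk the article describes.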
There is widespread industry support for pseudonymisation given that it can help to minimise privacy risks and therefore reduce the potential for non-compliance. Furthermore, the availability of pseudonymisation techniques can encourage greater innovation. Data-rich organisations can crunch big data without the added risk of using identifiable personal data.
Pseudonymisation also benefits individuals. If you don't want to reveal your full identity online, you can retain a degree of anonymity. Additionally, an organisation deploying pseudonymisation restricts the types of data it can collect, thus reducing the risk of privacy intrusion for individuals. Using pseudonymisation is one way for an organisation to demonstrate that it takes the privacy implications for individuals seriously.
If there are clear industry and individual benefits to pseudonymisation, surely the regulatory landscape should incentivise its use? In other words, it should be easier not harder for organisations to use pseudonymous data. Regulation of pseudonymous data should therefore be pragmatic and not unduly restrictive. Which leads us back to the draft EU Data Protection Regulation and the proposed amendments to regulate pseudonymous data.
The EU Commission's original January 2012 draft of the Data Protection Regulation did not actually include a reference to pseudonymous data. It was introduced by the rapporteur for the EU Parliament's Civil Liberties, Justice and Home Affairs (LIBE) Committee, whose draft report on the amendments was published in December 2012. Suddenly pseudonymous data appeared as a defined concept on the EU-wide regulatory landscape, and the proposal was to regulate it with the full force of data protection legislation. Some in the scientific community were duly alarmed that these amendments would prevent or seriously impair scientific research studies due to the increased regulatory burden. They emphasised that the use of pseudonymised data in scientific research should be regulated proportionately.
Subsequently, the EU Parliament's LIBE Committee voted on a compromise draft text of the Regulation in October 2013, where the definition of pseudonymous data is now 'personal data that cannot be attributed to a specific data subject without the use of additional information, as long as such additional information is kept separately and subject to technical and organisational measures to ensure non-attribution.' This partly echoes the view of the European Data Protection Supervisor (EDPS), who has stated his preference for a definition of pseudonymous data that builds on three elements: (i) the information relates to a natural person who can be identified directly or indirectly, (ii) the means that may be used to identify the person are effectively separated from the data, and (iii) identification by unauthorised persons is effectively prevented.
Significantly, whereas the EDPS warns against exempting pseudonymous data from core data protection principles, the LIBE Committee's October 2013 text provides that if a controller is unable to comply with a provision of the Regulation because it is processing pseudonymous data, the controller is not obliged to comply with that particular provision. (This point may yet be tweaked, given that the President of the Council's latest comments on pseudonymisation indicate that a controller cannot refuse to honour an individual's rights where the individual himself provides the additional information enabling identification.) Additionally, the October 2013 text includes a new recital stating that profiling based solely on the processing of pseudonymous data is presumed not to significantly affect the interests, rights or freedoms of the individual, with the implication that an organisation can rely on the legitimate interest ground to process that data. The October 2013 text therefore marks a more flexible approach towards pseudonymisation than has been seen so far. This has prompted some privacy activists to argue that the safeguards are insufficient while, on the other hand, some in the online advertising industry still consider this latest draft to be unworkable in its current form and lacking legal certainty.
One hopes that within the trilogue (the informal discussions between the EU Parliament, Council and Commission about the proposed legislation), the EU institutions retain some perspective. Regulation should be proportionate to the potential risk. Although re-identification from pseudonymous data may be technically possible, where robust operational and contractual conditions have been established to minimise the opportunity for re-identification, this should be acknowledged and incentivised. It would be counterproductive for the law to impose disproportionate burdens on organisations using data in a way that poses a very limited risk of privacy intrusion.
EU Commissioner Neelie Kroes appears to appreciate this argument. She recently gave her backing to organisations that wish to rely on legitimate interest rather than consent when using pseudonymised data. In her view, legitimate interest should be the available ground for processing pseudonymous data so long as organisations remain accountable and still ensure that their internal processes and risk assessments comply with the guiding principles of data protection law. Big data proponents will further argue that where an organisation analyses pseudonymous data to draw out trends and there is no detrimental impact on individuals or intention to single someone out, it is disproportionate to require an organisation to comply with the full remit of data protection compliance obligations. Clearly a balance will need to be struck in order for big data processing to be incentivised under EU rules.
Neelie Kroes may be receptive to these arguments, but it remains to be seen whether other key movers in the trilogue negotiations will accord pseudonymous data a lighter-touch regulatory approach. In any event, technology marches on even if the timetable for the new Regulation is now delayed. Pseudonymous data will become more and more widely used, and providing a proportionate, flexible, yet privacy-conscious framework for the use of this third category of data will therefore be crucial.