To keep or not to keep: data retention challenges and solutions

30/07/2018

Every organisation knows that they should not keep data for longer than is "necessary", but determining what this means in practice can be very challenging. This blog discusses some of those challenges and the principles that should guide the creation of any data retention program.

When you ask someone a question that has no clear answer, you’ll often receive the response: “how long is a piece of string?” It’s a response that privacy professionals may be tempted to give when asked “how long can we retain data for?” Those not steeped in the detail of privacy regulation may feel that this question should have a simple, straightforward answer, expressed in a matter of months or years; the reality is far more complicated than that.

It’s been a longstanding principle of European data privacy law that data should be held for “no longer than is necessary”. The GDPR does not specify exact data retention timescales, and the reason for this - when you stop to think about it - is obvious: the periods for which you can justifiably keep data are necessarily context-specific. Ask a regulator what “no longer than is necessary” means, and they’ll tell you that you should process data only for the period of time needed to execute the purpose for which it was originally collected. The problem is that this assumes that each data element is collected only for a single purpose (or perhaps a small number of discrete purposes), and that this purpose was immediately apparent at the outset. This is seldom the case.

Let’s take a simple example: say you operate a B2C retail website and collect somebody’s data for order fulfilment purposes. Someone places an order, you fulfil the transaction, and deliver the goods. Has the ‘necessity’ of the data processing been exhausted, meaning that the data should be deleted? What if some time later that individual were to dispute the transaction - perhaps claiming you overcharged, dispatched the wrong item, or delivered fewer than the number of goods ordered? If that happens, you will likely need the data to respond to the dispute and, potentially, defend any subsequent litigation. Or perhaps there’s a product recall, and you need to e-mail customers who bought those items in order to communicate the recall. Or maybe you just need to keep the data for accounting or tax reasons. Each of these scenarios presents a potential ‘necessity’ to retain the data, and each of them has a potentially different retention period associated with it - ranging from the very short (delivery of the goods) to the very long (statutory limitation periods for litigation).

The issue becomes further complicated if you operate across borders. Taking the same example, imagine that the business decided to retain data for as long as is “necessary” to defend potential future legal claims - this means retaining the data for the duration of the relevant statutory limitation period (for the non-lawyers among you, that’s the maximum period of time specified by law within which an organisation can be sued). But statutory limitation periods are set by national laws - meaning that if you do business across multiple jurisdictions (and there are currently 31 countries in the European Economic Area), you need to understand the different statutory limitation periods in each of those countries and set country-specific retention periods accordingly.

The above example covers only one set of data, though. In addition, and to the extent they comprise personal data, you also have to determine retention periods for your corporate records, your HR records (current employees, ex-employees, contingent workers, and candidates), your finance records, your real estate records, your wider customer records, your marketing and sales records, your procurement records and so on. It’s understandable that organisations will turn to their privacy professionals thinking they will have an ‘off the shelf’ solution, but this is wishful thinking. Instead, creating a comprehensive data retention program requires an intimate understanding both of the data and its uses, and also of the relevant laws, regulations and risks that affect the business and that may mandate specific retention periods. Getting to a ‘perfect’ answer will likely entail extensive (and expensive) consultation with internal stakeholders and external experts in each of the territories concerned, and will result in a lengthy, granular, record-by-record, country-by-country data retention schedule that, ultimately, risks being too complex for anyone to use in practice. The flaw in the pursuit of perfection, if you will.
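
The multi-purpose, multi-country problem described above can be sketched as a simple lookup table. The following is a minimal illustration in Python, assuming that a record may be kept until the longest of its documented purposes has expired. The country labels, purpose names, and day counts are hypothetical placeholders chosen for illustration, not actual legal retention periods.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by (country, purpose).
# These figures are placeholders for illustration only, not legal advice.
RETENTION_DAYS = {
    ("COUNTRY_A", "order_fulfilment"): 90,
    ("COUNTRY_A", "tax_records"): 365 * 7,
    ("COUNTRY_A", "legal_claims"): 365 * 6,
    ("COUNTRY_B", "order_fulfilment"): 90,
    ("COUNTRY_B", "legal_claims"): 365 * 3,
}


def retention_deadline(collected_on, country, purposes):
    """Return the last date on which any documented purpose still applies."""
    periods = []
    for purpose in purposes:
        key = (country, purpose)
        if key not in RETENTION_DAYS:
            # Refuse to guess: an undocumented purpose has no justified period.
            raise ValueError(f"No documented retention period for {key}")
        periods.append(RETENTION_DAYS[key])
    if not periods:
        raise ValueError("A record needs at least one documented purpose")
    # The record is kept for as long as the longest-running purpose requires.
    return collected_on + timedelta(days=max(periods))


def is_due_for_deletion(collected_on, country, purposes, today):
    """True once every documented purpose for the record has expired."""
    return today > retention_deadline(collected_on, country, purposes)
```

Raising an error for an undocumented purpose is a deliberate design choice in this sketch: it forces each retention period to be written down and justified before data is kept on that basis.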

Given that, what do you do? To my mind, there are a few key principles that should guide the creation of any data retention program:

First, recognise that you need a data retention program. Some organisations, faced with the complexity of establishing a data retention program, may choose simply to ignore or postpone the problem. That’s not a good response: holding on to your data indefinitely is legally indefensible in the face of a regulatory or legal challenge. Further, if you experience a data security incident or receive a data subject rights request (e.g. a subject access request), then you will be sitting on top of an awful lot more affected data - and the resultant risks, costs, and negative PR in responding to the incident or request will be substantially greater. Recognising that you need a data retention program, and getting to work as soon as possible, is therefore imperative.
Second, don’t let the great be the enemy of the good. Any data retention program, even if imperfect, is better than no data retention program. You can think of possible approaches to data retention as sitting on a spectrum - with indefinite data retention (not good) sitting at one end of the spectrum, and a perfect, comprehensive, granular program sitting at the other end. The reality for most organisations is that they will sit somewhere in the middle of that spectrum, and chasing down a near-impossible goal of perfection only risks delaying getting a ‘good enough’ program off the ground. With that in mind, decide whereabouts on the spectrum you intend to sit and design your program and retention timescales accordingly. For example, do you apply a single, one-size-fits-all retention period across all of your customer data records, or will you have distinct retention periods for different classes of customer data - e.g. customer profile data, marketing data, transaction data, analytics data and so on? The same issue arises across countries - do you apply country-specific retention periods, apply retention periods across groups of countries (e.g. EMEA, North America, LATAM etc.), or apply one-size-fits-all retention periods across all countries?
Third, if you need to retain some data for particularly lengthy periods of time (e.g. for product improvement or machine learning), then consider anonymising the data first. Remember that data protection laws - and so the requirement to retain data for “no longer than is necessary” - apply only to personal data. Data which is not personal falls outside of data protection law and so, in principle, can be retained indefinitely. Anonymisation throws up its own challenges, especially given European data protection authorities’ strict views on what qualifies as effective anonymisation, but for many organisations it is often more achievable than full deletion. If anonymisation is not feasible, then at least consider pseudonymisation (e.g. hashing). Pseudonymous data still technically qualifies as personal data and so should not be processed for longer than “necessary”, but it is also inherently less intrusive than ‘ordinary’ data - so pseudonymising data may help to lower the risk of longer retention periods.
Finally, make sure you have at least some concrete justifications for why you keep data for the periods you do, rather than a vague “because it might be useful some day” type argument. Even with these justifications, your view of why it is “necessary” to keep data will likely not always align with what a regulator, data subject, or court considers “necessary”, but it at least goes some way to showing you put considered thought into your retention program. Remember, too, that data protection authorities face exactly the same challenges you do in determining what are acceptable retention periods, so having well thought-out justifications that you can articulate is likely to make for a better dialogue with them if this ever proves necessary. Conversely, not having these considered justifications only suggests sloppy data governance, and that’s never a good position to be in - even less so in the event of a complaint or investigation.
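
To make the pseudonymisation point in the third principle more concrete, here is a minimal sketch in Python. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, on the assumption that identifiers such as e-mail addresses are easy to guess and a plain hash of them could be reversed by simply hashing candidate values. The function name and the key handling are hypothetical; in practice the key would need to be stored separately from the pseudonymised data and access to it controlled.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex-encoded)."""
    # Normalise first so the same person always maps to the same pseudonym.
    normalised = identifier.strip().lower()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical usage: the key below is a placeholder and must never be
# hard-coded in a real system.
example_key = b"replace-with-a-securely-stored-secret"
pseudonym = pseudonymise("Jane.Doe@example.com", example_key)
```

Because anyone holding the key can recompute the mapping, the output is still personal data, which is consistent with the point above that pseudonymous data remains subject to the “no longer than is necessary” requirement.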

All in, creating a data retention program is certainly a significant challenge for any organisation, but by keeping the above principles in mind it is a manageable one. So don’t delay - get cracking on yours today!