LTO Ultrium, GDPR & Article 17 - the right to be forgotten and tape
One of the more challenging aspects of the General Data Protection Regulation (GDPR) for IT groups responsible for backup and archiving concerns the data subject rights articulated in Chapter 3: in particular Article 17, the right to erasure, often referred to as the “right to be forgotten”.
Before I discuss this topic further, I want to let you know that I’m a storage professional, not a lawyer. So although I feel relatively well informed on this subject, please don’t use or regard this blog as professional legal advice. And, as always, the opinions expressed here are my own, not those of my employer.
Although the GDPR itself doesn’t mention “backups” specifically, the definition of “processing” in Article 4 explicitly includes both storage and retrieval. So it seems clear that the GDPR means to include backup and archiving systems within its scope.
Article 17 is challenging for backup admins for two reasons. First, although it’s relatively straightforward to delete data from live production environments, many backup and archiving solutions were not built with the right to erasure in mind: they were primarily designed to keep copies of data, not to make it easy to remove data. That means a lot of personal information retained in older backup sets could date back to a time when businesses had neither the tools nor compelling regulatory or commercial reasons to deploy systems capable of managing individual files.
Second, because these systems were not built with this purpose in mind, there may also be challenges both in finding the right information and then in erasing it from the backup or archive without corrupting or damaging the entire dataset. This is particularly relevant to GDPR compliance because the regulation has introduced clear timescales for responding to data subject requests - in the case of the right to erasure, within one month of receipt.
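To make the one-month timescale concrete, here is a minimal sketch of how a backup team might work out a response date for a request. It assumes the deadline is the corresponding calendar date in the following month, falling back to the last day of that month when no such date exists. That rule, and the function name erasure_deadline, are my assumptions for illustration only; the sketch ignores weekends, public holidays and any permitted extension, so check the exact calculation with your legal team.

```python
import calendar
from datetime import date


def erasure_deadline(received: date) -> date:
    """Return the corresponding calendar date one month after receipt.

    Assumption (not taken from the regulation's text): if the following
    month has no corresponding day, use the last day of that month.
    """
    # Roll over into January of the next year when received in December.
    year = received.year + (received.month // 12)
    month = received.month % 12 + 1
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(received.day, last_day))


if __name__ == "__main__":
    print(erasure_deadline(date(2019, 1, 31)))
```

Under these assumptions, a request received on 31 January 2019 would give a response date of 28 February 2019.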
As a consequence, there was a great deal of discussion when the GDPR came into force in May 2018 about how companies would comply with Article 17 requests. The storage analyst, W. Curtis Preston, summed up the situation in an article earlier this year:
But I also don’t think we’re talking about technically difficult; we’re talking technically impossible. After 25 years of experience in this field, I can easily say it is technically impossible to erase data from inside a structured database inside a backup without corrupting the backup. Almost all backups are image based, not record based. Even if you could identify the blocks pertaining to a given record inside a database, deleting those blocks would corrupt the rest of the backup. You literally cannot do that.
Since then, however, the UK’s Information Commissioner’s Office (ICO) has issued guidance that may help clarify the regulation for IT chiefs and their teams.
Firstly, in response to the basic question: do we have to erase personal data from backup systems?
Here, the ICO says:
“If a valid erasure request is received and no exemption applies then you will have to take steps to ensure erasure from backup systems as well as live systems. Those steps will depend on your particular circumstances, your retention schedule (particularly in the context of its backups), and the technical mechanisms that are available to you.”
It appears, therefore, that the obligation to erase data extends to both live systems and nearline / offline backups. The ICO instructs companies to be transparent with individuals about what will happen when their erasure request is fulfilled, including in respect of backup systems.
So what if the data cannot easily be retrieved or erased from a backup or archive?
Then the ICO continues:
“It may be that the erasure request can be instantly fulfilled in respect of live systems, but that the data will remain within the backup environment for a certain period of time until it is overwritten. The key issue is to put the backup data ‘beyond use’, (my emphasis) even if it cannot be immediately overwritten. You must ensure that you do not use the data within the backup for any other purpose, i.e. that the backup is simply held on your systems until it is replaced in line with an established schedule. Provided this is the case it may be unlikely that the retention of personal data within the backup would pose a significant risk, although this will be context specific.”
This appears to suggest that so long as the data in the backup is beyond use and remains inaccessible for any kind of operational purpose or processing, retaining it will be regarded as posing a low risk of breaching the GDPR (although note the caveat “this will be context specific”). Context, for example, could mean considering the security of the backup location, who has access to the backup and whether the data is encrypted. I think this is something that your legal team would need to advise upon.
But in my opinion, backups or archives stored on tape are potentially strong safeguards in respect of putting data ‘beyond use’ because data on tape is separated and kept offline from the rest of the network (and from unauthorised persons) by means of an ‘airgap’. With LTO Ultrium, all data can be encrypted using powerful AES-256 technology, which makes it almost impossible for personal information to be compromised, even if the tape were to be lost or stolen.
To further help businesses understand what it means by “beyond use”, the UK ICO directs audiences to its guidance under the old UK Data Protection Act (1998), which the GDPR has now superseded. At the time of writing, the ICO acknowledges that these guidelines have not been updated to reflect the GDPR.
“The following information has not been updated since the Data Protection Act 2018 became law. Although there may be some subtle differences between the guidance in this document and guidance reflecting the new law – we still consider the information useful to those in the media. This guidance will be updated soon to reflect the changes.”
Although one might think that in directing audiences to its earlier advice, the ICO is signalling it would exercise similar judgement today, I think this is something worth checking with your legal team as there is currently no guarantee that this would be true in every present circumstance - e.g. “this will be context specific”.
But in its guidance for the 1998 Act, the ICO said:
“will be satisfied that information has been ‘put beyond use’, if not actually deleted, provided that the data controller holding it:
- is not able, or will not attempt, to use the personal data to inform any decision in respect of any individual or in a manner that affects the individual in any way;
- does not give any other organisation access to the personal data;
- surrounds the personal data with appropriate technical and organisational security; and
- commits to permanent deletion of the information if, or when, this becomes possible.”
Provided that these four safeguards are in place, the ICO was previously satisfied data would be ‘beyond use’. An encrypted LTO Ultrium tape, held securely in a backup vault with strict, limited access controls, does appear to meet these criteria and so might not need to be overwritten immediately following a right to be forgotten request under Article 17.
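For teams that want to record how a particular backup copy measures up against these four safeguards, the criteria can be captured as a simple checklist. The sketch below is only a record-keeping aid under my own assumptions: the class name BackupCopyStatus, its field names and the function unmet_safeguards are all hypothetical, and passing the check is in no way a legal determination that data is ‘beyond use’.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopyStatus:
    """Hypothetical record of one backup copy, one field per safeguard."""
    informs_decisions: bool      # is the data used in decisions about individuals?
    shared_externally: bool      # can any other organisation access the data?
    appropriately_secured: bool  # technical and organisational security in place?
    deletion_committed: bool     # commitment to delete when it becomes possible?


def unmet_safeguards(status: BackupCopyStatus) -> List[str]:
    """List the safeguards from the 1998-era guidance that are not satisfied."""
    gaps = []
    if status.informs_decisions:
        gaps.append("data is used to inform decisions about individuals")
    if status.shared_externally:
        gaps.append("another organisation has access to the data")
    if not status.appropriately_secured:
        gaps.append("appropriate security is not in place")
    if not status.deletion_committed:
        gaps.append("no commitment to permanent deletion")
    return gaps


if __name__ == "__main__":
    vaulted_tape = BackupCopyStatus(False, False, True, True)
    print(unmet_safeguards(vaulted_tape))
```

An empty list means none of the four safeguards was recorded as missing; anything else is a prompt to talk to your legal team before relying on the ‘beyond use’ position.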
But as I quoted earlier, the ICO still assumes erasure will occur at some point because this ‘beyond use’ exemption requires:
“that the backup is simply held on your systems until it is replaced in line with an established schedule.”
This seems to imply that the data would need to be erased eventually, even if this wasn’t to occur in the first 30 days. Whether the status of ‘beyond use’ has an expiry date or can be regarded more as ‘for as long as may be necessary’ is not clear.
And in my opinion, the ICO has not yet fully addressed the question of archives because the purpose of an archive, as opposed to a backup, is to retain a primary copy of data for compliance or other commercial requirements. An archive is a collection of records that are kept for long-term retention and used for future reference. Tape is an ideal storage medium for this purpose because magnetic tape is very durable and doesn’t need additional power and cooling to be maintained in an offline state. Typically, information in an archive will be the only copy of that data. So with an archive, you typically would not have an established schedule for overwriting data as the records are intended to be preserved without frequent access or modification.
If your archive data meets the ‘beyond use’ criteria, and is not being used for any other purpose inside your organisation, it may be that the ICO would apply the same threshold, recognising that data may not be instantly erased but may live on for an indeterminate period of time until it can be overwritten. But this is something that you should check with your legal team.
In conclusion, the ICO has gone quite a long way towards explaining the obligations companies may have in respect of Article 17 requests and how these relate to backup and archive data. But we are still waiting for additional clarification (or actual case law) in respect to how the Regulation will be applied in 2018. The signposts to earlier guidance are helpful but they still leave some unanswered questions in my opinion, especially around archive datasets. In the meantime, as I have written elsewhere, LTO Ultrium tape is still a viable storage medium for helping organisations address their GDPR compliance requirements in the round.
Disclaimer: In writing this article, I am only expressing my opinion as an individual who works in storage and has considered the GDPR in the context of backup and archiving. These are my own views and opinions and do not reflect the official position of my employer, Hewlett Packard Enterprise. Please do not regard or use this blog post as a substitute for professional, qualified legal advice.