Make the right archiving choice

You need multiple storage technologies because you face multiple data protection challenges!


This article is part 2 in a series looking at the role of tape in the zettabyte era.

In my opinion, formed over more than twenty years in the data protection industry, today’s storage market resembles the aviation industry. If all we needed to do was fly from A to B as quickly as possible, only one type of aircraft would ever be required. In reality, there are so many different use cases and applications for air transport that we have developed flying machines of all shapes, sizes and capabilities!

Likewise, the imperatives that businesses must consider when protecting or recovering mission-critical data are completely different from the requirements they have for preserving less frequently accessed information or using it to make real-time decisions. And their data isn’t static: its volume is increasing, and its urgency and value are often variable. This makes it a challenging and amorphous asset to manage efficiently and cost-effectively. Data isn’t vapour, after all. It takes up space.

With our digital universe predicted to reach 175 ZB by 2025, a compound annual growth rate of about 40% from today, it’s hard to see how a single approach to storage can make sense. If you really need a helicopter, then self-evidently there’s no point deploying a jet fighter. I think the same is still true of tape and the various alternative storage technologies that have been held up as ‘tape replacements’. And I don’t see this changing over the course of the next decade.
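To make the growth figure concrete, here is a minimal Python sketch of the compound-growth arithmetic. The 175 ZB target and the 40% rate come from the paragraph above; the article does not state its baseline year, so the number of years is left as a parameter, and the function names are purely illustrative.

```python
def implied_volume(target_zb: float, cagr: float, years_before: int) -> float:
    """Volume (in ZB) `years_before` years ahead of the target year,
    assuming growth compounds at a constant annual rate `cagr`."""
    return target_zb / (1.0 + cagr) ** years_before


def implied_cagr(start_zb: float, end_zb: float, years: int) -> float:
    """Constant annual growth rate that turns `start_zb` into `end_zb`
    over `years` years."""
    return (end_zb / start_zb) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Walk back from the 2025 forecast at the quoted 40% rate.
    # The range of years is an assumption made only for illustration.
    for n in range(0, 6):
        print(f"{n} year(s) before 2025: {implied_volume(175.0, 0.40, n):.1f} ZB")
```

At a 40% rate, the volume one year before the target is 175 / 1.4 = 125 ZB, which means a single year adds roughly 50 ZB. That is the scale at which the choice of storage medium begins to matter.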

Deep backup isn’t archiving

Perhaps one of the reasons why opponents say tape is no longer relevant is because its primary role has changed. But change is not the same as obsolescence.

Twenty-three years ago, when I began my storage career, tape was the de facto standard for backup. In the early days of the internet, the notion of companies holding petabyte or exabyte quantities of digital data was little more than a futurist prediction. But since then, as data centres have changed out of all recognition, their data protection needs have transformed too. Data, and access to it at all times and everywhere, has reshaped not only how IT services are delivered but also how they are kept safe from disruption. That’s why we have all-flash primary storage and intelligent, de-duplicating secondary backup devices connected to the cloud: these modern backup technologies meet recovery point objectives (RPO) and recovery time objectives (RTO) that tape cannot match. But tape still has a trick or two up its sleeve.

If all we needed was backup…

As I said previously, the volume, urgency and value of data are not constant. Value, in particular, diminishes over time, whereas volume typically increases. It’s a bit like our own everyday lives. We all end up with far more stuff in the attic than we keep in our living room. But we’re usually not so bothered about keeping the loft space tidy because it’s out of sight and typically not needed in a hurry.

So urgency - and in this context, I mean having immediate access to a specific archive file - is generally very ‘point-in-time’ or application dependent. In some cases, it has to be fast, but in many others, it really doesn’t need to be.

Tape remains a superlative storage medium for retaining vast amounts of infrequently accessed yet essential data for a very long time. And by a ‘long time’, I mean decades if required. Unless you have a compelling need to have data immediately accessible at all times, tape is generally the better alternative when you weigh its advantages in scalability, durability and security (which relate to volume) and cost of ownership (which relates to value) against its disadvantages in speed, or time to data (which obviously relates to urgency).

Short term versus long term “archiving”

This touches on a vitally important point in the debate. Formerly, there was a distinction to be made between backup and archiving. But in my opinion, this distinction now needs revising to differentiate between (i) primary backup, (ii) short term archiving and (iii) long term archiving.

I described primary backup functions earlier, but ‘short term archiving’ is a particularly strong use case for the cloud because it allows businesses to extend RPO by storing backup data on low-cost, server-based object storage instead of using more expensive dedicated disk arrays. And it can achieve this with minimal impact on their RTO. An example of this would be HPE CloudBank for HPE StoreOnce.

Space prevents me from going into all the details of how these HPE disk and cloud solutions complement each other. Suffice it to say that the benefits of HPE CloudBank (like deep integration with the secondary disk array; instant, on-demand, scale-out flexibility; and speed of restoring backup data) would be hard to emulate using tapes in a vault. And that advantage probably holds for months and possibly a number of years if the data remains within the scope of being a target for backup. In a backup application, you may need instant, or near instant, access to random data. And if that is your need, you probably wouldn’t consider using tape in the first instance.

The key to this distinction lies in one qualifier: if the data remains within the scope of being a target for backup. Arguably, this “short term archiving” isn’t really archiving at all, and might be better referred to as “deep backup”. A backup copy, one of many and destined to be overwritten, is quite different from an archive copy, which is one of a kind and intended to be kept indefinitely. And as the importance of the data and the need for access fade, then based on currently available storage technologies, the pendulum still swings firmly back towards tape. And that statement holds true whether one is comparing tape to the public cloud (AWS, Azure, Google) or to private cloud solutions based on object storage systems.
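The three-way distinction above can be sketched as a simple decision rule in Python. This is only an illustration of the argument, not a description of any HPE product logic: the `DataSet` fields and the `suggest_tier` function are hypothetical names invented for this example, and real policies would weigh many more factors.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PRIMARY_BACKUP = "primary backup (flash or de-duplicating disk)"
    DEEP_BACKUP = "short term archiving, or 'deep backup' (cloud object storage)"
    LONG_TERM_ARCHIVE = "long term archiving (tape)"


@dataclass
class DataSet:
    name: str
    # Hypothetical flag: the data is still a target for backup, so its
    # copies are one of many and will eventually be overwritten.
    in_backup_scope: bool
    # Hypothetical flag: the data is still inside the retention window
    # of the primary backup device.
    within_primary_retention: bool


def suggest_tier(ds: DataSet) -> Tier:
    """Map a data set to the tier the article's argument points towards."""
    if not ds.in_backup_scope:
        # One-of-a-kind copy kept indefinitely: an archive in the true sense.
        return Tier.LONG_TERM_ARCHIVE
    if ds.within_primary_retention:
        return Tier.PRIMARY_BACKUP
    # Still backup data, but retained beyond the primary device.
    return Tier.DEEP_BACKUP


if __name__ == "__main__":
    for ds in (
        DataSet("last night's database backup", True, True),
        DataSet("backups from six months ago", True, False),
        DataSet("closed project records", False, False),
    ):
        print(f"{ds.name}: {suggest_tier(ds).value}")
```

The first test in the rule is whether the data is still in backup scope, because that is the qualifier the argument turns on. Only once data leaves that scope does it become an archive copy in the sense used here.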

The reason why I say there is a difference between short term archiving, and the extended periods for which I would consider tape, is similar to the one I was alluding to earlier on with my comparison of helicopters and jet fighters. Too often I see this backup-related use case simply referred to as ‘archiving’ when to my mind it’s a completely different task to the long term preservation of data. Short-term archiving to the cloud isn’t the same thing as long-term archiving to tape, and in my opinion, long-term archiving isn’t a strong use case for the cloud in the majority of cases. Tape still performs this role far better.

The longer your horizon - and businesses are forced by legislation and operational necessity to have very long horizons - the more effective tape becomes for reasons that I will consider in my next article.
