
When bad things happen to good data

Disaster recovery costs corporations on average between $100,000 and $1 million a year for desktop-oriented disasters alone. When you add it up, the storage market could grow to more than US$71 billion by 2004.



To the PhD student who came in desperation to Markham, Ont.-based CBL Data Recovery, it was a corrupt floppy disk containing what had been the only copy of an all-important thesis. To the west coast defence contractor, it was an un-backed-up, unbootable RAID array. In magnitude the two were, to the observer, light-years apart, yet in both cases the data was priceless.

According to ICSA Labs, an independent division of managed security supplier TruSecure Corp., the average company spends between $100,000 and $1 million per year, in hard and soft costs, for desktop-oriented disasters alone. Add to that the cost of server downtime, whose price after 72 hours is often the loss of an entire business (according to U.S. government statistics), and you have a big problem.

IDC expects the worldwide storage market to grow to US$71 billion by 2004. Even with $33 billion of that dedicated to software and services, that still means $38 billion in hardware to fill with usually irreplaceable data.

And that means larger recoveries when things do go wrong. Mike Briand, manager of business development at CBL Data Recovery, says the projects he sees are large now, and getting larger, which presents new challenges to recovery specialists. “As we get into larger systems, piecing data together presents problems,” he noted. “The people in the lab have to work magic to get things back the way they should be.”

Of course, companies need not resort to such drastic measures to retrieve their data. All they need to do is restore a backup — if they have one.

“It’s an area of best practice,” said Marco Coulter, vice-president, BrightStor strategy, for Computer Associates. “We have spent a lot of time in the past few months talking to customers, and telling them ‘Don’t trust your backups.’” Coulter recommends that every few months, customers randomly pick a backup tape, choose a file on it, and restore it from a randomly chosen device, just to prove that everything is working properly.
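
Coulter’s spot check lends itself to a small script. The sketch below is illustrative only and is not something Computer Associates describes: it assumes each backup set is reachable as a directory, that a manifest of SHA-256 checksums was recorded when the backup was written, and it uses a plain file copy as a stand-in for the real restore command. The names `spot_check_restore` and `manifest` are hypothetical.

```python
import hashlib
import random
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check_restore(backup_sets, manifest, rng=None):
    """Restore one randomly chosen file from one randomly chosen backup set.

    backup_sets: list of directories, each standing in for one backup tape
                 (an assumption; real tapes need a real restore tool).
    manifest:    dict mapping "<set name>/<relative path>" to the SHA-256
                 recorded at backup time (hypothetical format).
    Returns (key, ok), where ok is True if the restored copy matches.
    """
    rng = rng or random.Random()
    backup_set = Path(rng.choice(backup_sets))
    files = sorted(p for p in backup_set.rglob("*") if p.is_file())
    if not files:
        raise ValueError(f"no files found in backup set {backup_set}")
    chosen = rng.choice(files)
    key = f"{backup_set.name}/{chosen.relative_to(backup_set).as_posix()}"

    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / chosen.name
        shutil.copy2(chosen, restored)  # stand-in for the real restore step
        ok = sha256_of(restored) == manifest.get(key)
    return key, ok
```

Run on a schedule, a `False` result, or an error because the chosen set cannot be read, is the cue to investigate before the backup is actually needed.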

Some companies are getting the message, said Fred Dimson, general manager of Veritas Canada. “We’re seeing (as a result of 9/11) for the first time people are actually testing recoveries. They’re seeing what they can recover, and how long it will take. It’s getting a lot more attention than it did two years ago.”

Jim Lee, vice-president, product marketing at data management software vendor Princeton Softech, agrees. “This past year has caused many companies to review their plans and actually test them.”

On the other hand, because backups may cut into application availability, “they may, at times, choose to keep the system up and continue to generate revenue rather than performing the backup. This is a tough decision, and can be costly either way. While analyzing the cost of downtime, every minute is money and this becomes like Russian roulette, do I pay now or pay later?”

When the cylinder spins and the trigger clicks in that roulette game, the result can be costly. Briand says CBL gets a couple of RAID systems for recovery each week. “It should never happen, but it does,” he said. “Frequently the cause is user error. A disk may have failed, and no-one noticed. When the final disk goes down, it gets noticed.”

In the case of the defence contractor rescued by CBL, a combination of human and software errors led to the catastrophe with its RAID array. A software upgrade went awry, and vendor technicians attempting to fix things only made the situation worse.

It took CBL technicians several very long days to put Humpty Dumpty together again, and CBL company president Bill Margeson said the customer learned a lesson. When his team left for home, the newly revived system was in the middle of its first backup.

But while we hear about successes, it’s important to remember that a good percentage of data recovery efforts fail. CBL places its success rate at between 75 and 85 per cent. Another Markham, Ont.-based recovery lab, ActionFront Data Recovery Labs, states in its Data Emergency Guide, “any company claiming a 90 to 95 per cent success rate is lying. Data recovery can be a complicated process with inherent physical and logistical limitations that determine what can actually be done.”

That’s why backups are critical to protect valuable data. Dimson noted that customers are becoming more religious about their backups. “Before it was ‘Can I get away without it?’ Now companies have one standard instead of 28 different products in various workgroups. Backup has more recognition in higher areas of companies.”

Added Lee, “For basic data recovery, backups are used to recover data in the case of an error or corrupted data, regardless if the cause is software, hardware or human related . . . Data recovery from day-to-day operations has always been an important issue. It’s not the importance that has changed, but rather the challenges of performing the daily backups in conjunction with high availability for systems, increased data growth and new data retention requirements. One of the keys to successful data recovery is planning at both the application level as well as the enterprise level.”

But all of the planning in the world won’t ensure proper operations. Even trying to find a way to automate the testing of backups can generate problems: if the randomly selected tape is offsite, for example, the test will fail. And operators frequently do not monitor backup logs as they should, and so fail to detect error conditions. “You always end up with the people factor,” said Coulter. “The key to success is a simple tool that can automate things like backups that users don’t want to be experts in. Users want to be experts in their companies’ business.”

At the same time, someone needs the expertise to ensure important data is safely backed up, and resellers are in an ideal position to offer this service. Dimson says in the enterprise space, there are more sophisticated resellers today who can help their customers with backup strategies. In fact, Veritas offers certification which, Dimson says, its resellers view as a value-add for their customers. Still, he is surprised and disturbed to hear stories of vendors who do not configure backup devices into systems.

“Backup is a necessary evil,” he stated, “but if it’s done properly it’s not that onerous. I think the industry as a whole needs to do more in letting people know what they need to back up.”

He acknowledged that setup can be tricky, but sees an opportunity for a knowledgeable VAR and noted, “as you move up the chain in terms of company size, if you have an automated piece for backup, the amount of human intervention goes down.”

Coulter, too, thinks there’s a huge reseller opportunity in the backup space. “The whole point of a reseller is that they know the stuff in real terms,” he said. “That’s the sort of thing we expect of them. They will go in with a backup product, and an SRM (Storage Resource Management) product, and also give them process, setting them up with best practices. It offers value for the customer, and opportunities for them.”

Lee agrees, adding, “Resellers in the backup and DR (disaster recovery) market can be a great asset by providing knowledge and information to their customers. The enterprise today is extremely complex, and only getting more so.”

And when all else fails, Briand believes VARs can be an asset in the data recovery realm as well.

“We have to educate the resellers,” he said, “and they will educate their customers. When a (malfunctioning) drive comes in, the reseller should tell the customer that there’s an opportunity to recover the data.”

The reseller would then act as the middleman between the customer and the data recovery lab. Since CBL (and some other labs) have a “no data, no charge” policy, said Briand, “there’s essenti