A Cloud is Only as Good as Its Backup Strategy

Saw this morning that T-Mobile and Microsoft (Microsoft’s subsidiary “Danger”, really) have lost all data stored by users of the Sidekick phone, due to some type of failure with their cloud storage.  We’re starting to see stories like this pop up more and more, where so-called clouds fail for really simple reasons like not backing up (this instance), not having a redundant data center (the recent Authorize.net issue), or a botched software upgrade (Gmail and Google’s recent issues).

Trust is the new currency that really matters in an increasingly distributed and technically delivered world.  Regardless of whether you’re using a “cloud computing” resource or a more traditionally provided computing resource, you’re relying on that provider to fulfill their commitments and think about preventing disasters.

Implementing good security, good backup strategies, disaster recovery planning…these things are all very difficult.  They are even more difficult when you’re talking about the volumes of data (multiple terabytes, possibly even petabytes) and users (millions) that clouds are designed for.  The good news is that these are all solvable problems.

Here’s what I’d look for in a cloud that I’m using:

  • published disaster recovery policy
  • independent review by an outside auditor (SAS 70 or an equivalent)
  • published backup strategy
  • multi-homed setup (different geographic regions)
  • a publicly accessible status page that details system health and open issues

It looks like in this particular instance a botched SAN upgrade might be to blame.  Not having any backups or a mirrored SAN is really troubling.  Even more troubling is that this was the case when the phone is designed so that the cloud has to be up for it to retrieve and use any of its contacts, pictures, etc.  Cloud computing isn’t the problem here, but it does make for a grabby headline.

Read about it here.

Update 10/12/2009:

Apparently there were major issues with employees and morale, and the product that Danger was providing was already two years late.  Read some more anonymous details here.
