ZapFlash

Garbage in the Cloud

On two occasions I have been asked,—”Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” … I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

—Charles Babbage, 1864

The long-standing computer science principle of “garbage in, garbage out” (GIGO) is so fundamental to IT that it predates digital computing by almost a century. And yet here we are in the twenty-first century, moving to the Cloud, and Babbage’s exasperated response has never been truer or more on point. For not only is the Cloud a magnet for all sorts of garbage, it is also generating new garbage at a brisk clip.

Uploading Garbage to the Cloud

In today’s frantic rush to “move to the Cloud,” too many organizations are failing to ask what they should move to the Cloud. Instead, they envision the Cloud as some kind of huge, nebulous server in the sky, a perfect receptacle for whatever they have on-premise. Got email? Put it in the Cloud! Got data? Put your data in the Cloud, the bigger the better! Running business processes on-premise? Move them to the Cloud!

Not so fast. Let’s slow down a bit and consider the ramifications of moving too quickly—and haphazardly—to the Cloud.

  • Unclean data – This is the obvious example, pure GIGO. If your current on-premise data are unclean (say you have inconsistent customer demographic information, obsolete product information, or any other data quality problem), moving that information to the Cloud won’t do your data, or your business, any good. Instead, think of moving your data to the Cloud as though you were moving your elderly parents to a condo: it’s a wonderful excuse to finally dig through the layers of detritus so that you only move data that are clean, accurate, and valuable to the business.
  • Spaghetti code – You may be eyeing that old custom-coded legacy app as a prime Cloud candidate. It’s too slow, it doesn’t scale well, and it’s a bear to integrate now, so won’t the Cloud automatically make it fast, scalable, and easy to integrate? Sorry to burst your bubble. If you’re focusing on an IaaS approach, you’ll find that spaghetti code is every bit as intractable in the Cloud as it is on-premise. What about PaaS? Chances are that old code won’t run at all. Today’s PaaS environments expect and enforce a certain level of code quality.
  • Obsolete and Cloud-unfriendly business processes – Does this sound familiar? The business asks IT to automate a set of processes, but states unequivocally that “the processes are fine the way they are. Automate them but don’t change them. After all, we’ve been doing things the same way for years. Why change now?” Yes, the business often says that, but seasoned IT veterans have long realized that the business never actually means it. When the business asks IT to touch a process, there is always at least an implied requirement to try to make it better: faster, more streamlined, better aligned with the underlying business need.
    Moving business process implementations to the Cloud raises the stakes in this complex dance between business and technology, because the Cloud offers a wealth of new opportunities for improving processes. Furthermore, how users interact with Cloud-based assets is often fundamentally different from how users interact with traditional enterprise apps. Any organization that has moved from an older CRM app (or no CRM app at all) to Salesforce.com has learned this lesson firsthand. But Salesforce is merely a harbinger of greater change to come. One of the main reasons Salesforce has been so successful is that it offers its clients new ways of conducting business: in other words, better processes. Any SaaS solution should build on that example.
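
To make the “clean before you move” advice about unclean data concrete, here is a minimal Python sketch of a pre-migration audit. It assumes customer records are plain dictionaries. The field names (customer_id, email, last_updated), the one-year staleness threshold, and the function names are all hypothetical, chosen only for illustration. A real data quality effort would involve far more than this.

```python
from datetime import date

# Hypothetical required fields; adjust to whatever your own records contain.
REQUIRED_FIELDS = ("customer_id", "email", "last_updated")


def audit_record(record, today, max_age_days=365):
    """Return a list of data-quality problems found in one record."""
    problems = []
    # Flag any required field that is absent or empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    # A deliberately crude format check: a present email must contain "@".
    email = record.get("email") or ""
    if email and "@" not in email:
        problems.append("malformed email")
    # Treat records not updated within max_age_days as stale (an assumed rule).
    updated = record.get("last_updated")
    if updated and (today - updated).days > max_age_days:
        problems.append("stale record")
    return problems


def split_for_migration(records, today):
    """Separate records that pass the audit from those needing cleanup."""
    clean, dirty = [], []
    for record in records:
        problems = audit_record(record, today)
        if problems:
            dirty.append((record, problems))
        else:
            clean.append(record)
    return clean, dirty
```

Calling split_for_migration(records, date.today()) returns the records that pass alongside the ones that need attention, each paired with its list of problems, so only the first group moves and the second goes back to the business for cleanup.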

Generating New Garbage in the Cloud

The garbage problem doesn’t end with garbage you might put in the Cloud. The Cloud also presents numerous opportunities to generate new kinds of garbage.

  • Zombie instances – It’s so easy and cheap now for anyone in your organization to spawn their own Cloud instances, including virtual machines, storage instances, and more. Furthermore, such instances are elastic: need more of them? The Cloud is only too happy to oblige. But what happens when you’re done with them? You’re supposed to delete them. After all, elasticity works in both directions. All too often, however, instances that have served their purpose are left around like so much space junk. After a while, nobody remembers what they’re for or whether they still have something important in them. The last thing you want to do is delete an instance with valuable data or code on it. So to play it safe, you leave it around. Forever. Your Cloud provider is only too happy to keep billing you for these zombie instances.
  • Data with no provenance – Any Antiques Roadshow aficionado knows that antiques with provenance are more valuable than those without. The same goes for your data. Do you know if the data you’re working with are the latest version? Do you know they haven’t been tampered with? If not, then those data are worse than useless, since they may be incorrect, or even worse, keeping them around may violate any number of regulations. Here again the elasticity of the Cloud works against you, since copies multiply easily and each one is harder to trace back to its source.
  • Manual or poorly abstracted configurations – Let’s say you’ve built a sophisticated Cloud app based on elastic VM instances. If you need more, simply provision more. But then let’s say some admin somewhere in your IT shop goes into one of these instances and changes a config file in order to get an app to run on that instance. Now you have no way to update your instances without breaking your app—and if that admin didn’t tell anybody about the reconfiguration, then tracking down the problem will present a time-consuming challenge.
    Simply creating a static image file to generate new VM instances, and keeping rogue admins from monkeying with them, won’t solve the problem, because there is more to your app than the instances. Instead, you need a next-generation configuration management approach that automates configuration for the Cloud. See Chef or Puppet for an indication of where this market is going. (You can expect a ZapFlash on Cloud configuration management in the near future.)
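
One way to spot the rogue-admin problem early is to fingerprint each instance’s configuration and compare it against a known-good baseline. The Python sketch below shows only that detection step, under the assumption that you can already read each instance’s config file as text. The instance names and function names are hypothetical. This is not how Chef or Puppet work, and it does nothing to repair drift once found.

```python
import hashlib


def fingerprint(config_text):
    """Return a SHA-256 hex digest of a configuration file's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def find_drifted(baseline_text, instance_configs):
    """Return the sorted names of instances whose config differs from the baseline.

    instance_configs maps a (hypothetical) instance name to its config text.
    """
    expected = fingerprint(baseline_text)
    return sorted(
        name
        for name, text in instance_configs.items()
        if fingerprint(text) != expected
    )
```

Any name that comes back from find_drifted points to an instance someone has edited by hand, which is a cue to ask questions before the next update breaks the app.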

The ZapThink Take

How do you avoid garbage in the Cloud? Architecture is a large part of the answer, of course, but governance is equally important. Organizations should establish and enforce Cloud-centric policies as well as extending current IT governance to the Cloud. With great power comes great responsibility, and the Cloud offers enormous new power to many different roles within the IT organization. The Cloud is fraught with pitfalls. Without sufficient governance, you’re bound to fall into one.
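
As one illustration of a Cloud-centric policy, consider a rule that every instance must have a named owner and an expiry date, which directly targets the zombie instances described earlier. The Python sketch below checks such a rule against an inventory list. The Instance type, its fields, and the policy itself are assumptions made for this example, and in practice the inventory would come from whatever your Cloud provider or management console reports.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Instance:
    """A hypothetical inventory entry for one Cloud instance."""
    name: str
    owner: Optional[str]
    expires: Optional[date]


def policy_violations(instance, today):
    """Return the ways one instance breaks the owner-and-expiry policy."""
    violations = []
    if not instance.owner:
        violations.append("no owner")
    if instance.expires is None:
        violations.append("no expiry date")
    elif instance.expires < today:
        violations.append("expired")
    return violations


def zombie_candidates(instances, today):
    """Map each non-compliant instance name to its list of violations."""
    report = {}
    for instance in instances:
        violations = policy_violations(instance, today)
        if violations:
            report[instance.name] = violations
    return report
```

The report is only a list of candidates for review. Deciding whether to delete anything still takes a human who can confirm that no valuable data or code lives on the instance.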

It is also important to note that the issues in this ZapFlash apply equally to private and public Clouds. Organizations generally realize that public Clouds present numerous governance challenges, and look to private Clouds because they are ostensibly less risky. But such a stance offers little more than a false sense of security, one that may backfire if organizations assume that in the absence of proper architecture and governance, a private Cloud is the better choice. Don’t wait to implement adequate Cloud governance until after you’ve run into these problems. Governance should be an integral part of any Cloud strategy, before you move to the Cloud.

 

 

Discussion

3 comments for “Garbage in the Cloud”

  1. Dear Jason,

    Loved your article and I think you have hit the bull’s-eye. The cloud bandwagon has grown so big that every company wants to have cloud as part of their IT strategy, regardless of whether they need it and, if they do, what exactly they need.

    Very few companies understand this maze. I have seen many IT professionals who say they work on cloud platforms but have yet to answer the simple question “what is cloud”.

    I would like to add one more thing to the Governance model that you talked about. A Cloud Service Provider cannot decide on a single Governance model if it’s supporting multiple tenants. It simply doesn’t work.

    Posted by Partha Dutta | December 14, 2011, 11:40 pm
    • Partha,

      Great comment, and I agree wholeheartedly that it is almost impossible for a cloud provider to create a single governance model for all clients.

      Which is why we find many enterprises wanting to keep their cloud governance tools independent of the providers they use, or could use in the future. This is one of the key reasons I predicted four years ago that the one piece of software that the enterprise will own on-premises, regardless of their public-private-traditional IT balance, will be their cloud management console.

      James Urquhart
      enStratus

      Posted by James Urquhart | December 18, 2011, 3:26 pm
