Blog Post

Achieving Agility and Stability with Big Data Projects


A joint discussion with Omer Trajman, CEO and Co-Founder, Rocana, and Dr. Alea Fairchild, Entrepreneur-in-Residence, Blue Hill Research

ALEA:

A recent CIO article stated that enterprises are now being “challenged to get disruptive and be more courageous in their digital vision … and use emerging digital technologies to disrupt and reimagine, rather than optimize current business processes.” But no matter how well designed and well tested an application is, the first – and often lasting – impressions users form about it come from how successfully it is deployed into production. Developers and operations personnel sometimes let unnecessary obstacles distract them from the goal of a successful deployment.

I believe that an IT production environment can be both stable and agile at the same time, given the right implementation! Production acceptance is a methodology for consistently and successfully deploying application systems into a production environment regardless of platform, and where we are heading is toward platform-agnostic ecosystems. So part of the acceptance process is understanding how users will interact with the environment, understanding its purpose within operations, and planning how it fits into the support structure.

Many organizations define “production” differently as it relates to change control requirements, help desk processes, and so on, so working with a big data pilot project does mean changing a number of these parameters. In my experience, one challenge with a big data pilot is scaling the pilot “down” to the operational resources available. For example, if the pilot system used a virtually unlimited pool of storage but in production must work with system-specific attached storage, then the processes will need to be scaled differently, as the sketch below illustrates.
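To make that storage example concrete, here is a minimal back-of-the-envelope sketch in Python of the kind of capacity check a team might run before go-live. All of the numbers, and the required_capacity_tb helper itself, are hypothetical illustrations, not figures from any real project.

    # Hypothetical capacity check when moving a pilot from an elastic
    # storage pool to fixed, system-attached storage. All numbers are
    # illustrative assumptions, not measurements from a real project.

    def required_capacity_tb(daily_ingest_tb: float,
                             retention_days: int,
                             replication_factor: int = 3,
                             headroom: float = 0.25) -> float:
        """Raw data retained, times replication, plus operating headroom."""
        raw = daily_ingest_tb * retention_days
        return raw * replication_factor * (1 + headroom)

    # Pilot assumptions: 0.5 TB/day ingest, 180-day retention, 3x replication.
    needed = required_capacity_tb(daily_ingest_tb=0.5, retention_days=180)

    attached_storage_tb = 200  # what production actually provides

    if needed > attached_storage_tb:
        # The pilot's processes must be rescaled: shorter retention,
        # lower replication, compression, or tiering older data out.
        print(f"Short by {needed - attached_storage_tb:.1f} TB; "
              "rescale retention or replication before go-live.")
    else:
        print(f"Fits with {attached_storage_tb - needed:.1f} TB to spare.")

If the required capacity exceeds what production attaches, the pilot's retention, replication, or compression settings have to change, and surfacing exactly that kind of process change is what production acceptance is for.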

How can we deal with the mismatch between a pilot project, with its unique resources, and a production environment, with its heterogeneous ecosystem and more stratified structure? Omer, is this why agile pilots like big data projects fail?

OMER:
Most new big data projects fail. Even those employing well-developed technologies often fail spectacularly (e.g., the Obamacare website). In my experience, projects typically fail because teams:

  • don’t implement the changes required in operating processes,
  • don’t recognize that support staff lack the needed operating skills,
  • don’t consider operational integration, or do it poorly, and
  • don’t plan for sufficient operational oversight.

In short, the lack of operational planning is what most often leads to project failure. To save your project, you need to understand how to overcome these hurdles.

To my first point about Operating Processes, big data projects most commonly fail because no one is accountable for identifying and implementing the necessary changes to processes. Consider, for example, the differences in data integration, managing system resources, and running analytics between a traditional data warehouse like Teradata, an analytic database like Vertica, and a big data system like Hadoop. From an underlying technology perspective, the data warehouse and the analytic database are most similar, since both follow traditional relational database structures. Yet from a process perspective, moving a forecasting routine from a data warehouse to a big data system is more likely to succeed than moving it to the analytic database. This is because the processes for data integration, managing system resources, and running analytics for centralized data management systems, whether a warehouse or a big data system, are similar. However, the processes for provisioning resources, sizing workloads, and monitoring are drastically different for an analytic database. The isolation of data, the control over resources, and the chargeback mechanisms don't map to existing warehouse infrastructure. The technical distinction between these environments can mean the difference between a successful project and likely failure.
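As a purely hypothetical illustration of that process-similarity point, consider a forecasting extract written in Python's standard connect-query-fetch style. The connect_warehouse and connect_hadoop_sql factories below are placeholder names, not real driver calls; the point is that swapping the backend barely changes the routine, whereas an analytic database would also change the provisioning, sizing, and monitoring around it.

    # Minimal sketch: the analytics *process* (connect, run query, fetch)
    # has the same shape on a warehouse and a SQL-on-Hadoop engine.
    # connect_warehouse / connect_hadoop_sql are hypothetical stand-ins
    # for whatever DB-API 2.0 driver each environment actually uses.

    SQL = """
        SELECT region, SUM(sales) AS total
        FROM daily_sales
        WHERE sale_date >= DATE '2015-01-01'
        GROUP BY region
    """

    def run_forecast_extract(connect):
        """Same routine regardless of which backend `connect` targets."""
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute(SQL)
            return cur.fetchall()
        finally:
            conn.close()

    # Moving from the warehouse to the big data system changes only the
    # connection factory; the operating process around it stays familiar.
    # rows = run_forecast_extract(connect_warehouse)
    # rows = run_forecast_extract(connect_hadoop_sql)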

ALEA:
One of the challenges of implementing big data projects is having the right personnel. The goal of big data is to leverage data to achieve business outcomes, but this sometimes gets overshadowed by a tendency to focus on the new tools and technologies of implementation rather than on why big data is being harnessed. So perhaps one issue is finding employees with an entrepreneurial mindset who can drive agility with big data. Omer, do you see hiring as an issue?

OMER:
Yes, having the right Operational Skills on staff is critical. Whether you hire experienced staff or outsource development, projects that require many new technologies are particularly challenging to run. An analytic database requires a completely different set of skills, while big data infrastructure likely requires few new skills. As long as the systems are similar, IT staff trained in operating centralized data management can readily learn the new processes during pre-production.

For analytic databases, the underlying APIs, the provisioning units, the perceived performance, and the problem remediation tools are completely different from those used to operate a data warehouse. Hiring new staff with the requisite skills is often challenging due to a lack of available talent. Retraining existing staff is more cost-effective and reduces the risk that new hires will not fit into the organization. Start training when a project kicks off, so staff can practice new skills in pre-production.

But success is not just about skills: without Operational Integration, mainstream adoption of big data projects is doomed. Most skunkworks projects are successful because they are developed and deployed apart from the company infrastructure. The team is free to build new processes, to learn to operate new systems, and to bring in outside expertise to solve problems. Also, the projects do not need to be integrated with the existing operations infrastructure. The risk is that even successful skunkworks projects may be orphaned if they are never integrated with the rest of the infrastructure, and they are the first to be cut when budgets are tightened.

An existing IT operation is like a complex machine, built over time, whose tightly coupled gears run only because they were built to operate together. This machine has no clear blueprint and is impossible to recreate in a lab. Imagine inserting a new widget while the machine is running, and it becomes clear why most projects are rejected when it comes time to integrate. The solution is to develop a clutch: new systems need to develop a cadence of their own, and then gradually spin up (or down) to match the cadence of the rest of the IT machinery. The best recipe for long-term success is to integrate new systems gradually (a minimal sketch of this “clutch” pattern follows at the end of this answer).

Finally, success depends on Operational Oversight. Even big data projects that develop new processes, retrain staff, and successfully integrate with existing data management operations are rarely seen as true successes; these projects are considered high-risk, so operational oversight is poor. A new project that no one uses, or whose impact is not recognized, may run successfully but never be accepted and acknowledged as critical to the business.
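To make the clutch metaphor above concrete, here is the promised sketch: a minimal, hypothetical weighted router that sends a slowly rising share of requests to the new system while the existing one keeps serving the rest. The names, shares, and step sizes are illustrative only, not a production design.

    import random

    # Hypothetical "clutch": route a gradually increasing share of work
    # to the new system, so it spins up to match the cadence of the
    # existing machinery instead of being engaged all at once.

    def make_router(serve_old, serve_new, new_share: float = 0.0):
        state = {"new_share": new_share}

        def route(request):
            if random.random() < state["new_share"]:
                return serve_new(request)
            return serve_old(request)

        def engage(step: float = 0.05):
            # Turn the clutch one notch: shift 5% more traffic per step.
            state["new_share"] = min(1.0, state["new_share"] + step)

        return route, engage

    # Usage sketch: start at 0% on the new system, ratchet up over weeks.
    route, engage = make_router(lambda r: f"old:{r}", lambda r: f"new:{r}")
    for week in range(4):
        engage()          # 5%, 10%, 15%, 20% ...
        print(route("nightly-forecast"))

The design point is simply that the share is ratcheted up in small, observable steps, and only after the new system has proven stable at the current share.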

ALEA:
But what about operational maturity? What if the organization is not yet ready for a bimodal IT environment, where agility and maturity run in some form of parallel? The problem with big data pilots, as I have seen them in the financial services sphere, is a mismatch between the money the business has available and is ready to spend on infrastructure, and how quickly the business wants an analytical capability that produces business results. Omer, why do you believe these projects still fail?

OMER:
Most project leaders don't want to set high expectations or give company leaders much visibility into their projects because they believe their projects may fail. As a result, they don't do the performance measurements required to show company leadership the project's positive impact, and even if leaders know a project is in production, they may not know why it is necessary. New projects are thus the first to go when budgets are cut, not because they have not made a positive impact, but because their impact has not been made clear to decision makers. Thus, IT must consider and socialize the oversight and reporting for a new project up front.
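One hypothetical way to socialize that oversight is to build even a trivial usage-and-impact ledger into the project from day one, so there is always something concrete to show decision makers at budget time. The team names, outcomes, and helpers below are invented for illustration.

    from collections import Counter
    from datetime import date

    # Hypothetical, deliberately simple impact ledger: count who uses the
    # system and which business outcome each run supports, so the project's
    # value can be reported up front rather than reconstructed at budget time.

    impact = Counter()

    def record_run(team: str, outcome: str):
        impact[(team, outcome)] += 1

    record_run("marketing", "campaign-targeting")
    record_run("finance", "quarterly-forecast")
    record_run("marketing", "campaign-targeting")

    def monthly_report():
        print(f"Big data platform usage through {date.today():%B %Y}:")
        for (team, outcome), runs in sorted(impact.items()):
            print(f"  {team:<12} {outcome:<22} {runs} runs")

    monthly_report()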

Our Advice for Achieving Agility and Stability with Big Data Projects

Before embarking on new big data projects, or even if you have projects underway, consider their implications. Plan for operations before you start, and make sure you have the processes, skills, integrations, and oversight to get your project into production. Map out the systems the project requires, and ask yourself these questions:

  • How will these systems function in three or five years?
  • What new information does the team need to know in order to run them?
  • Where do these new systems depend on existing infrastructure?
  • How will I know these systems are meeting business needs?

The answers to these questions may provide a better map for getting your project into production and preventing failure.

This blog was written in collaboration with Blue Hill Research and is available on the Blue Hill Research website. To learn more about Blue Hill Research, visit www.bluehillresearch.com.