When was the last time you heard anyone say “IT Applications & Operations”? Frankly, in my 30+ year career in IT, I don’t believe I’ve ever heard anyone use this term. The typical term we hear is IT Infrastructure & Operations. These two go together like peanut butter and jelly, which tells us a lot about how we view the field of IT.
For those who may not be familiar with the role of IT Operations, Joe Hertvik does a great job here of describing IT Operations Management as someone engaged in providing this service to the business. It’s interesting that he specifically depicts the responsibilities of IT Applications and IT Operations as a Venn diagram in which there is no overlap.
However, as we progress from a pre-Cloud Operating Model world to a post-Cloud Operating Model world, this coupling is changing. As we migrate workloads to the cloud, the first shift will be from operating infrastructure to operating Infrastructure-as-a-Service. This shift will leverage the capacity management and monitoring skills already used with virtualized environments.
The second shift will come as migrated workloads are refactored into cloud native applications or new cloud native applications are developed specifically for the cloud. Here the emphasis of operations will be on the management and monitoring of the application platform. This may be done using a Platform-as-a-Service offering, such as Azure PaaS, AWS, Cloud Foundry or OpenShift, a Software-as-a-Service platform, such as Salesforce.com or ServiceNow, or even something more traditional based on conventional application servers.
During this second shift, operations will need to focus less on availability and more on consumption patterns. Because cloud native applications are designed for greater availability and resilience (see the Pets & Cattle presentation by Randy Bias), emphasis will need to shift toward how services are being consumed in order to determine efficiency and manage costs. Greater integration between the operations management platforms and the cloud services will be a critical requirement for this shift to occur.
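To make the idea of watching consumption rather than availability a little more concrete, here is a minimal sketch in Python. It assumes usage data has already been exported from a cloud provider or platform; the `UsageRecord` type, its field names, and the sample figures are all hypothetical and do not correspond to any particular vendor's billing format.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """One hypothetical billing/usage row for a service (assumed shape)."""
    service: str
    requests: int
    cost_usd: float


def cost_per_thousand_requests(records):
    """Aggregate usage per service and return cost per 1,000 requests.

    Services that were billed but served no requests map to None,
    which is itself a useful signal of idle spend.
    """
    totals = {}
    for record in records:
        requests, cost = totals.get(record.service, (0, 0.0))
        totals[record.service] = (requests + record.requests,
                                  cost + record.cost_usd)
    return {
        service: (cost / requests * 1000 if requests else None)
        for service, (requests, cost) in totals.items()
    }


# Made-up sample data for illustration only.
SAMPLE_USAGE = [
    UsageRecord("checkout", 40_000, 20.0),
    UsageRecord("checkout", 60_000, 30.0),
    UsageRecord("reporting", 500, 25.0),
    UsageRecord("legacy-export", 0, 12.0),
]

unit_costs = cost_per_thousand_requests(SAMPLE_USAGE)
for service, unit_cost in unit_costs.items():
    print(service, unit_cost)
```

In this made-up data, the reporting service costs far more per request than checkout, and the legacy export is billed without being used at all. Those are the kinds of efficiency questions an operations team focused on consumption would want to surface.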
The final shift will happen as businesses move toward Serverless Computing or Function-as-a-Service. In this model, business logic will execute in response to an event occurring in another service. Due to the ephemeral nature of this model, operations management and monitoring will change drastically. The application will only be available to monitor for brief periods, requiring new operational techniques in support of the Serverless Computing model. Failures that occur here may only be recognizable by performing analytics on post-execution logs.
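As an illustration of what analytics on post-execution logs might look like, the following Python sketch computes a failure rate per function from structured log lines and flags the functions that exceed a threshold. The log format, the field names (`function`, `status`, `duration_ms`), the function names, and the 25% threshold are all assumptions made for the example, not the output of any real serverless platform.

```python
import json
from collections import defaultdict

# Hypothetical post-execution log lines; the JSON shape is an assumption.
SAMPLE_LOGS = [
    '{"function": "resize-image", "status": "ok", "duration_ms": 120}',
    '{"function": "resize-image", "status": "error", "duration_ms": 3000}',
    '{"function": "resize-image", "status": "ok", "duration_ms": 110}',
    '{"function": "send-receipt", "status": "ok", "duration_ms": 45}',
    '{"function": "send-receipt", "status": "ok", "duration_ms": 50}',
]


def failure_rates(log_lines):
    """Return {function_name: fraction of invocations that failed}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for line in log_lines:
        record = json.loads(line)
        name = record["function"]
        totals[name] += 1
        if record["status"] != "ok":
            failures[name] += 1
    return {name: failures[name] / totals[name] for name in totals}


def flag_unhealthy(rates, threshold=0.25):
    """Return the sorted names of functions whose failure rate exceeds threshold."""
    return sorted(name for name, rate in rates.items() if rate > threshold)


rates = failure_rates(SAMPLE_LOGS)
unhealthy = flag_unhealthy(rates)
print(unhealthy)
```

Because the function instances are gone by the time anyone looks, the health signal here comes entirely from what they left behind in the logs. A real pipeline would read from a log store rather than an in-memory list.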
The Impact of the Cloud Operating Model Shift
Having presented a view of the world post-Cloud Operating Model, you may ask: what is the impact of this roadmap on today’s traditional IT environment?
As operations is designed today, it focuses on running both the physical and logical environments, which means a significant number of resources are devoted to running the devices that provide network, compute and storage, in addition to the specialized software for monitoring and managing the various components. It also means that the overall budget has had to be divided between managing the physical environment and managing the applications, with the physical environment usually taking the lion’s share. This is just the nature of the beast, as production applications tend to fail less often and require less attention than their physical counterparts. Moreover, the physical environment is constrained by procurement cycles and capital expenditure approvals that are not characteristic of applications.
As shown in the diagram above, as businesses move away from self-managed infrastructure (not just to cloud, but to all infrastructure provided as-a-Service), the focus of IT operations will shift toward the workloads and the applications. Simultaneously, operational focus shifts away from running a physical environment, as this task falls to the infrastructure providers. Hence, what we should expect to see going forward is operations being “unbound” from infrastructure and bonded with applications. As more and more businesses realize the economic benefits of relinquishing ownership of their infrastructure to a provider, IT should reorganize around operations of the applications and workloads.
This unbinding and re-binding process for operations is a key element for successfully implementing a cloud operating model. The result of this activity is that the remaining self-managed infrastructure organization needs to become more encapsulated. It will need fewer resources and should consider combining operational support with infrastructure engineering. This should result in lower operational overhead and a significant reduction in IT costs.
This change will also raise every red flag you can imagine for those who have been in infrastructure and operations for the better part of their careers. The fiefdom holders, the server huggers, the CCIEs, vExperts, the tinkerers, the storage priests, etc. will demonstrate strong resistance to this change. They will introduce fear, uncertainty and doubt whenever possible. They will regale us with stories of cloud failures and breaches. All of this is an attempt to maintain the status quo for as long as possible in the face of this disruptive force. It is here that executives must weigh the pros and cons of this change and be the driving force behind moving to this new IT organizational construct.
Additionally, what we’ve seen with many early adopters of cloud is that the applications move without their operations counterpart. Eventually, someone asks the question, “where’s my dashboard?” or “how is this integrated into our ITIL processes?” Thus, as we unbind operations from infrastructure and bind it with applications, what is being monitored and how it’s managed changes, but we carry across, and hopefully correct, high-latency, low-value processes to support audit, transparency, and corrective action. Some existing operations skills will transfer to this new focus. However, there will be a change in the tooling used for these tasks and, likewise, different skills will be required to operate cloud-based and hybrid applications.
The biggest issue for businesses will be how to adopt and implement operational management as their workloads shift away from infrastructure and toward applications. There is some early guidance from the application performance management vendors, such as Dynatrace and AppDynamics, and examples from Webscale startups, but as a whole, the playbook for this segment of the industry is still unwritten.
We know that the key performance indicators that are important today will be different in this post-Cloud Operating Model world. Also, it is very likely that the tools needed to monitor and manage applications in this world do not yet exist or are only now starting to appear in the market. Hence, skilled individuals who understand how to configure these tools most likely don’t yet exist either. Thus, the likely outcome is that businesses will attempt to manage the post-Cloud Operating Model world with the same knowledge and tools they use to manage the pre-Cloud Operating Model world, which will, unfortunately, fail. It is from this failure that I believe the leaders in operating a post-Cloud Operating Model world will emerge.
Special thanks to @cpswan and @glenprobinson for helping me shape my ideas for this blog post.