by Bob Aiello
DevOps is a set of principles and practices that helps development and operations teams work more effectively together. DevOps focuses on improving communication and collaboration with the technical goal of providing reliable and flexible release and deployment automation. Much of what is written about DevOps is delivered from the perspective of developers. We see many articles about continuous delivery and deployment describing the virtues of good tooling, microservices and very popular practices around virtualization and containers. In my opinion, DevOps also needs to take an operations view in order to achieve the right balance. This article is all about putting Operations back into DevOps.
Operations professionals are responsible for ensuring that IT services are available without interruption or even degradation in service. IT operations is a tough job and I have worked with many technology professionals who were truly gifted in IT operations with all of its functions and competencies. Many IT operations staff perform day-to-day operations tasks that can be very repetitive, although essential in keeping critical systems online and operational. In some organizations, operations engineers are not as highly skilled as their development counterparts. Historically, mainframe operators were focused on punch cards and mounting tapes while programmers were focused on implementing complex business logic. Today we do come across operations engineers who lack strong software engineering skills and training and this can be a very serious problem. When developers observe that operations technicians are not highly skilled, they often stop providing technical information because they conclude that the operations folks cannot understand the technical details. This dynamic can have disastrous consequences for the company, the most common being that developers feel they should bypass operations as often as possible. I have also worked with top-notch Unix/Linux gurus in operations who focused on keeping complex systems up and running on a continuous basis. IT operations professionals often embrace the itSMF ITIL v3 framework to ensure that they are implementing industry best practices that ensure reliable IT services. If you are not already aware of ITIL v3 you probably should be.
The ITIL v3 framework describes a robust set of industry best practices designed to ensure continuous operation of IT services. The ISACA COBIT and the SEI CMMI are also frameworks that are used by many organizations to improve their IT processes, but ITIL is by far the most popular set of guidelines for IT operations. CM professionals should particularly focus on the guidance in the transition section of the ITIL framework, which describes change management, build and release, and configuration management systems (including the CMDB). With all of this guidance, do not forget to start at the beginning with an understanding of the application and systems architecture.
The first thing that I always require is a clear description of the application and systems architecture. This information is essential for understanding the system as a whole or, in DevOps terminology, having a full end-to-end systems view. For build and release engineers, understanding the architecture is fundamental because all of our build, release and deployment scripts must be created with an understanding of the architecture involved. In fact, development needs to build applications that are designed for IT Operations.
Many developers focus on Test Driven Development (TDD) where code is designed and written to be testable, often beginning with writing the unit test classes even before the application code itself is written. I have run several large-scale automated testing projects in my career and I have always tried to work with the developers to design the systems to be more easily testable. In some cases this actually included hooks to ensure that the test tools could work without flagging too many cosmetic, superficial issues (which we usually call false positives). Test Driven Development is very effective and it is my view that applications also need to be designed and written with operations in mind. One reason to design applications with IT Operations in mind is to enable the implementation of IT process automation.
Effective IT operations teams rely upon tools including the automated collection of events, alerts and incident management. When an alert is raised or an incident is reported to the IT Service Desk, the IT Operations team must be able to rely upon IT process automation to facilitate detection and resolution of the incident, preferably before there is customer impact. IT process automation must include automated workflows to enable each member of the team to respond in a clear and consistent way. In practice, it is very common for organizations to have one or two highly skilled subject matter experts who are able to troubleshoot almost any production issue. The problem is that these folks don't work twenty-four hours a day, seven days a week and, in fact, are usually on vacation when problems occur. IT process automation, including workflow automation, gives the operations team well documented and repeatable processes that help keep IT services running reliably and consistently. Getting these procedures right must always start with the application build.
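As a rough illustration of the workflow automation described above, the following Python sketch maps each kind of alert to a documented, ordered list of response steps, so that any team member (not just the resident expert) follows the same procedure. Everything here is hypothetical: the `Alert` fields, the runbook names and the step functions are assumptions made for the example, and a real service desk or monitoring tool would supply its own equivalents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    """A monitoring alert; the field names are hypothetical."""
    service: str
    kind: str  # for example "disk_full"


@dataclass
class Runbook:
    """An ordered, documented list of response steps for one kind of alert."""
    name: str
    steps: List[Callable[[Alert], str]] = field(default_factory=list)

    def run(self, alert: Alert) -> List[str]:
        # Execute every step in order and keep a log for the incident record.
        return [step(alert) for step in self.steps]


# Placeholder steps: each one only reports what a real step would do.
def check_disk_usage(alert: Alert) -> str:
    return f"checked disk usage on {alert.service}"


def rotate_logs(alert: Alert) -> str:
    return f"rotated logs on {alert.service}"


def escalate(alert: Alert) -> str:
    return f"escalated {alert.kind} on {alert.service} to on-call engineer"


# Known alert kinds get a specific runbook; anything else is escalated.
RUNBOOKS: Dict[str, Runbook] = {
    "disk_full": Runbook("Free disk space", [check_disk_usage, rotate_logs]),
}
DEFAULT_RUNBOOK = Runbook("Escalate", [escalate])


def handle_alert(alert: Alert) -> List[str]:
    """Pick the runbook for this kind of alert, falling back to escalation."""
    runbook = RUNBOOKS.get(alert.kind, DEFAULT_RUNBOOK)
    return runbook.run(alert)
```

Calling `handle_alert(Alert("billing", "disk_full"))` runs the two disk-space steps in order and returns their log lines, while an unrecognized alert kind falls through to escalation. The point of the design is that the knowledge lives in the runbook table rather than in one person's head.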
Effective build automation often includes key procedures such as embedding immutable version IDs into configuration items to facilitate the physical configuration audit. For example, a C#/.NET application should have a version identifier embedded into the assembly. You can embed version IDs via an MSBuild script or using the Visual Studio IDE. The Microsoft .NET MSIL Disassembler (Ildasm.exe) can be used to look inside a .NET assembly and display the version ID. There are similar techniques in Java/C/C++ along with almost every other software development technology. These techniques are essential for IT operations to be able to confirm that the correct binary configuration items are in place and that there have not been any unauthorized changes. Builds are important, but continuously deploying code starting from very early in the development lifecycle is also a critical DevOps function that helps IT operations to be more effective.
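The audit goal described above, confirming that the correct binaries are in place and unchanged, can also be approached with a checksum manifest. Note that this is a complementary technique rather than the embedded version ID approach itself: the minimal Python sketch below assumes the build records a SHA-256 digest for every file in the release, and that operations later compares the deployed files against that record. The function names and the manifest layout are illustrative assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(release_dir: Path) -> Dict[str, str]:
    """Record a digest for every file in the release, keyed by relative path."""
    return {
        p.relative_to(release_dir).as_posix(): sha256_of(p)
        for p in sorted(release_dir.rglob("*"))
        if p.is_file()
    }


def audit(deployed_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Compare deployed files against the manifest and list discrepancies."""
    findings: List[str] = []
    deployed = build_manifest(deployed_dir)
    # Files the build produced that are absent or altered on the target.
    for name, expected in manifest.items():
        if name not in deployed:
            findings.append(f"missing: {name}")
        elif deployed[name] != expected:
            findings.append(f"modified: {name}")
    # Files on the target that the build never produced.
    for name in deployed:
        if name not in manifest:
            findings.append(f"unexpected: {name}")
    return findings
```

An empty list from `audit` means the deployed directory matches the build exactly; any `missing`, `modified` or `unexpected` entry is a candidate unauthorized change to investigate. The manifest is only trustworthy if it is stored somewhere the deployment target cannot overwrite.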
Application deployment automation is a key competency in any effective DevOps environment. Continuous Delivery enables the IT operations team to rehearse and streamline the entire deployment process. If this is done right, the operations team can support many deployments while still maintaining a high level of service and support. The best practice is to move the application build, package and deployment process upstream and begin the effort by supporting development test environments. These automated procedures are not trivial and it will take some time to get them right. The sooner in the lifecycle you begin this effort, the sooner your procedures will be mature and reliable. Since organizations have to pay someone to deploy code to the development and testing environments, it is a great idea to have the person who will deploy to production do this work and get the experience and training to understand and help evolve the deployment automation. The practice of involving operations from the beginning of the lifecycle has become known as left-shift (more often written as shift-left). IT operations depends upon a reliable automated deployment framework for success and getting Ops involved from the beginning helps you get that work done.
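One way to picture rehearsing the deployment upstream is a single deployment procedure that runs unchanged in every environment, with only the environment description varying. The Python sketch below is a dry-run planner under stated assumptions: the `Environment` fields, the host names, the three steps per host and the approval gate for production are all hypothetical choices for illustration, not a prescribed process.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Environment:
    """Describes one target environment; all fields are hypothetical."""
    name: str
    hosts: Tuple[str, ...]
    requires_approval: bool = False


def deploy(package: str, env: Environment, approved: bool = False) -> List[str]:
    """Return the ordered actions a deployment would take (a dry-run plan).

    The same steps are planned for every environment, so each deployment
    to development or test is a rehearsal of the production procedure.
    """
    if env.requires_approval and not approved:
        raise PermissionError(f"deployment to {env.name} needs an approved change")
    plan: List[str] = []
    for host in env.hosts:
        plan.append(f"copy {package} to {host}")
        plan.append(f"restart service on {host}")
        plan.append(f"smoke test {host}")
    return plan


# Example environments; the names and hosts are made up.
DEV = Environment("dev", ("dev01",))
PROD = Environment("prod", ("prod01", "prod02"), requires_approval=True)
```

Because `deploy` takes the environment as data, `deploy("app-1.4.2.zip", DEV)` and `deploy("app-1.4.2.zip", PROD, approved=True)` exercise exactly the same code path, which is what lets the procedure mature long before the first production release.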
IT operations is a key stakeholder in any DevOps transformation. It is all too common for development to miss the importance of partnering with operations to develop effective procedures that ensure uninterrupted IT services. If you want to excel, you need to keep Operations in your DevOps transformation!