Continuous Integration – it’s not just for Developers!
By Bob Aiello
Recently, I was having a conversation with a group of colleagues who are some of the smartest people with whom I have had the privilege of working in the discipline of DevOps. This team of amazing people is collaborating to write the IEEE P2675 DevOps Standard for Building Reliable and Secure Systems Including Application Build, Package and Deployment. We have over one hundred IT experts signed up to help with this project and about thirty who are actively engaged each week in drafting the standard, which will then be ready for review in the next month or two. We are not defining DevOps per se, as lots of folks who like to attend conferences have done a fine job of that already. What we are doing is defining how to do DevOps (and to some extent Agile) in organizations that must use industry standards and frameworks to adhere to the regulatory requirements commonly found in industries such as banking, finance, medical, pharmaceutical and defense. I usually describe this effort as explaining how to do DevOps and still pass an audit. One of the topics that keeps coming up in our discussions is whether our work is too focused on development and not enough on operations. Recently, one of my colleagues suggested that continuous integration was an example of a key DevOps practice that was entirely focused on development. I disagree completely with this point of view, and this article is the first of a series explaining how continuous integration, delivery and deployment are fundamental to the work done by any operations team.
DevOps is a set of principles and practices intended to help development and operations collaborate and communicate more effectively. We have written previously (in articles and in our Agile ALM DevOps book) about DevOps principles and practices. The most fundamental concern for the DevOps practitioner is to understand that different stakeholders hold diverse views and often bring varying types of expertise to the table. DevOps is all about ensuring that we all share our knowledge and help each team perform more effectively. In organizations that prioritize DevOps excellence, teams are constantly sharing their screens, their knowledge and their expertise. Ops is a key player in this effort and is often “the adult in the room” who understands how the application is actually going to perform in a production environment with a production workload.
In continuous integration (CI), developers are constantly integrating their code (often by merging source code and components), ensuring that the code they write works together and that a bad commit does not break the build. CI is a seminal process for any DevOps function, and very few folks would think that you are doing DevOps if you did not implement continuous integration. To demonstrate that Ops does indeed do continuous integration, I could use the low-hanging fruit of describing how infrastructure as code (IaC) is written using Terraform or AWS CloudFormation and, no doubt, such efforts are a valid example of systems engineers who need to integrate their code continuously. But there are more dramatic examples of Ops using continuous integration, and they present greater challenges. For example, operations engineers often have to manage patching servers at the operating system, middleware and application levels. Hardware changes, including changes to storage and networking infrastructure, may also have to be integrated into the systems, and they can have far-reaching impact. Configuration changes, from firewalls to communication protocols, also have to be managed – and continuously integrated. These changes come from engineers in different groups, often working in silos, and are every bit as complicated as merging code between developers. The work also has to be done using a full systems lifecycle, which is sadly often overlooked.
Let’s take a look at how we might implement a systems lifecycle in support of continuous integration for Ops. We have been in many conversations where we are told that patches must be applied to a server to ensure compliance with security and regulatory requirements. We don’t doubt the importance of such efforts, but we often see a lack of discipline and due diligence in how patches are deployed to production servers. The first step should always be to obtain a description of the patches (e.g., release notes) that are going to be applied and have it reviewed by the folks who are able to understand the potential downstream impact of the required patches. Sometimes, patches require changes to the code, which may necessitate a full development lifecycle. These changes always require due diligence, including both functional and non-functional testing. Knowing what to test should start with understanding what the patch is actually changing. Too often, the folks who downloaded the code for the patch neglect to also download and circulate the release notes describing what the patch is actually changing. (Trust me, it’s right there on the website, right next to where you downloaded the patch.) Once you have ensured that the release notes for the patch have been reviewed by the right stakeholders (often including both developers and operations systems engineers), you are in a much better position to work with your QA and testing experts on an appropriate strategy to test those patches (don’t forget load testing). Obviously, you want to promote a patch through a systems lifecycle just like you would promote any piece of code, so start with patching the Dev servers, work your way up to UAT, and then schedule the change window for updating the production servers.
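The promotion path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed implementation: the `Patch` record, the environment names `dev`, `uat` and `prod`, and the helper functions are all hypothetical names invented for this sketch. It shows three gates from the discussion: the release notes must be reviewed before anything is applied, environments cannot be skipped, and production also needs a scheduled change window.

```python
from dataclasses import dataclass, field

# Hypothetical promotion order: Dev first, then UAT, then production.
ENVIRONMENTS = ["dev", "uat", "prod"]


@dataclass
class Patch:
    """Hypothetical record of one patch moving through the lifecycle."""
    name: str
    layer: str  # e.g. "os", "middleware" or "application"
    release_notes_reviewed: bool = False
    change_window_scheduled: bool = False
    # Environments where the patch was applied and its tests passed.
    tested_in: list = field(default_factory=list)


def next_environment(patch):
    """Return the next environment for the patch, or None if it is done."""
    for env in ENVIRONMENTS:
        if env not in patch.tested_in:
            return env
    return None


def can_promote(patch):
    """Return (allowed, reason) for applying the patch to its next environment."""
    target = next_environment(patch)
    if target is None:
        return False, "already applied everywhere"
    if not patch.release_notes_reviewed:
        return False, "release notes not reviewed by stakeholders"
    if target == "prod" and not patch.change_window_scheduled:
        return False, "no change window scheduled for production"
    return True, f"ok to apply to {target}"


def record_test_pass(patch, env):
    """Record that functional and non-functional tests passed in env."""
    expected = next_environment(patch)
    if env != expected:
        raise ValueError(f"expected {expected}, got {env}: environments cannot be skipped")
    patch.tested_in.append(env)


if __name__ == "__main__":
    patch = Patch(name="example-os-patch", layer="os")
    print(can_promote(patch))   # blocked until the release notes are reviewed
    patch.release_notes_reviewed = True
    record_test_pass(patch, "dev")
    record_test_pass(patch, "uat")
    print(can_promote(patch))   # blocked until a change window is scheduled
    patch.change_window_scheduled = True
    print(can_promote(patch))
```

In practice these checks would more likely live in a change-management or pipeline tool than in a script, but the point is the same: each gate is explicit, and a patch earns its way to production one environment at a time.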
Keep in mind that these patches often arrive at different levels of the system, including the operating system (perhaps at a low level), middleware and application frameworks. You probably need to consider the related configuration changes that are required, and you may need to coordinate these changes with upgrades to your storage or networking systems. You’ll want to take an iterative approach and continuously integrate these changes – just as if you were integrating application changes. Remember, in DevOps we take a wide systems view, and we also ensure that all of the smart people around us are well informed and have an opportunity to share their knowledge and expertise. What other examples of Ops practicing continuous integration can you think of? Drop me a line and share your views!