Call for Participation IEEE P828 – Configuration Management


Call for participation in the IEEE P828 Standard for Configuration Management in Systems and Software Engineering

This standard establishes the minimum requirements for Configuration Management (CM) in Systems and Software Engineering, without restriction to any form, class, or type.

This standard describes CM processes to be established, how they are to be accomplished, who is responsible for doing specific activities, when they are to happen, and what specific resources are required. It addresses CM activities over a product’s life cycle. This standard is consistent with IEEE’s Software Engineering Body of Knowledge (SWEBOK), ISO/IEC/IEEE 12207 and ISO/IEC/IEEE 15288.

Need for the Project: We currently have only one active standard for CM, IEEE 828-2012, which establishes the minimum requirements for Configuration Management (CM) in Systems and Software Engineering.

A revision of 828 needs to be consistent with the CM process as described in ISO/IEC/IEEE 15288 and 12207. The revised 828 should be consistent with the ISO/IEC 19770 IT Asset Management series and should answer the question of which CM data to track for which types of object. It should recognize the differences between hardware and software CM, could also discuss CM of services, and should work with open source as well as current tools and release practices. The revised standard must be usable with any life cycle model, including Agile, Lean, and iterative waterfall, and must be aligned with IEEE P2675 – IEEE Standard for DevOps: Building Reliable and Secure Systems Including Application Build, Package and Deployment, as well as related SC7 standards.

The Standards Creation Lifecycle
by Bob Aiello.

I have written about my personal experiences being involved with the amazingly collaborative process of writing an industry standard. This article will specify the steps that are commonly followed in creating a draft of an industry standard. We will be following this sort of lifecycle as we work on creating an industry standard for DevOps.

The IEEE lists the following stages in the standards lifecycle:

  1. Initiating a Project
  2. Mobilizing a Working Group
  3. Drafting a Standard
  4. Balloting a Standard
  5. Approving a Standard
  6. Maintaining a Standard

For this article, our focus will be on step 3, which is the process to draft a standard. Creating a draft usually involves the following steps:

1. Decide on the initial scope and focus.
2. Create an initial outline, which will likely change several times over the drafting process.
3. Identify similar standards and frameworks, which will be reviewed by the team.
4. Assign areas for each member of the working group to research and then share with the working group. (The working group is self-teaching and knowledge-based; additional SMEs outside of the working group may be asked to present material to educate the team.)
5. The initial outline is revisited several times and updated.
6. Sections of the draft are assigned to volunteers, sometimes in pairs, who go off and draft the initial language.
7. An editor consolidates input and creates the initial draft, which is reviewed in a series of sessions. This step is often quite time consuming.
8. The initial draft is sent to a few subject matter experts outside of the working group for review and comment.
9. Feedback from the reviewers is incorporated into the draft standard and sent out to a wider audience. This process is iterative and the team may find themselves reworking major sections of the standard, and in rare cases, even the scope of the standard itself.
10. Finally, the initial draft is complete and sent out for ballot; comments may still come in from the wider balloting community, requiring additional updates.

Above all, the standards creation process is collaborative. There are often many different views and perspectives, and the process is also very transparent. Differing views may be expressed, and they should be tracked in a document, with a documented rationale for rejecting a suggested change or edit to the draft standard. Every effort should be made to harmonize with related industry standards and frameworks. In practice, decisions are made to keep the standard moving along.

The working group chair ensures that the dialog is constructive. If there are personality conflicts or (less than amiable) agendas, it is possible that members will be separated from the group. This happens most often due to personality clashes, but can also be due to a vendor trying to hijack the standard to promote its business or product. Vendors are encouraged to participate, as they often have highly experienced subject matter experts, and it is in their best interests to ensure that industry standards are robust and viable. However, standards must be vendor-neutral in every way.

Industry standards are not perfect and there are specific reasons for why they may fall short of expectations. But the process of creating an industry standard is actually pretty robust. I hope that you will consider joining our effort to create an industry standard for DevOps!

Bob Aiello

IEEE SRE Standards Working group

Are you interested in being a part of the future of Site Reliability Engineering (SRE)? Look no further than the IEEE P2675 DevOps working group! We are beginning work on an SRE standard and are seeking individuals to join this exciting project. 

The working group will review industry definitions and views on what SRE means, and especially how it is implemented in highly regulated industries including banking, finance, medical, pharmaceutical, engineering, automotive, aerospace, and defense. We will consider what SRE really has to do with reliability, availability, and related terms, and how an SRE standard would align with other industry standards, including the ISO/IEC/IEEE 32675 DevOps standard and the IEEE 828 Configuration Management standard, along with many other IEEE/ISO/EIA industry standards.

Should SRE really refer to Systems/Site/Services Reliability Engineering?

SRE thought leaders emphasize the importance of automating tasks often performed manually by operations engineers, and we know that SRE engineers spend a considerable amount of time writing code to identify and address issues before they impact end users. 

If you are excited about the SRE topic and would like to join the award-winning IEEE P2675 DevOps working group, please contact Bob Aiello directly. Let’s shape the future of SRE together!

Personality Matters—CM Excellence

by Leslie Sachs

Excellence in Configuration Management conjures up images of efficient processes with fully automated procedures to build, package, and deploy applications, resulting in happy users enjoying new features on a regular basis. The fact is that continuous integration servers and great tools alone do not make CM excellence. The most important resources are the technology professionals on our team and the leader who helps guide the team towards excellence. That doesn’t mean that everyone is (or can be) a top performer, even if you are blessed with working on a high-performance cross-functional team. What it does mean is that understanding the factors that lead to excellence will enable you to assess and improve your own performance. A good place to start is to consider the traits commonly associated with effective leadership and successful results. This article will help you understand what psychologists have learned regarding some of the essential qualities found among top leaders and others who consistently achieve excellence!

Software Process Improvement (SPI) is all about identifying potential areas of improvement. Achieving excellence depends upon your ability to identify and improve upon your own behavior and effectiveness. It is well known that we are each born with specific personality traits and innate dispositional tendencies. However, it is an equally well-established fact that we can significantly modify this endowment if we understand our own natural tendencies and then develop complementary behaviors that will help us achieve success!

Let’s start by considering some of the personality traits that help predict effective leadership. One of the first studies on effective leadership was conducted by psychologist Ralph M. Stogdill [1]. This early study identified a group of traits including intelligence, alertness, insight, responsibility, initiative, persistence, self-confidence, and sociability.
It is certainly not surprising that these specific traits are valuable for successful leaders and achieving excellence. Being intelligent speaks for itself, as does being alert (for new opportunities) and having insight into the deeper meaning of each situation and opportunity. Although general intelligence was for a long time considered static, more recent research suggests that it is possible to bolster one’s genetic inheritance. Certainly, one can consciously strive to develop the behavioral patterns closely associated with intelligence, such as attentiveness to detail and novelty and thoughtful analysis of options. You might want to ask yourself whether or not you are responsible, take initiative, and show persistence when faced with difficult challenges. Displaying self-confidence and operating amiably within a social structure is essential as well. Do you appreciate why these characteristics are essential for leadership and achieving success, and look for opportunities to incorporate these qualities into your personality? To improve your leadership profile, you must also actively demonstrate that you know how to apply these valuable traits to solve real workplace dilemmas.

Upon reflection, you can certainly see why CM excellence would come from CM professionals who are intelligent, alert, and insightful. Being responsible, showing initiative, and being persistent, along with having self-confidence and being a social being, are all clearly desirable personality traits which lead to behaviors that result in CM excellence. Stogdill conducted a second survey in which he identified ten traits which are essential for effective leadership. This expanded cluster includes drive for responsibility, persistence in pursuit of goals, risk-taking and problem-solving capabilities, self-confidence, and a drive for taking initiative.
In addition to these, Stogdill also discovered that people’s ability to manage stress (such as frustration and delay), as well as their accountability for the consequences of their actions, are both integral to leadership success. After all, intelligence, insight, and sharp analytic skills are not very useful if a manager is too stressed out to prioritize efficiently or authorize appropriate team members to implement essential programs. Finally, you need to be able to influence other people’s behavior and to handle social interactions. [2]

Other noted psychologists have also studied leadership traits, and many have identified similar ones. Jurgen Appelo lists 50 team virtues [3] which, not surprisingly, also correspond to many of the traits identified in Stogdill’s studies, and I discussed many of these same traits in the book that I coauthored with Bob Aiello [4]. You need to consider each of these traits and understand why they are essential to leadership and achieving success. Keep in mind that while people are born with natural tendencies, each of us is capable of stretching beyond them if we understand our own personality and identify which behaviors are most likely to lead to the changes we desire. So if you want to achieve greater success, consider reflecting upon your own behaviors and comparing your style with those traits that have been associated repeatedly with good leaders and CM excellence. For example, being proactive in solving problems and having the self-confidence to take appropriate risks can help you achieve success. Remember also that being social means that you involve and interact with your entire team: full team participation maximizes the power of each member’s strengths while minimizing the impact of individual weaknesses. Configuration Management excellence depends upon the skilled resources who handle the complex IT demands on a day-to-day basis.
The most successful professionals are able to take stock of their personality and consider the traits that experts regard as essential for effective leadership. If you can develop this self-awareness, you can achieve success by developing the behaviors that result in strong leadership and excellence in all you do!

References:
[1] Yukl, Gary. Leadership in Organizations. Prentice Hall, 1981, p. 237.
[2] Northouse, Peter G. Introduction to Leadership: Concepts and Practice, Second Edition. SAGE Publications, Inc., 2012, p. 17.
[3] Appelo, Jurgen. Management 3.0: Leading Agile Developers, Developing Agile Leaders. Addison-Wesley, 2011, p. 93.
[4] Aiello, Robert and Leslie Sachs. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley, 2010.

Behaviorally Speaking—CM, ALM & DevOps Strategies

by Bob Aiello

Configuration Management (CM), Application Lifecycle Management (ALM), and DevOps are not easy to implement. In our consulting practice, we develop and implement strategies to support CM, ALM, and DevOps in many organizations, and the truth is that we are not always satisfied with the results (and sometimes neither is the management who brought us in). We have also been very successful, largely because we came up with a strategy that made sense for the organization where we were implementing these practices. Coming up with the right approach is not always easy, and we’ve learned a few lessons along the way that we’d like to share with you in this article.

What is CM?

The definition of CM is a topic that has triggered many an enthusiastic debate in the old CM Crossroads forums and groups. (Make sure that you jump in and participate right here by registering for an account!) Traditional CM experts will typically answer that CM is:

  • Configuration Identification
  • Status Accounting
  • Change Control
  • Configuration Audit

This is certainly true, but it can be very difficult to come up with a strategy to implement these practices in a large multi-platform organization. I have presented my own framework for understanding CM in terms of six core functions that I believe more closely represent the way in which CM is actually practiced on a day-to-day basis [1]. The six functions are:

  • Source Code Management
  • Build Engineering
  • Environment Management
  • Change Management
  • Release Management
  • Deployment

We have many years of experience implementing each of these six functions in large enterprise-wide environments. The first thing that I always focus on is making CM compelling. You need to start by demonstrating how CM can help your organization (especially your development team) create the software (or what CM gurus call configuration items).
Don’t expect everyone to just automatically accept (and believe in) the benefits of good CM. But over time, we have seen many folks come to the conclusion that CM practices make sense. Application Lifecycle Management (ALM) may be a little harder to fully grasp. The key to understanding ALM is to first understand the classic CM function called Status Accounting.

Status Accounting

Status Accounting involves following the evolution of a configuration item throughout its lifecycle. The terminology is a little confusing; you are certainly not accounting in the sense of counting rows of numbers. Instead, you are tracking the status of the configuration items that are being created during the development lifecycle. This sounds great, but then how do you go about doing status accounting? On a practical basis, this is exactly what ALM is all about.

Application Lifecycle Management (ALM)

ALM is essentially the software or systems lifecycle used to create each of your configuration items. This means that CM is (and always was) a full lifecycle endeavor. So another aspect of your strategy has to be to realize that CM and ALM are focused on the entire lifecycle. What, then, makes ALM different from the CM function of Status Accounting? In practice, ALM has a very wide scope, from tracking requirements to design, test planning, development, all the way to deployment (and even tracking and retiring configuration items that should no longer be in use). Obviously, CM touches many of the same points, as requirements, test cases, and design documents all need to be under version control as well. ALM also places a strong focus on tools and tools integration.

ALM Tools

Recently, a colleague of mine repeated the common view that the process is much more important than the tools. I used to believe this, but ALM has taught me that tools do matter a great deal, and your strategy needs to include a robust tools selection process (e.g., a bake-off).
ALM’s wide focus [2] means that you need to have the right tools (and process) in place to support every aspect of your software and systems development effort. In practice, this has meant implementing requirements tracking tools with integration to test case management systems. There are two essential reasons for this. The first is that requirements should map to test cases (you want to verify and validate your requirements, don’t you?), and the second is that incomplete requirements can be supplemented by well-documented test cases. ALM tools need to focus on integrations to provide a complete and robust full lifecycle solution. One benefit of this approach is enhanced IT Governance and Compliance.

IT Governance and Compliance

IT Governance is all about providing the essential information that management needs to make the right decisions. These practices are an essential ingredient in ensuring that management can make the best decisions, with input of information that is accurate and relevant. There are a number of organizations that provide information on implementing IT Governance, including ISACA. Your strategy should include using industry standards and frameworks for guidance. This leads directly to Compliance, which usually refers to complying with regulatory requirements. CM and ALM are essential for supporting both IT Governance and Compliance.

Agile and Lean for CM/ALM Strategy

One of the best strategies that I have implemented was using Agile and Lean practices to iteratively develop CM and ALM processes. For example, one ALM solution that I implemented recently has a complex workflow automation template that can be tailored to the individual needs of the team and project. There was no viable choice but to approach this effort in an iterative way. So I actually set up a separate project in the tool to track changes to the workflow automation template itself. I used the ALM tool to develop and implement the ALM tool!
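As a small illustration of the requirement-to-test-case mapping discussed in this article, a basic coverage check reduces to a set comparison. The IDs and data shapes below are invented for illustration, not taken from any particular ALM tool:

```python
def uncovered_requirements(requirements, test_cases):
    """Return the requirement IDs that no test case traces back to.

    requirements: list of requirement IDs (e.g. "REQ-1")
    test_cases: list of dicts, each with a "covers" list of requirement IDs
    """
    covered = {req for tc in test_cases for req in tc["covers"]}
    return sorted(r for r in requirements if r not in covered)
```

For example, `uncovered_requirements(["REQ-1", "REQ-2"], [{"id": "TC-10", "covers": ["REQ-1"]}])` would flag REQ-2 as lacking a verifying test case.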
Make sure that you realize that implementing CM and ALM is an effort that requires industry best practices in alignment with your organization and culture. DevOps consists of principles and practices that help improve communication and collaboration between teams that all too often have very different goals and objectives. While the principles are consistent across all projects, the practices may indeed vary from one situation to another. Read more about DevOps here.

Conclusion

CM, ALM, and DevOps are essential for the success of any software or systems development effort. There are many lessons learned and effective strategies for success. Obviously, there are also risks that need to be addressed. The best strategy that I have found is to use the same principles that work for your development effort to guide and manage the implementation of your CM and ALM functions. This is an excellent strategy that will help facilitate your success and the success of all of your efforts. Make sure that you drop me a line and share your strategies for CM, ALM, and DevOps!

References:
[1] Aiello, Robert and Leslie Sachs. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley, 2010.
[2] Aiello, Bob and Leslie Sachs. Agile Application Lifecycle Management: Using DevOps to Drive Process Improvement. Addison-Wesley Professional, 2016.

Behaviorally Speaking – CM and Traceability

by Bob Aiello

Software and systems development can often be a very complex endeavor, so it is no wonder that sometimes important details get lost in the process. My own work involves implementing CM tools and processes across many technology platforms including mainframes, Linux/Unix, and Windows. I may be implementing an enterprise Application Lifecycle Management (ALM) solution one day and supporting an open source version control system (VCS) the next. It can be difficult to remember all of the details of the implementation, and yet that is precisely what I need to do. The only way to ensure that I don’t lose track of my own changes is to use the very same tools and processes that I teach developers for my own implementation work, thereby ensuring history and traceability of everything that I do. I have known lots of very smart developers who made mistakes due to forgetting details that should have been documented and stored for future reference. It often seems like developers are great at running up the mountain the first time, but it takes process and repeatable procedures to ensure that each and every member of the team can run up the same mountain without tripping. This article will discuss how to implement CM and traceability in a practical and realistic way.

The most basic form of traceability is establishing baselines to record a specific milestone in time. For example, when you are checking changes into a version control tool, there is always a point at which you believe that all of the changes are complete. To record this baseline, you should label or tag the version control repository at that point in time. This is basic version control and essential in order to be able to rebuild a release at a specific point in time (usually when the code was released to QA for testing). But how do you maintain traceability when the code has been deployed and is no longer in the version control solution?
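The baseline idea can be made concrete with a small sketch. For source code, your version control tool's label or tag feature records the baseline for you; the snippet below (an illustrative Python sketch, not tied to any particular tool) shows the equivalent idea of capturing a manifest of content digests for a set of release artifacts:

```python
import hashlib
from pathlib import Path

def baseline_manifest(root):
    """Record a baseline: map each file's path (relative to root)
    to the SHA-256 digest of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Storing a manifest like this alongside the repository tag gives you something concrete to audit the deployed artifacts against later.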
In fact, you also need to maintain baselines in the production runtime area and ensure that you can verify that the right code has been deployed. You must also ensure that unauthorized changes have not occurred, whether through malicious intent or just an honest mistake. Maintaining a baseline in a runtime environment takes a little more effort than just labeling the source code in a version control repository, because you need to actually verify that the correct binaries and other runtime dependencies have been deployed and have not been modified without authorization. It is also sometimes necessary to find the exact version of the source code that was used to build the release that is running in production, in order to make a support change such as an emergency bug fix.

Best practices in version control and deployment engineering are very important, but there is also more to traceability than just labeling source code and tracking binaries. When software is being developed, it is important to develop the requirements with enough detail so that the developers are able to understand the exact functionality that needs to be developed. Requirements change frequently, and it is essential that you can track and version control the requirements themselves. In many industries, such as medical and other mission-critical systems, there is often a regulatory requirement to ensure that all requirements have been reviewed and were included in the release. If you were developing a life support system, then obviously requirements tracking could be a matter of life or death. Dropping an essential requirement for a high-speed trading firm can also result in serious consequences, and it is just not feasible to rely upon testing alone to ensure that all requirements have been met. As Deming noted many years ago, quality has to be built in from the beginning [1].
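The runtime verification described above amounts to comparing a recorded baseline with what is actually observed in the deployed environment. Here is a minimal sketch, assuming (purely for illustration) that both the baseline and the observation are represented as {path: digest} dictionaries:

```python
def audit_against_baseline(baseline, observed):
    """Compare a recorded baseline {path: digest} with what is actually
    deployed; report modified, missing, and unauthorized files."""
    return {
        "modified": sorted(p for p in baseline
                           if p in observed and observed[p] != baseline[p]),
        "missing": sorted(p for p in baseline if p not in observed),
        "unexpected": sorted(p for p in observed if p not in baseline),
    }
```

An empty report means the runtime area still matches its baseline; anything in "modified" or "unexpected" is a candidate unauthorized change that needs investigation.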
There are also times when not all requirements can be included in the release, and choices have to be made, often based upon risk and the return on investment of including a specific feature. This is when it is often essential to know who requested the specific requirement and who has the authority to decide which requirements will be approved and delivered. Traceability is essential in these circumstances.

Robust version control solutions allow you to automatically track sets of changes, known as changesets, to the workitems that described and authorized the change. Tracking workitems to changesets is known by some authors as task-based development [2]. In task-based development, you define the workitems up front and then assign them to resources to complete the work. Workitems may be defects, tasks, requirements or, for agile enthusiasts, epics and stories. Some tools allow you to specify linkages between workitems, such as a parent-child relationship. This is very handy when you have a defect come in from the help desk that results in other workitems, such as tasks and even test cases, to ensure that the problem does not happen again in the next release. Traceability helps to document these relationships and also links the workitems to the changesets themselves.

Establishing traceability does not really take much more time, and it helps to organize and implement iterative development. In fact, it is much easier to plan and implement agile scrum-based development if your version control tool implements task-based development, with the added benefit of providing traceability. Traceability can help you and your team manage the entire CM process by organizing and tracking the essential details that must be completed in order to successfully deliver the features that your customers want to see. Picking the right tools and processes can help you implement effective CM and maintain much-needed traceability.
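The changeset-to-workitem linkage described above is normally maintained by the version control or ALM tool itself. As a rough illustration only, this sketch recovers the mapping from commit messages, assuming a made-up convention in which messages reference workitem IDs such as WI-42:

```python
import re

# Hypothetical workitem ID convention, e.g. "WI-42"; real tools each
# have their own linking mechanism.
WORKITEM_ID = re.compile(r"\bWI-\d+\b")

def link_changesets(commits):
    """Map each workitem ID to the changesets whose messages mention it."""
    links = {}
    for commit in commits:
        for wi in WORKITEM_ID.findall(commit["message"]):
            links.setdefault(wi, []).append(commit["id"])
    return links
```

With such a mapping in hand you can answer both directions of the traceability question: which changesets implemented a given workitem, and (by inverting the map) which workitems a given changeset touched.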
It can also help you develop software that has better quality while meeting those challenging deadlines, which often change due to unforeseen circumstances. Use traceability effectively to accelerate your agile development!

References:
[1] Deming, W. Edwards. Out of the Crisis. MIT Press, 1986.
[2] Hüttermann, Michael. Agile ALM: Lightweight Tools and Agile Strategies. Manning Publications, 2011.
[3] Aiello, Bob and Leslie Sachs. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley Professional, 2011.
[4] Aiello, Bob and Leslie Sachs. Agile Application Lifecycle Management: Using DevOps to Drive Process Improvement. Addison-Wesley Professional, 2016.

Call for Articles!

Hi Everyone! I am excited to invite you to get involved with the Agile ALM Journal by contributing your own articles on Agile ALM and DevOps, along with all aspects of software and systems development. The Agile ALM Journal provides guidance on Application Lifecycle Management, which means that we have a strong focus on DevOps, Configuration Management, and software methodology throughout the entire application lifecycle. Articles are typically 900-1200 words and should explain how to do some aspect of software methodology. Contact me directly to get involved with submitting your articles, and I will help you with getting started, forming your ideas, and editing your article for publication. Common topics include:
  • Software development approaches including agile
  • DevOps throughout the entire software process
  • Configuration Management (including the CMDB)
  • Build and Release Engineering
  • Source Code Management including branching and streams
  • Deployment Engineering (DevOps)
  • Continuous Testing
  • Development in the Cloud
  • Continuous Integration and Deployment
  • Environment Management
  • Change Management
and much more!

Bob Aiello
Editor

So How Are Industry Standards Created?

So How Are Industry Standards Created?
By Bob Aiello

Industry standards and frameworks provide the structure and guidance to help ensure that your processes and procedures meet the requirements for audit and regulatory compliance. For US-based firms, this may involve passing your SOX audit (for compliance with section 404 of the Sarbanes-Oxley Act of 2002) or acquiring the highly respected ISO 9000 Quality Management System certification expected by many customers throughout the world.

Industry standards are not perfect, and some of the specific reasons why they may fall short of expectations can be traced back to how they were initially created. The process of creating an industry standard is actually quite deliberate and time-consuming. There are some excellent resources from the IEEE and other standards bodies which describe the process to draft and implement standards. But I would like to describe some of my own personal experiences participating in the collaboration and teamwork of creating an industry standard. Working closely with other colleagues who are dedicated to excellence has been far and away the most exciting professional experience that I have been fortunate to have.

Please note that this is not an official IEEE article, but rather Bob’s recounting of a personal experience being involved with creating industry standards.

The first step is always to decide on the initial scope and focus of the standard. We then review any existing resources available, including related industry standards and frameworks, or simply documents which can help educate the members of the team involved with this effort. The standards working group is a high-performance, self-managing, self-educating, cross-functional team with subject matter experts from a variety of disciplines and perspectives. We do not always agree with each other and, in fact, the discussions can be quite confrontational at times, although always professional and collaborative. These disagreements are a natural expression of the group’s striving to come up with the best approach to advocate in the text of the standard.

We create an initial outline and list of topics to consider and then address the task of creating a working draft. The focus is on “shall” statements which are mandatory (for compliance with the standard) and “should” statements which are recommended. We hold numerous sessions to collaboratively create the initial draft. It is common to assign specific sections to individual members (or subgroups) who then go off and independently create the initial draft wording.

Once the draft is written, it is sent to a few SMEs outside of the working group for their reaction and comment. Once this feedback is evaluated and incorporated, the draft is sent out to a wider group for review and comment and, once again, feedback is incorporated. The objective is to have validation that it is a solid document before it is put out for a vote.

Above all else, the standards creation process is collaborative and transparent. Typically, contributors’ comments are recorded and the reason for their acceptance or rejection documented. We have a strong desire to ensure that the draft standard is aligned with other industry standards and frameworks, and do our utmost to harmonize with the current guidance provided by other sources. Final decisions are made and sometimes folks are not happy, but they know that their views are always heard and, most often, recorded for traceability. It is customary for a standard to require a significant percentage of voter approvals for passage and acceptance by the standards body. On occasion, controversial paragraphs have to be dropped in order to obtain the required votes for approval, similar to the negotiations, aka “horse-trading”, for which politics is known. Although such modifications felt to me personally like we were watering down the standard just to gain the required consensus, the team’s focus and mission is to produce a clear document that will be both respected and adopted.

Over the years, I have written extensively on how to comply with configuration management related standards, including the highly popular IEEE 828 (which I had the privilege to participate in updating). Lots of folks like to criticize standards, but often they are criticizing a document that they have never actually spent the time to read – let alone understand or see implemented effectively.

It has been my personal experience that implementation of a standard requires two key skills. The first is harmonizing the guidance by understanding similar industry standards and frameworks. The second is tailoring, in which we provide a rationale for why specific guidance cannot be followed, if this is in fact necessary.

Here’s your opportunity! We are starting up an effort to create a working group to write an industry standard for DevOps. Please consider getting involved now to help shape the guidance that we provide. Rest assured that I will continue writing about this exciting project in the coming months and your voice is important to us!

Bob Aiello

The IEEE P2675 Standard for DevOps: Building Reliable and Secure Systems Including Application Build, Package and Deployment

IEEE P2675 – IEEE Standard for DevOps: Building Reliable and Secure Systems Including Application Build, Package and Deployment is now available for purchase from the IEEE.

This 95-page document specifies technical principles and practices to build, package and deploy systems and applications in a reliable and secure way. The standard focuses on establishing effective compliance and IT controls. It presents principles of DevOps including mission first, customer focus, left-shift, continuous everything, and systems thinking. It describes how stakeholders including developers and operations staff can collaborate and communicate effectively. Its process outcomes and activities are aligned with the process model specified in ISO/IEC/IEEE 12207, Systems and software engineering – Software life cycle processes, and in ISO/IEC/IEEE 15288, Systems and software engineering – System life cycle processes.

CM Best Practices: Practical Methods that Work in the Real World

Based upon Bob Aiello’s book on CM Best Practices, this video covers the core CM Best Practices of source code management, build engineering, change control, environment management, release engineering and deployment automation. Accelerate your software development with CM Best Practices!

Continuous Integration – It’s Not Just for Developers!

Continuous Integration – it’s not just for Developers!
By Bob Aiello

Recently, I was having a conversation with a group of colleagues who are some of the smartest people with whom I have had the privilege of working in the discipline of DevOps. This team of amazing people is collaborating to write the IEEE P2675 DevOps Standard for Building Reliable and Secure Systems Including Application Build, Package and Deployment. We have over one hundred IT experts signed up to help with this project and about thirty who are actively engaged each week with drafting the standard, which will then be ready for review in the next month or two. We are not defining DevOps per se, as lots of folks who like to attend conferences have done a fine job of doing that already. What we are doing is defining how to do DevOps (and to some extent Agile) in organizations that must utilize industry standards and frameworks to adhere to regulatory requirements that are commonly found in industries such as banking, finance, medical, pharmaceutical and defense. I usually describe this effort as explaining how to do DevOps and still pass an audit.

One of the topics that keeps coming up in our discussions is whether or not our work is too focused on development and not enough on operations. Recently, one of my colleagues suggested that continuous integration was an example of a key DevOps practice that was entirely focused on development. I disagree completely with this point of view, and this article is the first of a series explaining how continuous integration, delivery and deployment are fundamental to the work done by any operations team. DevOps is a set of principles and practices which is intended to help development and operations collaborate and communicate more effectively. We have written articles previously (and in our Agile ALM DevOps book) on DevOps principles and practices.
The most fundamental concern for the DevOps practitioner is to understand that different stakeholders hold diverse views and often bring to the table varying types of expertise. DevOps is all about ensuring that we all share our knowledge and help each team perform more effectively. In organizations which prioritize DevOps excellence, teams are constantly sharing their screens and sharing their knowledge and expertise. Ops is a key player in this effort and often “the adult in the room” who understands how the application is actually going to perform in a production environment with a production workload.

In continuous integration (CI), developers are constantly integrating their code (often by merging source code and components), ensuring that the code that they write can work together and that a bad commit does not break the build. CI is a seminal process for any DevOps function, and very few folks would think that you are doing DevOps if you did not implement continuous integration. To demonstrate that Ops does indeed do continuous integration, I could use the low-hanging fruit of describing how infrastructure as code (IaC) is written using Terraform or AWS CloudFormation and, no doubt, such efforts are a valid example of systems engineers who need to integrate their code continuously. But there are more dramatic examples of Ops using continuous integration which present greater challenges. For example, operations engineers often have to manage patching servers at the operating system, middleware and application levels. Hardware changes may also have to be integrated into the systems, which can have far-reaching impact, including managing changes to storage and networking infrastructure. Configuration changes, from firewalls to communication protocols, also have to be managed – and continuously integrated. These changes come from engineers in different groups, often working in silos, and are every bit as complicated to integrate as merging code between developers.
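To make the Ops-side CI idea concrete, here is a minimal sketch of a pipeline check that compares installed package versions on a server against a version-controlled baseline and fails on drift, just as a bad commit fails an application build. The package names and version numbers are purely illustrative assumptions, not taken from the article.

```python
# Hypothetical Ops CI check: any drift between a server's installed
# packages and the baseline manifest (kept in source control alongside
# your IaC code) fails the "build".

BASELINE = {
    "openssl": "3.0.13",
    "nginx": "1.24.0",
    "openjdk": "17.0.10",
}

def check_drift(installed: dict) -> list:
    """Compare an installed-package snapshot against BASELINE and
    return a list of human-readable drift findings (empty = clean)."""
    findings = []
    for pkg, expected in BASELINE.items():
        actual = installed.get(pkg)
        if actual is None:
            findings.append(f"{pkg}: missing (expected {expected})")
        elif actual != expected:
            findings.append(f"{pkg}: {actual} != baseline {expected}")
    return findings
```

In practice the snapshot would come from the server's package manager; the point is that the comparison runs continuously, on every change, rather than once a quarter.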
The work also has to be done using a full systems lifecycle, which is often sadly overlooked. Let’s take a look at how we might implement a systems lifecycle in support of continuous integration for Ops. We have been in many conversations where we are told that patches must be applied to a server to ensure compliance with security and regulatory requirements. We don’t doubt the importance of such efforts, but we often see that there is a lack of discipline and due diligence in how patches are deployed to production servers.

The first step should always be to obtain a description of the patches (e.g. release notes) that are going to be applied and have them reviewed by the folks who are able to understand the potential downstream impact of the required patches. Sometimes, patches require changes to the code which may necessitate a full development lifecycle. These changes always require due diligence, including both functional and non-functional testing. Knowing what to test should start with understanding what the patch is actually changing. Too often, the folks who downloaded the code for the patch neglect to also download and circulate the release notes describing what the patch is actually changing. (Trust me, it’s right there on the website, right next to where you downloaded the patch.) Once you have ensured that the release notes for the patch have been reviewed by the right stakeholders (often including both developers and operations systems engineers), then you are in a much better position to work with your QA and testing experts on an appropriate strategy to test those patches (don’t forget load testing). Obviously, you want to promote a patch through a systems lifecycle just like you would promote any piece of code, so start with patching the Dev servers and work your way up to UAT, and then schedule the change window for updating the production servers.
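The promotion flow described above can be sketched as a simple gate check. The stage names here are illustrative assumptions; your environments may differ.

```python
# Illustrative sketch: a patch moves through the environments in order,
# and a failed (or missing) test gate stops promotion, so the patch
# never reaches production untested.

STAGES = ["dev", "qa", "uat", "prod"]

def promote(gate_results: dict) -> list:
    """Apply a patch stage by stage. gate_results maps a stage name to
    True if that stage's functional and non-functional tests passed.
    Returns the list of stages that were actually patched."""
    patched = []
    for stage in STAGES:
        patched.append(stage)          # patch this environment
        if not gate_results.get(stage, False):
            break                      # gate failed: stop the rollout
    return patched
```

Note the deliberate asymmetry: production is only reached when every earlier gate, including UAT, has explicitly passed.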
Keep in mind that these patches often come at different levels of the system, including the operating system (perhaps low level), middleware and then application frameworks. You probably need to consider related configuration changes that are required, and you may need to coordinate these changes with upgrading your storage or networking system. You’ll want to take an iterative approach and continuously integrate these changes – just as if you were integrating application changes. Remember, in DevOps we take a wide systems view and we also ensure that all of the smart people around us are well informed and have an opportunity to share their knowledge and expertise. What other examples of Ops practicing continuous integration can you think of? Drop me a line and share your views!

What is ISO Certification Who Needs It and Why

What is ISO Certification, Who Needs it & Why
by Ken Lynch

The main purpose of ISO certification is to offer potential clients an independent assessment of a company’s conformity. In recent years, there has been increased use of technology in business, and potential clients are concerned with the issue of data security. ISO certification is, therefore, a way of quelling the fears of potential investors. Security professionals are aware that compliance does not automatically go hand in hand with security. Compliance does, however, give future customers a way to verify that business controls are in place to ensure the clients’ needs are met.

What is ISO?

In 1946, delegates from twenty-five different countries met at the Institute of Civil Engineers in London. These delegates created the International Organization for Standardization (ISO), tasked with forming and unifying industrial standards.

Different types of ISO certification

ISO standards influence the workings of different industries. For many IT companies, meeting ISO standards is a way of meeting the regulations set out by this organization. In the IT industry, there exist three types of standards that assist an organization in compliance, namely ISO 27001, ISO 31000 and ISO 9001.

ISO 27001 standard

This standard sets out the requirements for an information security management system (ISMS). For organizations looking to meet ISO certification, creating an information security management system is necessary. This standard is concerned with ensuring the security, reliability, and availability of information as part of risk management. As a result, it is concerned with assuring consumers. For certification of this standard, there are two stages. The first stage involves a collection of documents by auditors to ensure that a firm’s ISMS is ready for review.
The documentation collected by auditors includes a company’s ISMS scope, data security procedure, risk identification and response process, risk review report, company assets, company policies and compliance requirements.

ISO 31000 standard

This standard outlines the requirements for enterprise risk management (ERM). The risk control process requires that senior management and the board assess the impact and likelihood of risk occurrence in order to determine the proper controls to manage risks. When assessing a company’s ERM for certification, auditors look at documents that detail management’s approach to risk identification and mitigation.

ISO 9001 standard

This standard spells out the requirements for a quality management system (QMS). A QMS details the techniques and responsibilities involved in quality control. ISO 9001 mainly applies to industries that need quality controls. However, it can also offer a new direction for compliance. Audits in this standard review product, process, and system. The documentation collected by auditors covers both mandatory and non-mandatory information. Mandatory documents include document control techniques, internal audit methodology, corrective and preventative action policies and control of non-conformance procedures. Certification of this standard can be overwhelming for many companies.

Why is ISO certification necessary?

There is a difference between ISO conformity and ISO certification. ISO conformity means that an organization complies with ISO standards. Any company, for instance, carrying out audits internally can implement ISO conformity as part of business operations. ISO certification offers customers assurance about quality control and data management. A certified company is one that conforms to ISO standards. Certification also assures outsiders that a company meets requirements established by a group of experts.
Due to the many standards ISO establishes, there is a need for companies to be direct in stating which ISO standard they meet. In addition, ISO certification enables companies to use the opinion of an autonomous third party as evidence of compliance.

What does ISO accredited mean?

ISO establishes standards but does not issue certificates or take part in the certification process. The Committee on Conformity Assessment (CASCO) determines the standards used for certification, which are in turn used by certification organizations. CASCO, therefore, establishes standards that third parties must use to determine whether a company meets ISO standards. ISO accreditation is different from ISO certification. ISO certification happens after an organization’s policies, techniques and documents are reviewed by an independent third party. When choosing a certification body, an organization should ensure that the third party employs CASCO standards and is accredited. However, companies should not assume that non-accredited third parties are incapable of reviewing their company. Accreditation refers to autonomous capability confirmation. In simple terms, accredited bodies are those that have been reviewed independently to ensure they meet CASCO standards. This ensures that accredited bodies can properly review other organizations to determine whether they meet ISO standards.

How automating GRC can ease the burden of ISO certification

The process of managing a company to ensure compliance can confuse managers. Using an automated solution allows a company to determine its controls and conduct a gap analysis so it can manage its workload better.

Author Bio

Ken Lynch is an enterprise software startup veteran who has always been fascinated by what drives workers to work and how to make work more engaging. Ken founded Reciprocity to pursue just that. He has propelled Reciprocity’s success with this mission-based goal of engaging employees with the governance, risk, and compliance goals of their company in order to create more socially minded corporate citizens. Ken earned his BS in Computer Science and Electrical Engineering from MIT.

How to Assess and Improve DevOps

How to Assess and Improve DevOps
By Bob Aiello and Dovid Aiello

Many of my esteemed colleagues are sharing their approaches to implementing continuous delivery and other DevOps best practices. Their approach often assumes that the organization is willing to completely discard their existing processes and procedures and start from what we often refer to as a “green field”. Internet startups can afford to take this approach. But, if you are implementing DevOps in a large financial services firm or a growing ecommerce retailer, you may not be able to stop the train while you lay down new tracks and switches. In my work, I have always had to help teams improve their CM, ALM & DevOps best practices while simultaneously supporting the existing production operations – introducing new processes without disrupting the ongoing delivery of new releases and patches that support the business. If you would like to succeed in the real world, then you need to be able to assess your existing practices and come up with a strategy for improving while you still support a vibrant ongoing business. This article explains how to conduct the DevOps assessment.

Too often, DevOps practitioners begin the journey by administering a questionnaire that seeks to assess an organization’s DevOps process maturity. The development managers who typically provide the answers usually become quite adept at gaming the questionnaire. For all appearances, they are doing just great at managing their source code, automating builds and delivering releases to production on a fairly rapid basis. However, if you speak to the folks doing the actual work, you may get a very different impression. My assessments start with a pretty robust dance card that includes everyone from the CTO to the engineers who do the actual work. If you are going to implement DevOps, make sure that your dance card includes both developers and the operations engineers.
You will also want to speak with the data security engineers, QA testers and, most importantly, the end users or their representatives. I usually start by asking what is going well and what could be improved. If I can get the assessment participants to trust me, then I can usually identify which practices really need to be improved and which ones should not be changed – at least not in the beginning. It is important to start by respecting the fact that the organization has evolved and adapted organically, establishing many practices and procedures along the way. You are there to identify what should be improved, but make sure that you start by understanding that the existing approaches may very well be there for good reasons. I believe that you need to respect the journey that the organization has taken to get to this point; then you can focus on what should be improved.

I usually interview participants individually or in small groups, and I ask what is going well and what could be improved. I probe with an ear to identifying their understanding and application of the six core configuration management best practices of source code management, build engineering, environment management, change control, release management and deployment engineering. I then try to help the team determine specific initiatives that should be addressed on their journey to implementing DevOps. My report synthesizes common themes and helps to highlight which items are more urgent than others. It is essential to create an environment where everyone is comfortable being open and truthful, so I never report on what one particular person has stated – instead, my results group items along the six core CM best practices. One practical consideration is to pick one or two items that can be achieved easily and quickly. Don’t tackle the tough ones first. After you have achieved some small successes, addressing the bigger challenges will become a lot more practical.
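One way to keep the report anonymous while still surfacing themes is to roll every interview note up under the six core CM practices. A minimal sketch, with invented sample notes:

```python
# Sketch: group anonymized interview notes under the six core CM best
# practices so no finding is traceable to one individual.

PRACTICES = [
    "source code management", "build engineering", "environment management",
    "change control", "release management", "deployment engineering",
]

def summarize(findings):
    """findings: iterable of (practice, note) pairs from interviews.
    Returns a dict mapping each core practice to its list of notes."""
    report = {p: [] for p in PRACTICES}
    for practice, note in findings:
        report.setdefault(practice, []).append(note)
    return report
```

Practices with long note lists are candidates for the roadmap; empty lists suggest areas that are working well or were not discussed.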
Use the information from your assessment to help plan the roadmap for your own process improvement initiative. Some common initiatives that I often see are:
  • Need for training when adopting new technologies from Bitbucket to Ansible
  • Automation of common tasks that everyone has become accustomed to doing manually
  • Monitoring and addressing failed builds in the continuous integration server
  • Improving coverage for automated testing from unit through functional/regression and don’t forget non-functional testing (including performance)
  • Improving communication and collaboration by breaking down those silos (hint: get everyone to share a screen and show what they are working on)
  • Bringing in a vendor every now and then and making sure their demo is focused on teaching best practices (instead of just a sales pitch)
  • Making it cool to show off what your team is doing
Most of all, remember that change takes time, so take an agile iterative approach to process improvement. As always, view DevOps initiatives as a “team sport” and feel free to drop me a line and share your experiences!  

How to Fix Change Control – Understanding DevOps’ Secret Weapon

How to Fix Change Control – Understanding DevOps’ Secret Weapon
by Bob Aiello with Dovid Aiello

In many organizations, Change Control is badly broken. Very often, change control only focuses on the calendar and fails to realize its true potential of ensuring that changes can be delivered as frequently as necessary in a secure and reliable way. In our consulting practice, we hear many complaints of boring two-hour meetings where nothing seems to actually get done. Change control is often perceived as being little more than a rubber stamp and, as one esteemed colleague famously claimed publicly, the purpose of change control is to “prevent change”. We disagree and believe that change control can be a valuable function that helps identify and mitigate technical risk. That said, very few organizations have effective change control practices. Here’s how to fix change control in your company and realize the benefits of DevOps’ secret weapon.

We have previously written about the dysfunction that often resides in the operations organization. This dysfunction often involves change control. The poorly-managed change control function wastes time, but that is only the tip of the “dysfunctional” iceberg. Far more serious is the missed opportunity to identify and mitigate serious technical risks. This failure often results in incidents and problems – often at a cost to the company, both in terms of profitability and reputation. The bigger problem is the missed opportunity to roll out changes faster, enabling secure and reliable systems, not to mention delivering business functionality. When change control fails, most senior managers declare that they are going to stop allowing changes – which is actually the worst thing that you can decide to do. Slowing down changes almost always means that you are going to bunch together many changes and allow change windows less frequently, such as on a bimonthly basis.
When we help teams fix their change control, the first thing we push for is making more frequent changes, but keeping them very tiny. The cadence that we find works well is moving from bimonthly change windows to twice weekly – ideally during the week. There is something magical about moving from bimonthly to twice a week that often eliminates much of the noise and frustration.

One important approach is to identify which changes are routine and low-risk, categorizing them as “pre-approved” or standard changes. Changing a toner cartridge on a printer is a good example, as it is a task that has been done many times before. Communication that the printer will be down for this activity is important, but it does not require a discussion during the change control meeting. Standard changes should ideally be fully automated and, if you are using the ITIL framework, listed in your service catalogue [1]. Getting all of the easy changes pre-approved means that your change control meeting can now focus on the changes which actually require some analysis of technical risk. Normal changes follow your normal change control process. Emergency changes should only be used when there is a true emergency and not just someone trying to bypass the change control process. Occasionally, someone may miss the normal change control deadline and then you may need an “out-of-cycle” change that would have been a normal change had the person made the deadline. One effective way to ensure that folks are not just using emergency changes to bypass the change control process is to require that all emergency changes be reviewed by your highest-ranking IT manager – such as the CTO.

Another effective approach is to distinguish between the change control board (CCB) and the change advisory board (CAB). Frankly, this has been an area of confusion for many organizations. The CCB is responsible for the change control process.
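A triage helper for this three-way split might look like the sketch below. The catalogue entries are invented examples; in practice the pre-approved list lives in your service catalogue.

```python
# Sketch: route a change request into standard (pre-approved), emergency
# (requires senior sign-off), or normal change control handling.

STANDARD_CATALOGUE = {"replace printer toner cartridge", "run nightly backup"}

def classify(change: str, is_emergency: bool = False) -> str:
    if change in STANDARD_CATALOGUE:
        return "standard"    # pre-approved: no CCB discussion needed
    if is_emergency:
        return "emergency"   # must be reviewed by the CTO (or equivalent)
    return "normal"          # follows the regular change control process
```

The order of the checks matters: a catalogued change is standard even during an incident, and the emergency path is deliberately narrow so it cannot be used to bypass the process.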
The change advisory board should be comprised of sharp subject matter experts who can tell you the potential impact of making (or not making) a particular change. Make sure that you ask them who else might be impacted and should be brought into the discussion. We have seen many organizations, unfortunately, rename their CCB to CAB (cool name though it is) and, in doing so, lose the input from the change advisory folks. Keep your CCB separate from your CAB. The CCB handles the process – while the CAB advises on the technical risk of making (or perhaps not making) a particular change.

In reviewing each change, make sure that the change is described clearly and in sufficient detail to understand each step. We see many change requests that are just high-level descriptions, which can be open to interpretation by the person making the changes and consequently result in human errors that lead to incidents and problems. Testing, as well as verification and validation (V&V), criteria should always be specified. By testing, we refer to continuous testing beginning with unit testing and extending into other forms of testing, including regression, functional/nonfunctional, and integration testing. (We are huge fans of API and service virtualization testing, but that is the subject of another article.) Verification usually refers to whether or not the change meets the requirements, and validation ensures that the system does what it needs to do. Some folks refer to fitness for warranty and fitness for use. If you want effective DevOps, you must have robust continuous testing, and the change control process is the right toll gate to ensure that testing has been implemented and, especially, automated. We’d be remiss if we did not mention the importance of asking about backout plans. In short, always have a plan B. Change control done well is indeed DevOps’ secret weapon.
Making changes more often should be your goal, and keeping those changes as tiny and isolated as possible will help to reduce the risk of making changes. We like to have everyone share a screen and have that DevOps cross-functional team ensure that every change is executed successfully. Every change should be automated. If this is not possible, then make sure that you have a 4-eyes policy where one person makes the change and another person observes and verifies that the manual step has been completed successfully. Always record the session – and allow others to see what you’re doing – and then review the recordings to identify areas where you can improve your deployment processes. The best organizations have processes which are transparent and allow others to learn and help continuously improve the deployment process. Change control can help you get to a place where you can safely make changes as often as you need to, helping to deliver secure and reliable systems. What change control practices do you believe are most effective? Drop us a line and share your best practices!

[1] The service catalogue is an automated set of jobs that perform routine “low risk” tasks such as taking backups and changing toner cartridges. Since the request is in the service catalogue, the change may be designated as being “standard” (pre-approved) and then there is no need to perform a risk assessment in change control.

DevOps for Hadoop

DevOps for Hadoop
By Bob Aiello

Apache Hadoop is a framework that enables the analysis of large datasets (or what some folks are calling “Big Data”), using clusters of computers or cloud-based “elastic” resources offered by most cloud-based providers. Hadoop is designed to scale up from a single server to thousands of machines, allowing you to start with a simple installation and scale to a huge implementation capable of processing petabytes of structured (or even unstructured) complex data. Hadoop has many moving parts and is both elegant in its simplicity and challenging in its potential complexity. This article explores some of the essential concerns that the DevOps practitioner needs to consider in automating the systems and applications build, package and deployment of components in the Hadoop framework.

For this article, I read several books [1][2] and implemented a Hadoop sandbox using both Hortonworks and Cloudera. In some ways, I found the technology to be very familiar, as I have supported very large Java-based trading systems which were deployed in web containers from Tomcat to WebSphere. At the same time, I was humbled by the sheer number of components and dependencies that must be understood by the Hadoop administrator as well as the DevOps practitioner. In short, I am a kid in a candy store with Hadoop; come join me as we sample everything from the lollipops to the chocolates 🙂

To begin learning Hadoop, you can set up a single-node cluster. This configuration is adequate for simple queries and more than functional for learning about Hadoop and then scaling to a cluster setup when you are ready to implement your production environment. I found the Hortonworks sandbox to be particularly easy to implement using VirtualBox (although almost everything that I do these days is on Docker containers). Both Cloudera Manager and Apache Ambari are administration consoles that help to deploy and manage the Hadoop framework.
I used Ambari, which has features that help provision, manage, and monitor Hadoop clusters, including supporting the Hadoop Distributed File System (HDFS), Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. I used the Ambari dashboard, which helps to view cluster health (including heatmaps) and also provides the ability to view MapReduce, Pig and Hive applications, including performance and other resources. HDFS can also be backed by Amazon Simple Storage Service (S3) buckets, Azure blobs and OpenStack Object Storage (Swift).

One of the most interesting things about Hadoop is that you can use “commodity hardware”, which does not necessarily mean cheap, but does mean that you use the servers that you are able to obtain and support, and they do not all have to be the exact same type of machine. Obviously, the DevOps practitioner will need to pay attention to the requirements for provisioning, and especially supporting, the many required configuration files. This is where configuration tools including Chef, Puppet, CFEngine, Bcfg2 and SaltStack can be especially helpful, although I find myself migrating towards using Ansible for configuration management as well as both infrastructure and application deployments. Logfiles, as well as the many environment settings, need to be monitored. The administration console provides a wide array of alerts which can identify potential resource and operational issues before there is end-user impact. The Hadoop framework has daemon processes running, each of which requires one or more specific open ports for communication.

Anyone who reads my books and articles knows that I emphasize processes and procedures which ensure that you can prove that you have the right code in production, detect unauthorized changes and have the code “self-heal” by returning to a known baseline (obviously while adhering to change control and other regulatory requirements).
Monitoring baselines in a complex Java environment utilizing web containers such as Tomcat, JBoss and WebSphere can be very difficult, especially because there are so many files which are dynamically changing and therefore should not be monitored. Identifying the files which should be monitored (using cryptographic hashes such as SHA-1 and MD5) can take some work and should be put in place from the very beginning of the development lifecycle, in all environments from development test to production. In fact, getting Hadoop to work in development and QA testing environments does take some effort, giving you an opportunity to start working on your production deployment and monitoring procedures. The lessons learned while setting up the lower environments (e.g. development test) can help you begin building the automation to support your production environments.

I had a little trouble getting the Cloudera framework to run successfully using Docker containers, but I am confident that I will get this to work in the coming days. Ultimately, I would like to use Docker containers to support Hadoop – most likely with Kubernetes for orchestration. You can also run Hadoop on hosted services such as AWS EMR or use AWS EC2 instances (which could get pretty expensive). Ultimately, you want to run a lean environment that has the capability of scaling to meet peak usage needs, but can also scale down when all of those expensive resources are not needed. Hadoop is a pretty complex system with many components, and as complicated as the DevOps requirements are, I cannot imagine how impossible it would be to manage a production Hadoop environment without DevOps principles and practices. I am interested in hearing your views on best practices around supporting Hadoop in production as well as other complex systems. Drop me a line to share your best practices!
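A minimal sketch of the baseline idea follows, using SHA-256 (a stronger choice than the MD5/SHA-1 mentioned above); the file name in the usage note is an illustrative assumption, not a prescribed monitoring target.

```python
# Sketch: record a cryptographic hash per monitored file, then re-scan
# later to detect unauthorized changes against the known baseline.
import hashlib
from pathlib import Path

def hash_file(path: Path) -> str:
    """SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def take_baseline(paths):
    """Map each monitored file path to its current content hash."""
    return {str(p): hash_file(Path(p)) for p in paths}

def detect_changes(baseline: dict) -> list:
    """Return the paths whose contents no longer match the baseline
    (including files that have been deleted)."""
    changed = []
    for name, digest in baseline.items():
        p = Path(name)
        if not p.exists() or hash_file(p) != digest:
            changed.append(name)
    return changed
```

For example, you might baseline a configuration file such as `core-site.xml` after a verified deployment; a later scan returning that path tells you it was modified outside change control, which is the trigger for investigation or self-healing back to the baseline.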
Bob Aiello

The Magic of DevOps

The Magic of DevOps
By Bob Aiello with Dovid Aiello

Recently, a well-respected colleague of mine reacted to an article that I had written regarding the Equifax data breach and suggested that I had made it sound as if DevOps could “magically” solve problems. I was stunned at first when I saw his comments, because everything that I have written has always gone into specific details on how to implement DevOps and CM best practices, including the core functions of source code management, build engineering, environment management, change control, and release and deployment engineering. At first, I responded to my colleague that he should get a copy of my book to see the detail in which I prescribe these principles and practices. My colleague reminded me that he not only had a copy of my CM best practices book, but had reviewed it as well – and as I recall it was a pretty positive review. So how then could he possibly believe that I viewed DevOps as magically solving anything? The more I pondered this incident, the more I realized that DevOps does indeed have some magic, and the effective DevOps practitioner actually does have some tricks up his sleeve. So, unlike most magicians, I am fine with sharing some of my magic, and I hope that you will write back and share your best practices as well. DevOps is a set of principles and practices intended to help improve communication and collaboration between teams, including development and operations, but equally important are other groups including quality assurance, testing, information security and, of course, the business users and the groups who help us by representing their interests. DevOps is all about sharing often conflicting ideas and the synergy we enjoy from this collaboration. At the core of DevOps systems and application delivery is a set of practices based upon configuration management, including source code management, build engineering, environment management, change control, and release and deployment engineering.
Source code management is fundamental; without the tools and processes you could easily lose your source code, not to mention have a very difficult time tracking changes to code baselines. With robust version control systems and effective processes, you can enjoy traceability to know who made changes to the code – and back them out if necessary. When you have code in version control you can scan that code using a variety of tools and identify open source (and commercial) code components which may have one or more vulnerabilities as identified in CVEs and the VulnDB database. I have written automation to traverse version control repositories and scan for licensing, security and operational risks – actually identifying specific code bases which had zero-day vulnerabilities much like the one which impacted Equifax recently. Build engineering is a fundamental part of this effort, as the build process itself may bring in dependencies which may also have vulnerabilities. Scanning source code is a good start, but you get a much more accurate picture when you scan components which have been compiled and linked (depending upon the technology). Taking a strong DevOps approach means that your security professionals can work directly with your developers to identify security vulnerabilities and determine the best course of action to update the code. With the right release and deployment automation you can ensure that your changes are delivered to production as quickly as necessary. Environment management is the function which is most often forgotten, and understanding your environment dependencies is an absolute must-have. Sadly, we often forget to capture this information during the development process, and discovering it after the product has been deployed can be a slow and painful process.
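The repository-traversal automation described above can be approximated with a short Python sketch. It assumes the repositories have already been cloned to a local directory and simply looks for Maven build files that declare a struts2 dependency; a production scanner would also inspect resolved build artifacts, since the build can pull in transitive dependencies that never appear in source:

```python
from pathlib import Path

def find_struts_usage(checkout_root: str):
    """Scan locally cloned repositories for Maven builds that mention
    the struts2-core artifact. This is a deliberately naive sketch: a
    real scanner would parse the POM, resolve versions, and match the
    results against CVE/VulnDB advisory data."""
    hits = []
    for pom in Path(checkout_root).rglob("pom.xml"):
        text = pom.read_text(errors="ignore")
        if "struts2-core" in text:
            hits.append(str(pom))
    return sorted(hits)
```

Even a naive pass like this turns "do we use Struts anywhere?" from a guessing game into a list of specific projects to hand to your security and development teams.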
Similarly, change control should be your first line of defense for identifying and mitigating technical risk, but most companies simply treat change control as a “rubber stamp” which focuses mostly on avoiding calendar collisions. Change control done well can help you identify and mitigate technical risk, deliver changes more quickly and avoid the costly mistakes that are so often the cause of major outages. As news reports emerge claiming that Equifax actually knew that it had code which contained the Struts vulnerability, the focus should be on why the code was not updated. Sadly, many companies do not have sufficient automation and processes in place to be able to safely update their systems without introducing significant technical risk. I have known companies that could not respond effectively to a disaster because their deployment procedures were not fully automated, and failing over to a DR site resulted in a week-long set of outages. Companies may “test” their DR procedures, but that does not guarantee that they can actually be used effectively in a real disaster. You need to be able to build your infrastructure (e.g. infrastructure as code) and deploy your applications without any chance of a step being missed which could result in a systems outage. DevOps and CM best practices actually give you the specific capabilities required to identify security vulnerabilities and update your code as often as needed. The first step is to assess your current practices and identify specific steps to improve your processes and procedures. I would like to say that really there is no magic – just good process engineering, picking the right tools and, of course, rolling up your sleeves and automating each and every step. But maybe the truth is that there is some magic here. Taking a DevOps approach, sharing the different views between development, operations, security and other key stakeholders, can make this all possible. Please drop me a line and share your challenges and best practices too.
Between us, we can see magic happen as we implement DevOps and CM best practices! Bob Aiello

How to Maintain 143 Million Customers

How to Maintain 143 Million Customers
by Phil Galardi

So what recently happened to 143 million Americans anyway? Well, you probably heard that it was a cyber security incident related to an open source software component called Apache Struts. What exactly is Apache Struts? Why was it so easily hacked? Could it have been prevented using some common best practices? And what can you do to protect your organization now and in the future? Apache Struts is a common framework for developing Java web applications. It’s one of the most commonly used open source components, with plenty of community support: 634 commits in the last 12 months at the time of this blog post, meaning that folks from all over the world are actively participating in efforts to fix bugs, add features and functions, and remediate vulnerabilities. According to Lgtm, the folks who discovered the vulnerability that impacted nearly half of Americans, more than 65% of Fortune 100 companies are using Struts, meaning 65% of the Fortune 100 could be exposed to remote attacks (similar to Equifax) if the issue is not fixed. Initially, the suspected vulnerability was a zero-day (CVE-2017-9805) which had been present in the Struts framework since 2008. However, recent speculation is pointing to a more likely culprit (CVE-2017-5638), which was reported in March 2017. If the latter is the case, Equifax and any other organizations properly managing open source components would have had visibility into this issue and could have remediated it before the attack occurred. At this time, Equifax has not issued a public statement pinpointing the exploit. The Apache Struts Project Management Committee lists 5 steps of advice for anyone utilizing Struts, as well as all open source libraries. To paraphrase, these are:
  1. Know what is in your code by having a component bill of materials for each of your product versions.
  2. Keep open source components up to date and have a process in place to quickly roll out security fixes.
  3. Close the gap, your open source components are going to have security vulnerabilities if unchecked.
  4. Establish security layers in your applications, such that a public facing layer (such as Struts) should never allow access to back-end data.
  5. Monitor your code for zero-day vulnerability alerts. Again, back to #1. If you know what is in your code, you can monitor it. You can reduce incident response time and notify your customers quickly (or catch it before it’s too late).
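The first three steps above can be sketched in a few lines of Python. This is an illustration only: the "name==version" manifest format and the KNOWN_VULNERABLE advisory set are hypothetical stand-ins for your real dependency metadata and CVE feeds, while the fixed Struts releases (2.3.34 and 2.5.13) are the ones recommended in the advisory discussed in this article:

```python
def parse_manifest(text):
    """Step 1: parse 'name==version' lines into a component bill of
    materials (BOM). The manifest format here is a hypothetical
    stand-in for real dependency metadata (POMs, lockfiles, etc.)."""
    bom = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            name, _, version = line.partition("==")
            bom[name.strip()] = version.strip()
    return bom

# Hypothetical advisory data; a real pipeline would pull CVE/VulnDB feeds.
KNOWN_VULNERABLE = {("struts2-core", "2.5.12"), ("openssl", "1.0.1f")}

def flag_vulnerable(bom):
    """Step 3: return the BOM entries that appear in the advisory set."""
    return sorted((n, v) for n, v in bom.items() if (n, v) in KNOWN_VULNERABLE)

# Step 2: first fixed Struts release on each maintenance branch,
# per the advisory (2.3.34 and 2.5.13).
FIXED = {(2, 3): (2, 3, 34), (2, 5): (2, 5, 13)}

def struts_needs_upgrade(version):
    """True if a Struts version predates the fixed release on its branch."""
    v = tuple(int(x) for x in version.split("."))
    fixed = FIXED.get(v[:2])
    return fixed is not None and v < fixed
```

Wired into a CI pipeline, checks like these are what let you warn developers at build time and fail builds that contain critical vulnerabilities.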
Certainly, you can prevent Apache Struts vulnerabilities from ever making their way into your web applications by not using the component. However, based on metrics from Black Duck Software for Struts, we see that it would take an estimated 102 years of effort to build it on your own. You probably won’t need every line of code. Yet even still, there are huge advantages to using open source software in your applications. Best practices dictate identifying the open source components in your applications at build time and integrating with CI tools when possible. This provides you with an inventory, or bill of materials, of all the open source components your developers are using. You can further drive automation by monitoring those applications’ bills of materials and creating policies around what actually gets built. For example, you could warn and notify developers that a particular component (OpenSSL 1.0.1 through 1.0.1f) is not acceptable to use if they build it, and ultimately fail builds containing critical vulnerabilities. What can you do now about this latest vulnerability? According to Mike Pittenger, VP of Security Strategy at Black Duck Software, if you don’t need the REST plug-in for Apache Struts, you can remove it. Otherwise, users are advised to update to versions 2.3.34 and 2.5.13 as soon as possible. So back to keeping your customers happy: protect their data, maintain the security of your applications, and don’t forget about open source components and applying best practices.

About the author: Phil Galardi has over 15 years of experience in technology and engineering; 8 years as an application developer, 3 years in application lifecycle management, and he is currently helping organizations improve, manage, and secure their SDLC. With experience spanning multiple vertical markets, Phil understands what is required to build secure software from each aspect of people, process, and technology.
While he loves coffee, he doesn’t get the same feelings of joy from completing expense reports  

Personality Matters – Development in the Cloud

Personality Matters – Development in the Cloud
By Leslie Sachs

Developing software in the cloud involves working with people who are likely in a different location and employed by an entirely different company. These folks may have very different priorities than you do – and getting what you need may be quite a challenge at times. Development in the Cloud likely involves working in an environment where you do not always have full control of the resources you need. You may feel that you are the customer and deserve priority service. But the reality is that interfacing with all of the stakeholders in a cloud-based development environment presents unique challenges. Your ‘people skills’ may very well determine whether or not you get what you need when you need it. Read on if you would like to be more effective when developing in the Cloud.

Control of Resources
Development in the Cloud has a number of challenges, but none more apparent than the obvious loss of control over essential resources. Development in the cloud involves relying upon another entity and the services that they provide. This may turn out great for you, or it may be the worst decision of your career. One issue to address early on is how you feel about having to rely upon others for essential resources and its implicit loss of control. This situation may result in some stress and, at times, considerable anxiety, for technology managers who are responsible for the reliability of their companies’ systems.

Anxiety in the Cloud
Seasoned IT professionals know all too well that bad things happen. Systems can crash or have other serious outages that can threaten your profitability. When you have control over your resources, you usually have a stronger sense of security. With the loss of control, you may experience anxiety. As a manager, you need to assess both your, and upper management’s, tolerance for risk. Risk is not inherently bad.
But risk needs to be identified and then mitigated as best as is practical. One way to do that is to establish a Service-level Agreement (SLA).

Setting the SLA
The prudent manager doing development in the cloud will examine closely the Service-level Agreements that govern the terms of the Cloud-based resources upon which that team depends. One may have to choose, however, between working with a large established service provider and a smaller company, willing to work harder for your business. This is where you need to be a savvy consumer and technology guru, too. If you’re thinking that ironing out all of these terms is going to be easy, then think again. The one thing that you can be certain about, though, is that communication is key.

Communication as Key
Make sure that you establish an effective communications plan to support your Cloud development effort, including announcing outages and service interruptions. [1] You should consider the established communications practices of your service provider within the context of the culture of your organization. Alignment of communication styles is essential here. Plan to not only receive communications, but to process, filter and then distribute essential information to all of your stakeholders. Remember, also, that even weekend outages may impact the productivity of your developers. The worst part is that you may not have a specific dedicated resource at the service provider with whom to partner.

Faceless and Nameless Partners
Many large Cloud-based providers have well-established service organizations, but you as a manager need to consider how you feel about working with partners who you do not know and may never actually meet. The faceless and nameless support person may be just fine for some people, especially if they do a great job. But you need to consider how you will feel if you cannot reach a specific person in charge when there is a problem impacting your system.
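When weighing the SLA terms discussed above, it helps to translate an availability percentage into a concrete downtime budget you can hold the provider to. A small illustrative Python helper (a back-of-the-envelope sketch, not part of any SLA tooling):

```python
def downtime_budget_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime (in minutes) permitted by an availability SLA
    over a billing period of the given number of days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100.0)
```

For example, a 99.9% SLA allows roughly 43 minutes of downtime in a 30-day month, while 99.99% allows only about 4 minutes – a concrete difference worth pricing before you sign.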
This may seem like a non-issue if you are the customer. Or is it?

Customer Focus
If you are paying a service provider then you will most likely be expecting to be treated as a customer. Some Internet Service Providers (ISPs) may have excellent service while others may act like they are a utility with an absolute monopoly. At CM Best Practices Consulting, we’ve had some experiences with ISPs who provided horrible service, resulting in an unreliable platform supporting the website for our book on Configuration Management Best Practices. Poor service aside, there are certainly advantages to considering cloud services as a utility.

Cloud as a Utility
When we need more electricity, most of us just assume that the Electric Company will provide as much as we need. So the cloud as a utility certainly has some advantages. If you need to scale up and add another hundred developers, giving each one a virtual image on a server farm can be as easy as providing your credit card number. However, knowing that additional resources are there for the asking does have its own special risk: failing to plan for the resources that you need. You still need to plan strategically.

Planning and Cost
Planning and cost can be as dangerous as running up bills on your credit card. In fact, they may actually be on your credit card. From a personality perspective, you should consider whether or not using Cloud-based services is just a convenient excuse to avoid having to plan out the amount of resources you really need. This approach can get expensive and ultimately impact the success of your project. Development in the cloud does not mean that you have to stay in the cloud. In fact, sometimes cloud-based development is just a short-term solution to respond to either a seasonal need or a temporary shortage. You should always consider whether or not it is time to get out of the clouds.

Bringing it Back In-house
Many Cloud-based development efforts are extremely successful. Others are not.
Ultimately, smart technology professionals always consider their Plan-B. If you find that you are awake at night thinking about all of the time lost due to challenges in the Cloud, then you may want to consider bringing the development back in from the Cloud. Just keep in mind that every approach has its risks, and you probably cannot implement a couple hundred development, test or production servers overnight, either. Many managers actually use a hybrid approach of having some servers in-house, supplemented by a virtual farm via a cloud-based service provider. Having your core servers in-house may be your long-term goal anyway. Smart managers consider what works best from each of these perspectives.

Conclusion
Being pragmatic in the Cloud means that you engage in any technology effort while keeping both eyes open to the risks and potential advantages of each approach. Cloud-based development has some unique challenges and may not be the right choice for everyone. You need to consider how these issues fit with you and your organization when making the choice to develop in the cloud.

References
[1] Aiello, Robert and Leslie Sachs. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley, 2010, p. 155.

How DevOps Could Have Saved Equifax

How DevOps Could Have Saved Equifax
by Bob Aiello

Equifax is the latest large firm to make unwanted headlines due to exposure of clients’ personal data; a reported 143 million people may have had their Social Security numbers, birth dates, credit card numbers and other personal information stolen. According to published accounts, the breach occurred through a vulnerability in the Apache Struts web framework, which is used by many organizations. The incident was an embarrassment to a company whose entire business revolves around providing a clear, and presumably confidential, financial profile of consumers that lenders and other businesses use to make credit decisions. Large organizations often have hundreds of major systems using thousands of commercial and open source components – each of which could potentially have a security vulnerability. The Apache organization issued a statement about the most recent incident. There were also many alerts issued about the potential risks in the Apache Struts framework, but large organizations which receive alerts via Common Vulnerabilities and Exposures (CVEs) and VulnDB may find it very difficult to identify exactly which software components are vulnerable to attack, and may be unable to quickly fix the problem and/or deploy the updated code that prevents hackers from exploiting known vulnerabilities. So how best to handle these scenarios in large organizations? The first step is to get all of your code stored and baselined in a secure version control system (VCS). Then you need to be able to scan the code using any of the products on the market which can identify vulnerabilities as reported in CVEs and the VulnDB database. There are costs involved with implementing an automated solution, but the cost of not doing so could be far greater. One approach could be to clone each and every repo in your version control system (e.g.
bitbucket) and then programmatically scan the baselined source code, identifying the projects which contain these vulnerabilities. You can get better results if you scan code that has been compiled, as the build process may pull in additional components. But even just scanning source code will help you start the conversation involving your security experts, operations engineers and the developers who wrote the code. Suddenly, you can find that needle lost in your haystack pretty quickly and begin taking steps to update the software. Obviously, another key ingredient is having the capability to immediately roll out that fix through a fully-automated application build, package and deployment process – what many folks are referring to as continuous delivery. Implementing these tools and processes does take some time and effort, but, as the Equifax data breach has painfully demonstrated, effective DevOps is clearly worth it. What is your strategy for identifying security issues buried deep in a few hundred thousand lines of code? It is actually not that hard to fix this issue as long as you can work across development, operations and other stakeholders to implement effective CM best practices, including:
  1. Source Code Management
  2. Build Engineering
  3. Environment Management
  4. Lean and Effective Change Control
  5. Release and Deployment Engineering
Yeah – I am saying that you need DevOps today! Bob Aiello

How DevOps can eliminate the risk of ransomware and cyber attacks

How DevOps can eliminate the risk of ransomware and cyber attacks
By Bob Aiello

Reports of global cyberattacks, said to have impacted more than 200,000 users in over seventy countries, have certainly garnered much attention lately. You might think that the global IT “sky” was falling and there is no possible way to protect yourself. That just isn’t true. The first thing that you need to understand is that any computer system can be hacked, including yours. In fact, you should assume that your systems will be hacked, and you need to plan for how you will respond. Experts are certainly telling us all to refrain from clicking on suspicious attachments and to keep our Windows patches up-to-date. None of that advice is necessarily wrong, but it fails to address the real problem here. In order to properly avoid such devastating problems, you really need to understand the root cause. There certainly is plenty of blame to go around, starting with who made the malware tools used in the attack. There is widespread speculation that the tool used in the attack was stolen from the National Security Agency (NSA), which leads one to question whether the agencies in the business of securing our national infrastructure are really up to the job. This global cyberattack was felt by thousands of people around the world. Hospitals across the UK were impacted, which, in turn, affected medical care, even delaying surgical procedures. Other organizations hit were FedEx in the United States, the Spanish telecom company Telefónica, the French automaker Renault, and Deutsche Bahn, Germany’s federal railway system. I have supported many large organizations relying upon Windows servers, often running complex critical systems. Building and upgrading Windows software can be very complex, and that is the key consideration here. It is often not trivial to be able to rebuild a Windows machine and get to a place where you have software fully functioning as required.
I have seen teams tinker with their Windows build machines and actually get to a place where they simply could not build another Windows machine with the same configuration. Part of the problem is that very few technology professionals really have a deep understanding of how Microsoft Windows actually works, and that really is the problem here. In DevOps, we need to be able to provision a server and get to a known state without having to resort to heroic efforts. With infrastructure as code, we build and provision machines from scratch using a programmatic interface. Many use cloud-based resources which can be provisioned in minutes and taken down just as easily when no longer needed. Container-based deployments are taking this capability to a new level, making infrastructure as code routine. From there, we can establish the deployment pipeline within a reasonable period of time, which enables us to deploy known baselines of code very quickly. Backing up our data is also essential, although in practice you may lose some transactions. If you are supporting a global trading system, then obviously there must be strategies to create transaction logs which can “replay” your buys and sells, restoring you to a known state. What we need now is for corporate executives to understand the importance of having competent IT executives implement best practices capable of addressing these inevitable risks. We know how to do this. In the coming months, a group of dedicated IT professionals will be completing the first draft of an industry standard for DevOps, designed to cover many of these considerations. Let’s hope the folks running institutions from hospitals to retail chains take notice and actually commit to adopting DevOps best practices.
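The transaction-log "replay" idea above can be illustrated with a toy Python sketch: starting from an empty state (or, in practice, the last backup), reapplying an append-only log restores a known state. The (account, delta) entry shape is a simplification for illustration; a real trading system replays full order and execution records:

```python
def replay(log_entries):
    """Rebuild account state from an append-only transaction log.
    Each entry is an (account, delta) pair; replaying the log from
    the last backed-up state restores a known, auditable state."""
    state = {}
    for account, delta in log_entries:
        state[account] = state.get(account, 0) + delta
    return state
```

Because replay is deterministic, the same log always produces the same state – which is exactly the property that lets you recover to a known baseline after ransomware or a failed server rather than paying to get your data back.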