Monitoring Your Environment


Monitoring your runtime environment is an essential function that helps you proactively identify potential issues before they escalate into incidents and outages. But environment monitoring can be challenging to do well. Unfortunately, this aspect of environment management is often overlooked and, even when addressed, usually handled in only the simplest way. Keeping an eye on your environment is actually one of the most important functions in IT operations. If you spend the time understanding what really needs to be monitored and establish effective ways of communicating events, then your systems will be much more reliable, and you will likely get a lot more sleep without so many of those painful calls in the middle of the night. Here is how to get started.

The ITIL v3 framework provides pretty good guidance on how to implement an effective environment management function. The first step is to identify which events should be monitored and establish an automated framework for communicating the information to the stakeholders who are responsible for addressing problems when they occur. The most obvious environment dependencies are basic resources such as available memory, disk space, and processor capacity. If you are running low on memory, disk space, or any other physical resource, then obviously your IT services may be adversely impacted. Most organizations understand that employees need to monitor key processes and identify and respond to abnormal process termination. Nagios is one of the popular tools to monitor processes and communicate events that may be related to processes being terminated unexpectedly.
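
To make this concrete, here is a minimal sketch of the kind of resource check described above, written in Python. It assumes the third-party psutil package, and the thresholds and alert handling are placeholders; in practice an established tool such as Nagios would own this job and route alerts to the responsible stakeholders.

```python
# Minimal resource check: memory, disk, and CPU thresholds.
# Assumes the third-party psutil package; thresholds are illustrative only.
import psutil

THRESHOLDS = {"memory_pct": 90.0, "disk_pct": 85.0, "cpu_pct": 95.0}

def check_resources():
    """Return a list of human-readable warnings for any exceeded threshold."""
    warnings = []
    mem = psutil.virtual_memory().percent   # % of RAM in use
    disk = psutil.disk_usage("/").percent   # % of root filesystem used
    cpu = psutil.cpu_percent(interval=1)    # % CPU over a 1-second sample

    if mem > THRESHOLDS["memory_pct"]:
        warnings.append(f"Memory usage high: {mem:.1f}%")
    if disk > THRESHOLDS["disk_pct"]:
        warnings.append(f"Disk usage high: {disk:.1f}%")
    if cpu > THRESHOLDS["cpu_pct"]:
        warnings.append(f"CPU load high: {cpu:.1f}%")
    return warnings

if __name__ == "__main__":
    for warning in check_resources():
        # In practice this would page or email the responsible stakeholders.
        print("ALERT:", warning)
```

In a real deployment the alerting step would feed whatever notification channel your stakeholders already watch rather than printing to the console.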

There are many other environment dependencies, such as required ports remaining open, that also need to be monitored on a constant basis. I have seen production outages caused by a security group closing a port because there was no record that the port was needed for a particular application. These are fairly obvious dependencies, and most IT shops are well aware of these requirements. But what about the more subtle environment dependencies that need to be addressed?
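
As a simple illustration of monitoring port dependencies, the sketch below (standard library only) attempts a TCP connection to each port an application is known to need; the host names and port list are hypothetical.

```python
# Verify that required TCP ports are reachable; hosts and ports are examples only.
import socket

REQUIRED_PORTS = [("db.example.internal", 5432), ("mq.example.internal", 5672)]

def port_open(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for host, port in REQUIRED_PORTS:
    if not port_open(host, port):
        # A closed required port is exactly the kind of silent change
        # that has caused production outages.
        print(f"ALERT: required port {port} on {host} is not reachable")
```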

I have seen situations where databases stopped working because the user account used by the application to access the database became locked. Upon investigation, we found that the UAT user account was the same account used in production. In most ways you want UAT and production to match, but in this case locking the user account in UAT took down production. You certainly do not want to use the same account for both UAT and production, and it may be a good idea to set up a job that checks to ensure that the database account is always working.
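
A scheduled job that verifies the application's database account can still log in might look roughly like the following sketch, assuming a PostgreSQL database and the psycopg2 driver; the connection details are placeholders.

```python
# Periodic check that the application's database account can still authenticate.
# Assumes PostgreSQL and the psycopg2 driver; connection details are placeholders.
import sys
import psycopg2

def check_db_account():
    try:
        conn = psycopg2.connect(
            host="db.example.internal",
            dbname="appdb",
            user="app_service",          # dedicated production account, not shared with UAT
            password="read-from-vault",  # fetch from a secrets manager in real use
            connect_timeout=5,
        )
        with conn.cursor() as cur:
            cur.execute("SELECT 1")      # trivial query proves the account works
        conn.close()
        return True
    except psycopg2.OperationalError as exc:
        print(f"ALERT: database account check failed: {exc}")
        return False

if __name__ == "__main__":
    sys.exit(0 if check_db_account() else 1)
```

Run from a scheduler such as cron, a nonzero exit code can then drive whatever alerting you already use.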

Market data feeds are another example of an environment dependency that may impact your system. This one can be tricky because you may not have control over the third-party vendor who supplies you with data, which is all the more reason to monitor your data feeds and notify the appropriate support people if there is a problem. Cloud-based services can also present challenges because you may not always be in control of the environment and might have to rely on a third party for support. Establishing a service-level agreement (SLA) is fundamental when you are dependent on another organization for services. You may also find yourself trying to figure out how your cloud-based resources actually work and what you need to do when your service provider makes changes that are unexpected or not completely understood. I had this experience myself when trying to puzzle my way through all of the options for Amazon's cloud. In fact, it took me a few tries to figure out how to turn off all of the billable options, such as storage and fixed IPs, when the project was over. I am not intending to criticize Amazon per se, but even their own help desk had trouble locating what I needed to remove so that I would stop getting charged for resources that I was not using.
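
One lightweight way to monitor a feed you do not control is to check how stale its most recent delivery is. The sketch below assumes the vendor drops files into a local directory; the path and the freshness window are illustrative only.

```python
# Alert when a vendor data feed has not delivered anything recently.
# The drop directory and freshness window are assumptions for illustration.
import time
from pathlib import Path

FEED_DIR = Path("/data/feeds/market")    # hypothetical vendor drop location
MAX_AGE_SECONDS = 15 * 60                # alert if nothing new in 15 minutes

def newest_file_age(directory):
    """Return the age in seconds of the most recently modified file, or None."""
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return None
    newest = max(p.stat().st_mtime for p in files)
    return time.time() - newest

age = newest_file_age(FEED_DIR)
if age is None or age > MAX_AGE_SECONDS:
    print("ALERT: market data feed appears stale; notify the vendor support contact")
```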

To be successful with environment management, you need to establish a knowledge base to gather the essential technical information that may be understood by only a few people on the team. Documenting and communicating this information is an important task and often requires effective collaboration among your development, data security, and operations teams.

Many organizations, including financial services firms, are working to establish a configuration management database (CMDB) to facilitate environment management. The ITIL framework provides a considerable amount of guidance on how to establish a CMDB and the supporting configuration management system (CMS), which helps provide structure for the information in the CMDB. The CMDB and the CMS must be supported by tools that monitor the environment and report on the status of key dependencies on a constant basis. These capabilities are essential for ensuring that your critical infrastructure is safe and secure.

Many organizations monitor port-level scans and attacks. Network intrusion detection tools such as Snort can help to monitor and identify port-level activity that may indicate an attempt to compromise your system is underway. Ensuring that your runtime environment is secure is essential for maintaining a trusted computing environment. There have been many high-profile incidents that resulted in serious system outages related to port-level attacks. Monitoring and recognizing this activity is the first step in addressing these concerns.
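
Dedicated intrusion detection tools such as Snort are the right way to do this; the toy sketch below only illustrates the underlying idea of flagging any source address that touches an unusually large number of distinct ports in a short window. The log format here is invented for the example.

```python
# Toy port-scan heuristic: flag sources that hit many distinct ports quickly.
# Real deployments should use an IDS such as Snort; the log format here is invented.
from collections import defaultdict

SCAN_PORT_THRESHOLD = 20   # distinct ports from one source before we alert

def detect_scans(log_lines):
    """log_lines: iterable of 'timestamp,src_ip,dst_port' records."""
    ports_by_source = defaultdict(set)
    for line in log_lines:
        try:
            _, src_ip, dst_port = line.strip().split(",")
        except ValueError:
            continue                      # skip malformed records
        ports_by_source[src_ip].add(dst_port)

    return [src for src, ports in ports_by_source.items()
            if len(ports) >= SCAN_PORT_THRESHOLD]

sample = [f"2016-06-28T10:00:{i:02d},203.0.113.9,{1000 + i}" for i in range(25)]
for suspect in detect_scans(sample):
    print(f"ALERT: possible port scan from {suspect}")
```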

In complex technology environments you may find it difficult to really understand all of the environment requirements. This is where tying your support function into the application lifecycle is essential. When bad things happen, your help desk will receive the calls. Reviewing and understanding incidents can help the entire team identify and address environment-related issues. Make sure that you never have the same problem twice by having reported incidents fully investigated, with any newly discovered environment dependencies identified and monitored on an ongoing basis.
Conclusion

Environment management is a key capability that can help your entire team be more effective. You need to provide a structure to identify environment dependencies and then work with your technical resources to implement tools to monitor those dependencies. If you get this right, then your organization will benefit from reliable systems, and your development and operations teams will be far more effective.

ALMtoolbox Presents Smart Performance Monitoring and Alerting Tool, Including Free Community Edition


Tel Aviv, Israel – June 28, 2016 – IBM Champion ALMtoolbox, Inc., a firm with offices in the United States and Israel, today announced availability of a free Community edition product called ALM Performance, based upon ALM Performance Pro, their award-winning commercial environment monitoring solution.

The Community edition of ALM Performance provides a comprehensive set of over twenty environment-monitoring features, including monitoring your ClearCase VOB, Jenkins, and ClearQuest servers along with their respective JVMs. The product also monitors available storage, required ports, memory, and CPU load, while also checking components of the application itself, such as Jenkins jobs that run too long or other possible Jenkins application problems. You can even write your own custom scripts and integrate them with the ALM Performance dashboard. The user interface allows for email alerts, notification filtering, and other custom alerts.

The Community Edition can be easily upgraded to the commercial Pro edition for additional scalability and convenience, but on its own it offers the following features.

ALM Performance Highlights – 3 main components:

  • Settings application – the configuration tool for ALM Performance; it allows you to add or delete monitored servers and configure each server’s checks and parameters.
  • Graphical component – the graphical dashboard of the system.
  • Monitoring service – the heart of the system; it schedules and runs the checks and analyzes the results.

ALM Performance can monitor all Linux, UNIX, Mac OS, and Windows versions, and it does so in a non-intrusive and secure manner using the SSH protocol. ALM Performance is installed on a Windows host and can be run on premises, or it can be run as a cloud service where we run and manage the system while it remotely monitors your servers.
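
The press release does not describe ALM Performance's internals, but the general idea of non-intrusive, agentless checking over SSH can be sketched as follows. This is a generic illustration, not the product's actual mechanism; it assumes the OpenSSH client and key-based login, and the host name and threshold are placeholders.

```python
# Generic agentless disk check over SSH (not ALM Performance's actual mechanism).
# Assumes the OpenSSH client is installed and key-based login is configured.
import subprocess

HOST = "build01.example.internal"   # hypothetical monitored server
DISK_ALERT_PCT = 85

result = subprocess.run(
    ["ssh", "-o", "BatchMode=yes", HOST, "df -P /"],
    capture_output=True, text=True, timeout=30,
)
if result.returncode != 0:
    print(f"ALERT: could not reach {HOST} over SSH: {result.stderr.strip()}")
else:
    # Last line of 'df -P /' looks like: /dev/sda1 10238428 7139432 2574608 74% /
    used_pct = int(result.stdout.strip().splitlines()[-1].split()[4].rstrip("%"))
    if used_pct >= DISK_ALERT_PCT:
        print(f"ALERT: {HOST} root filesystem at {used_pct}% capacity")
```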

“Over the years, we have provided a variety of robust tools that share our techniques and expertise with the user community, and now we are doing so with performance monitoring issues. We wanted to share our knowledge with the users and help them improve their skills so that they can respond more effectively when they have to cope with malfunctions or systems suffering from slow response and other forms of latency,” says Tamir Gefen, ALMtoolbox CEO.

“We have built this tool after many years of experience with SCM administration, IT management, and DevOps, and we created it by envisioning a tool that is designed from the start for Jenkins, ClearCase, or ClearQuest rather than just another off-the-shelf monitoring tool that users have to spend months planning what to monitor and how to customize. Using this tool, it takes only one hour to start getting status data and insights from your monitored hosts,” says Gefen.

“We always strive to benefit the user community and provide a version that offers the essential features for every company that uses Jenkins, ClearCase, or ClearQuest,” says David Cohen, product manager of the new ALM Performance monitoring tool.

“Since it’s software with an easy installation, we are excited that we are able to provide the Community version, including self-installation, for free”, says Cohen.

To download the product, visit the ALMtoolbox website and click the Download link.

Updates and support via email, phone and desktop sharing are available with either product.

Personality Matters – Anxiety and Dysfunctional Ops


Personality Matters – Anxiety and Dysfunctional Ops
By Leslie Sachs

As software professionals, we might find ourselves calling a help desk from time to time. Our needs may be as simple as dealing with a malfunctioning cell phone or as complex as navigating banking or investment systems. The last time you called a help desk, you may have been pleased with the outcome or disappointed in the service provided. The last time I called a help desk, I found myself trying to navigate what was obviously a dysfunctional organization. While the ITIL framework provides guidance on establishing an effective service desk, many organizations still struggle to provide excellent service. The root cause may have much to do with a personality trait known as anxiety and the often-dysfunctional defense mechanisms people resort to in an attempt to deal with its discomfort.

If you want your IT operations (IT Ops) group to be successful, then you need to consider the personality issues, at the individual as well as the group level, that may impact their performance and your success. To understand a dysfunctional help desk, you need to know the personality traits that lead to the negative behaviors preventing you from receiving good service. Often, callers experience frustration and anger when they find themselves unable to elicit the response that they desire. Sometimes IT Ops professionals provide less than perfect service and support because they simply do not know how to solve the problem and lack the training and expertise needed to be effective and successful in their jobs. If you are frustrated as a customer, imagine how stressful it is for the support person who cannot successfully handle your request or the IT Ops professional who lacks the technical expertise required to solve the problem. When your job is on the line, you may indeed feel extreme anxiety.

Anxiety is defined as an emotional state in which there is a vague, generalized feeling of fear [1]. Operations staff often find themselves under extreme stress and anxiety, especially when dealing with a systems outage. Some folks handle stress well, while others engage in disruptive behavior that may be as simple as blaming others for the outage or as complex as avoidance behaviors that could potentially impact the organization. Sigmund Freud discussed many defense mechanisms that people employ to deal with and reduce anxiety, and he theorized that many people develop behavior problems when they have difficulties learning [1]. We often see this phenomenon triggered when employees are required to complete tasks for which they have not been properly prepared. IT Ops team members must learn a considerable amount of information in order to understand how to support complex systems and deal with the technology challenges that often arise when they face a critical systems outage.

Developers are often at an advantage because they get to learn new technologies first and sometimes get to choose the technical architecture and direction of a project. IT Ops team members, however, must struggle to get up to speed, and they are wholly dependent upon the information that they are given during the stage in which the technology transitions from development to operations. This knowledge transfer effort impacts the entire support organization. Organizations that fail to implement adequate knowledge transfer processes will have support personnel who are ill-equipped to handle situations that depend on familiarity and competence with the knowledge base. American psychologist Harry Harlow proposed that a relationship exists between the evolutionary level of a species and the rate at which members of that species are able to learn [1]. Similarly, an organization’s ability to transfer knowledge is an indication of how successfully it will deal with supporting complex technologies. The entire team may be adversely impacted when an organization cannot manage its essential institutional knowledge. As Jurgen Appelo notes, “knowledge is built from the continuous input of information from the environment in the form of education and learning, requests and requirements, measurements and feedback, and the steady accumulation of experience. In short, a software team is the kind of system that consumes and transforms information and produces innovation” [2].

All this means that development and operations must share knowledge in order for the organization to be successful. Quality management expert W. Edwards Deming aptly noted that it is essential to “drive out fear” [3]. To remove fear and anxiety from the work environment, all members need the knowledge and skills to be able to perform their duties. Technology professionals cannot function optimally when they are not adequately trained and informed. Successful organizations reduce anxiety by properly training their teams and establishing a culture of knowledge and excellence.
References
[1] Byrne, Donn. 1974. An Introduction to Personality: Research, Theory, and Applications. Prentice-Hall Psychology Series.
[2] Appelo, Jurgen. 2011. Management 3.0: Leading Agile Developers, Developing Agile Leaders. Addison-Wesley Signature Series.
[3] Aiello, Bob, and Leslie Sachs. 2011. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley Professional.
[4] Aiello, Bob, and Leslie Sachs. 2016. Agile Application Lifecycle Management: Using DevOps to Drive Process Improvement. Addison-Wesley Professional.

Behaviorally Speaking – Building Reliable Systems


Behaviorally Speaking – Building Reliable Systems
by Bob Aiello

Anyone who follows technology news is keenly aware that there have been a remarkable number of high-profile system glitches over the last several years, at times with catastrophic results. Major trading exchanges in both the US and Tokyo have suffered serious outages that call into question the reliability of the world financial system itself. Knight Capital Group essentially ceased to exist as a corporate entity after what was reported to be a configuration management error resulted in a one-day loss of $440 million. These incidents highlight the importance of effective configuration management best practices and place a strong focus on the need for reliable systems.

But what exactly makes a system reliable, and how do we implement reliable systems? This article describes some of the essential techniques necessary to ensure that systems can be upgraded and supported while enabling the business by providing frequent and continuous delivery of new system features. Mission-critical and enterprise-wide computer systems today are often very complex, with many moving parts and even more interfaces between components, presenting special challenges even for expert configuration management engineers. These systems are getting more complex as the demand for features and rapid time to market creates challenges that many technology professionals could not have envisioned even a few years ago. Computer systems do more today, and many seem to learn more about us each and every day, evolving into complex knowledge management systems that seem to anticipate our every need.

High-frequency trading systems are just one example of complex computer systems that must be supported by industry best practices that can ensure rapid and reliable system upgrades and implementation of market-driven new features. These same systems can have severe consequences when glitches occur, especially as a result of a failed systems upgrade. FINRA, a highly respected regulatory authority, recently issued a targeted examination letter to ten firms that support high-frequency trading systems. The letter requests that the firms provide information about their “software development lifecycle for trading algorithms, as well as controls surrounding automated trading technology” [1]. Some organizations may find it challenging to demonstrate adequate IT controls, although the real goal should be implementing effective IT controls that help ensure system reliability. Many industries enjoy a very strong focus on quality and reliability.

A few years ago, I had the opportunity to teach configuration management best practices at an NITSL conference for nuclear power plant engineers and quality assurance professionals. Everyone in the room was committed to software safety, including reliable safety systems. In the IEEE, we have working groups that help update the related industry standards defining software reliability, measures of dependability, and safety. Make sure that you contact me directly if you are interested in hearing more about participating in these worthwhile endeavors. Standards and frameworks are valuable, but it takes more than guidelines to make reliable software. Most professionals focus on the importance of accurate requirements and well-written test scripts, which are essential but not sufficient to create reliable software. What really needs to happen is that we build in quality from the very beginning, an essential teaching that many of us learned from quality management guru W. Edwards Deming [2].

The key to success is to build the automated deployment pipeline from the very beginning of the application development lifecycle. We all know that software systems must be built with quality in mind from the beginning, and this includes the deployment framework itself. Using effective source code management practices along with automated application build, package, and deployment is only the beginning. You also need to understand that building a deployment factory is a major systems development effort in itself. It has been my experience that many CM professionals forget to build automated build, package, and deployment systems with the same rigor that they would apply to a trading system. As the old adage says, “the chain is only as strong as its weakest link,” and inadequate deployment automation is indeed a very weak link.
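
One concrete way to treat the pipeline as a first-class system is to give it its own verification step. The sketch below, using only the Python standard library, checks a deployed service's version endpoint against the build identifier the pipeline just produced; the URL and the JSON payload shape are assumptions for illustration.

```python
# Post-deployment smoke test: confirm the running service reports the expected build.
# The /version endpoint and its JSON shape are assumptions for illustration.
import json
import sys
import urllib.request

VERSION_URL = "https://app.example.internal/version"

def verify_deployment(expected_build_id):
    with urllib.request.urlopen(VERSION_URL, timeout=10) as response:
        payload = json.load(response)
    running = payload.get("build_id")
    if running != expected_build_id:
        print(f"FAIL: expected build {expected_build_id}, found {running}")
        return False
    print(f"OK: build {running} is live")
    return True

if __name__ == "__main__":
    # The pipeline passes the build ID it just deployed, e.g. a commit hash.
    sys.exit(0 if verify_deployment(sys.argv[1]) else 1)
```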

Successful organizations understand that quality has to be a cultural norm. This means that development teams must take seriously everything from requirements management to version control of test scripts and release notes. Organizations that take the time to train and support developers in the use of robust version control solutions and automated application build tools such as Ant, Maven, Make, and MSBuild are far better positioned to deliver quality releases. The tools and plumbing to build, package, and deploy the application must be a first-class citizen and a fundamental component of the application development effort.

Agile development and DevOps provide some key concepts and methodologies for achieving success, but the truth is that every organization has its own unique requirements, challenges, and critical success factors. If you want to be successful, then you need to approach this effort with the knowledge and perspective that critical systems are complex to develop and also complex to support. Building the automated deployment framework should not be an afterthought or an optional task started late in the process. Building quality into the development of complex computer systems requires what Deming described in the first of his 14 points as “create constancy of purpose for continual improvement of products and service to society” [2].

We all know that nuclear power plants, medical life support systems, and missile defense systems must be reliable, and they obviously must be upgraded from time to time, often due to uncontrollable market demands. Efforts by responsible regulatory agencies such as FINRA are essential for helping financial services firms realize the importance of creating reliable systems. DevOps and configuration management best practices are fundamental to the successful creation of reliable software systems. You need to start this journey from the very beginning of the software and systems delivery effort. Make sure that you drop me a line and let me know what you are doing to develop reliable software systems!


[1] http://www.finra.org/Industry/Regulation/Guidance/TargetedExaminationLetters/P298161
[2] Deming, W. Edwards. 1986. Out of the Crisis. MIT Press.
[3] Aiello, Bob, and Leslie Sachs. 2011. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley Professional.
[4] Aiello, Bob, and Leslie Sachs. 2016. Agile Application Lifecycle Management: Using DevOps to Drive Process Improvement. Addison-Wesley Professional.

Could Your Airplane Safety System Be Hacked?


Could Your Airplane Safety System Be Hacked?
by Bob Aiello

Flying, by and large, is considered one of the safest modes of transportation, with industry regulatory authorities and engineering experts working together to establish the safest and most reliable technology possible. However, the aviation industry came under fire last year when, according to a published report, security researcher Chris Roberts divulged that he had hacked the in-flight entertainment system (IFE) on an airplane and overwritten code on the plane’s Thrust Management Computer while aboard the flight. According to the article published by Wired, Roberts was able to issue a climb command and make the plane briefly change course. The FBI responded by issuing a warrant for his arrest, which, according to published reports, stated “that he thereby caused one of the airplane engines to climb resulting in a lateral or sideways movement of the plane during one of these flights,” FBI Special Agent Mark Hurley wrote in his warrant application. “He also stated that he used Vortex software after comprising/exploiting or ‘hacking’ the airplane’s networks. He used the software to monitor traffic from the cockpit system.”

Roberts is not the only person reporting that in-flight wifi lacks adequate security controls; another journalist reported that his personal machine had been compromised via the onboard wifi, which was determined to have very weak security.

The most important issue is whether the vulnerable wifi systems are connected to the onboard safety and navigation systems or whether there is proper network segregation that protects the onboard safety and navigation systems from being accessed via a compromised in-flight entertainment system. The good news is that U.S. aviation regulators have teamed up with their European counterparts to develop common standards aimed at harnessing wireless signals for a potentially wide array of aircraft-safety systems. Their goal is to make widespread use of wifi and reduce the amount of physical wiring required, but an essential byproduct of this effort could potentially be better safety standards.

The Wall Street Journal article goes on to say that nearly a year after Airbus Group SE unsuccessfully urged Federal Aviation Administration officials to join in such efforts, Peggy Gilligan, the agency’s senior safety official, has set up an advisory committee to cooperate with European experts specifically to “provide general guidance to industry” on the topic.

Network segregation can certainly be improved, but the real issue is that software onboard an aircraft should be built, packaged and deployed using DevOps best practices which can ensure that you have a secure trusted application base. Let’s hope that the folks writing those standards and guiding the industry are familiar with configuration management and DevOps best practices or at least involve those of us who are. See you on my next flight!

JFrog Showcases Repository and Distribution Platform at DockerCon 2016


SEATTLE, WA – According to a press release published by Marketwired, JFrog announced on June 20, 2016 that it would showcase JFrog Artifactory and JFrog Bintray, its repository and distribution platform, at DockerCon 16, which took place June 20-21 in Seattle. JFrog also presented a session on Docker container lifecycles. Additionally, the company will lead a webinar on JFrog Artifactory and JFrog Bintray on July 7, followed by a second webinar covering JFrog Xray on July 14.

DockerCon is the community and industry event for makers and operators of next-generation distributed apps built with containers. The two-and-a-half-day conference provides talks by practitioners, hands-on labs, an expo of Docker ecosystem innovators and valuable opportunities to share experiences with peers in the industry.

“JFrog has been a valuable partner in supporting and expanding our developer community,” said David Messina, vice president of marketing at Docker. “JFrog’s universal repository and distribution solution not only supports Docker technology, but also the open ecosystem of software development tools, and we are glad to be working together to solve the industry’s biggest challenges.”

Behaviorally Speaking: DevOps Across the Enterprise


Behaviorally Speaking: DevOps Across the Enterprise
by Bob Aiello

Agile software development practices are undeniably effective. However, even the most successful companies can face challenges when trying to scale agile practices at the enterprise level. Two popular approaches to implementing agile across the enterprise are Scott Ambler's Disciplined Agile 2.0 and Dean Leffingwell's Scaled Agile Framework, also known as SAFe. These approaches may work well for agile, but how do we implement DevOps across the enterprise?

DevOps is going through many of the same growing pains. Many small teams are very successful at implementing DevOps, but implementing DevOps best practices at the enterprise level can be very challenging. This article will help you understand how to succeed in implementing DevOps across the entire enterprise. Agile development and DevOps focus on a number of important principles, including valuing individuals and interactions over processes and tools, working software over volumes of documentation, customer collaboration over contract negotiation, and responding to change over following a plan. All of these practices are familiar to anyone who adheres to the Agile Manifesto, but DevOps and agile development share much more in common than a set of principles. Agile development usually requires rapid iterative development, generally using fixed timebox sprints. Agile development highlights the value of rapid and effective application development practices that are fully automated, repeatable, and traceable. It is no surprise, then, that DevOps has been especially popular in agile environments.

DevOps focuses on improved communication between development and operations with an equally essential focus on other stakeholders such as QA. DevOps at scale may require that you consider organizational structure and communicate effectively with many levels of management. You can expect that each organizational unit will want to understand how DevOps will impact them. You may have to navigate some barriers and even organizational and political challenges. There are other key requirements that often come when we consider scaling best practices.

The first consideration is that larger organizations often want to establish a centralized support function, which may sit at the divisional level or be a centralized corporate-wide entity. This may mean that you have to establish a corporate budget and align with the corporate structure, and obviously the culture too. You may be required to create corporate policies, a mission statement, and project plans consistent with other efforts of similar size and scope. Even purchasing the tools that are essential for effectively implementing DevOps may require that you adhere to corporate requirements for evaluating and selecting tools. I have seen some of my colleagues become frustrated with these efforts because they felt that they already knew which tools should be implemented, while organizations usually want to see a structured tools evaluation with transparency and participation by all stakeholders. These efforts, including a proof of concept (POC), can help to overcome the resistance to change that often accompanies larger efforts to implement any best practice, including DevOps.

My own approach is to pick a pilot project with high visibility within the organization and handle it correctly right from the beginning. In practice, I have often had to juggle day-to-day duties supporting source code management or automated application build, package, and deployment. With a challenging "day job" it can be difficult to also have a "star" project to show the value of doing things the best way from the beginning, but this is exactly how to get started and demonstrate the success that the organization can enjoy enterprise-wide once you attain stakeholder buy-in.

Once the pilot project has been shown to be successful, it is time to consider rolling out DevOps throughout the enterprise. For DevOps practices to be effective in the enterprise, they must be repeatable, scalable, and fully traceable. An important consideration is establishing a support function to help each of the teams with training (including tools) and ongoing support. The implementation of these practices must adhere to all of the corporate policies and align with the organizational structure and culture. DevOps also must address the requirements of the organization's technology platform. In practical terms, this usually brings me right into the cloud.

Most large organizations are embracing cloud-based technology and cloud-based development. Implementing DevOps must also include support for the cloud in several very important ways. Provisioning servers in the cloud is often the first step where DevOps can truly show its value. In fact, managing cloud-based development is much more difficult without the improved communication and deployment automation that have become synonymous with DevOps. DevOps in the enterprise does require some specific organizational skills, including an understanding of the organizational and structural requirements that are essential to implementing DevOps in the enterprise.
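
As an example of the kind of scripted provisioning step that makes cloud-based DevOps pay off, the sketch below launches a tagged EC2 instance with the boto3 SDK. The AMI ID, instance type, and tags are placeholders, and AWS credentials are assumed to come from the environment.

```python
# Provision a single EC2 instance as part of an automated environment build.
# Assumes boto3 is installed and AWS credentials are available from the environment;
# the AMI ID, instance type, and tags are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "devops-pilot-web"},
                 {"Key": "Owner", "Value": "release-engineering"}],
    }],
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched {instance_id}; remember to terminate it when the project ends")
```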


Docker Announces Docker Engine 1.12 With Container Orchestration


Built-In Orchestration Features Announced

DockerCon – Seattle, Wash. – June 20, 2016 – Docker announced Docker Engine 1.12 with built-in orchestration, a powerful combination that provides Developers and IT Operations with a simplified and automated experience to deploy and manage Dockerized distributed applications – both traditional apps and microservices – at scale in production. By adding this additional intelligence to Docker Engine, it becomes the orchestration building block, creating a model for engines to form a self-organizing, self-healing pool of machines on which to run multi-container distributed applications. When integrated into Docker Engine, these new capabilities optimize ease of use, resiliency, performance-at-scale and security – all key requirements that are missing in other orchestration systems. As a result, organizations can be assured that their dev and ops teams are aligned on unifying the software supply chain to release applications into production more rapidly and frequently.

Read the complete press release

Securing The Trusted Base

Securing The Trusted Base
By Bob Aiello
Over the last several years, there have been many reported incidents where hackers have attacked banks, government agencies and financial services firms. Corporate security experts are hard-pressed to react in a timely manner to each and every attack. DevOps provides many techniques, including application baselining, build, release and deployment engineering, which are essential for detecting and dealing with these problems. This article discusses how to use CM Best Practices to establish secure application baselines which help verify that the correct code is running in production. Just as importantly, this fundamental practice enables IT personnel to discover if there are any unauthorized changes, whether they be caused by unintended human error or malicious intent.

Ken Thompson won the coveted ACM Turing award in 1983 [1] for his contributions to the field of computing. His acceptance speech was entitled “Reflections on Trusting Trust” [2] and asked the question “to what extent should one trust a statement that a program is free of Trojan horses?” After discussing the ease with which a programmer can create a program that replicates itself (and could potentially also contain malicious code), Thompson’s comments highlight the need to ensure that we can create secure trusted application baselines. This article will help you get started delivering systems that can be verified and supported, while continuously being updated as needed.

The secure trusted application base needs to start with an operating system that is properly configured and verified. Deploying applications to an untrusted platform is obviously an unacceptable risk. But applications themselves also need to be built, packaged and deployed in a way that is fully verifiable. The place to start is baselining source code in a reliable version control system (VCS) that has the capability to reliably track the history of all changes. A common source of errors is missing or ill-defined requirements, so traceability of requirements to changesets is fundamental. Baselines provide for the management of multiple variants (e.g. bugfixes) in the code and, more importantly, the ability to reliably support milestone releases without having to resort to heroic efforts to find and fix your code base.

The application build itself must be fully automated, with each configuration item (CI) built with an embedded immutable version ID; these secure IDs facilitate the physical configuration audit, which is essential for verifying that the correct code has been successfully deployed. The CIs should themselves be packaged into a container that is then cryptographically signed to verify the identity of the source and to verify that the container has not been compromised. The only way to get this right is to build the package in a secure way and deploy it in an equally secure manner. Deming was right when he pointed out that quality must be built in from the beginning, and this is a fundamental aspect of the application build used to deploy a secure trusted base.

The deployed application must be baselined in the runtime environment and constantly verified to be free of unauthorized changes. This approach provides a runtime environment that is verifiable and actually facilitates the automated application deployment pipeline. The secure trusted base must be frequently updated to provide new functionality and must also be capable of retiring obsolete or unwanted code. Creating the automation to reliably build, package, and deploy code on a constant basis is essential to implementing a secure trusted base. While you cannot test quality into a system, building systems that are fully verifiable is very important. The deployment pipeline must be designed to be fully testable itself, so that any automation issues are immediately discovered and resolved.
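
A minimal sketch of verifying a deployed baseline against unauthorized change might look like the following: hash every file under the deployment directory and compare it with a previously recorded manifest. The paths are placeholders, and in a real trusted base the manifest itself would be cryptographically signed.

```python
# Verify a runtime baseline: compare file hashes against a recorded manifest.
# Paths are placeholders; a real manifest would be cryptographically signed.
import hashlib
import json
from pathlib import Path

DEPLOY_DIR = Path("/opt/app/current")
MANIFEST = Path("/opt/app/manifest.json")   # {"relative/path": "sha256hex", ...}

def sha256(path):
    return hashlib.sha256(path.read_bytes()).hexdigest()

def audit_baseline():
    expected = json.loads(MANIFEST.read_text())
    actual = {str(p.relative_to(DEPLOY_DIR)): sha256(p)
              for p in DEPLOY_DIR.rglob("*") if p.is_file()}

    unauthorized = [f for f, digest in actual.items() if expected.get(f) != digest]
    missing = [f for f in expected if f not in actual]
    return unauthorized, missing

if __name__ == "__main__":
    changed, missing = audit_baseline()
    for f in changed:
        print(f"ALERT: unauthorized or modified file: {f}")
    for f in missing:
        print(f"ALERT: expected file missing from baseline: {f}")
```

Run on a schedule, a check like this is one way to detect the unauthorized changes described above, whether they come from human error or malicious intent.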

Complex systems today also have many interfaces that must be understood and verified. This may involve checking the feed of market data or simply ensuring that the correct ports have been provisioned and remain available for interprocess communication. In fact, monitoring the application environment is another crucial aspect of ensuring the secure trusted base. You do not want unnecessary ports opened, possibly resulting in a security risk, and you also do not want application issues because a port required by the application system has been closed. There are similar procedures to provision and verify the operating system itself.

In the secure trusted base, we automate the build of all code components and embed identifiers into the code to make it possible to audit and verify the configuration. We also provide a reliable packaged container that cannot be compromised and we deploy in a way that is fully verifiable. Most importantly, we ensure that unauthorized changes can be detected and then remediated if they do occur. This may sound like a lot of effort and it actually does take a lot of work, but the costs of failing to take these preventable measures are frequently much higher as those firms who have been caught with “their code down” have painfully learned. The DevOps approach enables the development of these automated procedures from the very beginning of the application lifecycle which is really the only viable approach. More importantly, the DevOps approach to creating an automated deployment pipeline enables you to rapidly build, package and deploy applications using the same procedures in every environment.

My career has often been focused on joining a team that is having problems releasing its code. The first thing that I do is push for more frequent releases that are, whenever possible, smaller in scope. We automate each step and build in code to test and verify the deployment pipeline itself. A codebase that can be quickly and reliably deployed is inherently more secure. Ideally, you want to start from the very beginning of the software and system development effort. However, more often than not, I have had to jump onto a moving train. Taking an iterative approach to scripting and automating the application build, package, and deployment will help you create a secure and reliable trusted application base!

[1] http://amturing.acm.org/award_winners/thompson_4588371.cfm
[2] https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf

Behaviorally Speaking – Process and Quality


Behaviorally Speaking – Process and Quality
By Bob Aiello

Configuration management places a strong focus on process and quality, so sometimes I am shocked to learn how little some CM experts know about the underlying principles of CM that so clearly impact process and quality. I find this to be the case in organizations whether they embrace agile methodologies or so-called non-agile approaches such as waterfall. While many of my colleagues are expert build, release, or deployment engineers, there are still those who do not understand the underlying principles that focus so heavily on process and quality. This article will help you enhance your existing CM best practices by applying the core principles that deliver excellent process and quality. We will also take a moment to make sure that we get our terminology straight.

So what exactly does it mean to have a process? There are many times when I hear colleagues say, “well, our process is …” while referring to a set of activities that I would never really consider to be a “process.” My favorite authoritative resource is the free IEEE online dictionary called Sevocab [1], which describes a process as a “set of interrelated or interacting activities which transforms inputs into outputs.” Sevocab also notes that the term “activities” covers use of resources and that a process may have multiple starting points and multiple end points. I like to use Sevocab because it also notes the standard (e.g., IEEE) or framework (e.g., ITIL) where the term is used, which can be very helpful for understanding the term within a specific context. I usually describe a process as a well-defined way of doing a particular set of activities, and it is worth noting that a well-defined process is implicitly repeatable.

In configuration management, we often focus on creating processes to support application build, release, and deployment engineering. It is essential that all processes be automated, or at least “semi-automated,” which means that the script proceeds through each step, although perhaps requiring someone to verify each step based upon the information on the screen. Scripting the entire release process ensures that the process is repeatable and that errors are avoided. An error-free process also helps ensure that we achieve a quality result. Manual procedures will always be sources of errors and mistakes. Automating each step, even if some human intervention is needed, will go a long way toward improving your process and quality.
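
A semi-automated release script in this spirit might look like the sketch below: each step runs from the script, but the operator confirms before the process continues. The step commands are placeholders for your own build, package, and deploy scripts.

```python
# Semi-automated release: scripted steps with an operator confirmation between each.
# The commands below are placeholders for your real build/package/deploy steps.
import subprocess
import sys

STEPS = [
    ("Build the application", ["./scripts/build.sh"]),
    ("Package the release",   ["./scripts/package.sh"]),
    ("Deploy to staging",     ["./scripts/deploy.sh", "staging"]),
]

for description, command in STEPS:
    answer = input(f"About to run: {description}. Proceed? [y/N] ")
    if answer.strip().lower() != "y":
        print("Release halted by operator.")
        sys.exit(1)
    result = subprocess.run(command)
    if result.returncode != 0:
        # A failed step stops the release rather than silently continuing.
        print(f"Step failed: {description}")
        sys.exit(result.returncode)

print("All release steps completed.")
```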

Sevocab defines quality as the degree to which a system, component, or process meets specified requirements, and the ability of a product, service, system, component, or process to meet customer or user needs, expectations, or requirements. Configuration management principles help to ensure quality. Here are some of the guiding principles for source code management [2]:

1. Code is locked down and can never be lost
2. Code is baselined marking a specific milestone or other point in time
3. Managing variants in the code should be easy with proper branching
4. Code changed on a branch (variant) can be merged back onto the main trunk (or another variant)
5. Source Code Management processes are repeatable
6. Source Code Management provides traceability and tracking of all changes
7. Source Code Management best practices help improve productivity and quality

Regardless of the version control tool that you are using, these principles will help you manage your code better, and the result is better quality. Here are some principles as they relate to build engineering [2]; a minimal sketch illustrating configuration item identification follows the list:

1. Builds are understood and repeatable
2. Builds are fast and reliable
3. Every configuration item is identifiable
4. The source and compile dependencies can be easily determined
5. Code should be built once and deployed anywhere
6. Build anomalies are identified and managed in an acceptable way
7. The cause of broken builds is quickly and easily identified (and fixed)
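
As a minimal illustration of making every configuration item identifiable, the sketch below stamps the current commit hash into a version file that travels with the build output. The build command and output path are placeholders, and the same idea applies whether the build itself runs under Ant, Maven, Make, or MSBuild.

```python
# Embed an immutable version ID (the commit hash) into the build output.
# The build command and output path are placeholders for your own pipeline.
import subprocess
from pathlib import Path

def current_commit():
    """Return the full git commit hash of the workspace being built."""
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def build_with_version_id(output_dir="build"):
    version_id = current_commit()
    out = Path(output_dir)
    out.mkdir(exist_ok=True)
    # The version file ships with the artifact so any deployed copy can be audited.
    (out / "VERSION").write_text(version_id + "\n")
    subprocess.run(["./scripts/build.sh"], check=True)   # placeholder build step
    print(f"Built configuration item identified by commit {version_id}")

if __name__ == "__main__":
    build_with_version_id()
```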

You will find principles for each of the core configuration management functions in my book on Configuration Management Best Practices [2], but the ones listed above for source code management and build engineering will help you get started improving both process and quality.


Process and quality are essential for the success of any technology development effort. Implementing the core configuration management best practices of source code management, build engineering, environment management, change control, release management, and deployment will help you successfully develop software and systems while maximizing productivity and quality!

References
[1] www.computer.org/sevocab
[2] Aiello, Bob, and Leslie Sachs. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley Professional, 2011.