Tuesday, December 6, 2016

Techpost - The Test-Maintain Loop saved me during a Jenkins Server infrastructure upgrade

Technical Post (just a warning for the non-technical follower).

As some of you know, earlier this year I created a first version of an Ansible Playbook Test framework and put it on Ansible Galaxy.  

Yesterday, this tool saved me from creating a disaster on my Production Jenkins CI instance on AWS (Amazon Web Services). 

I'd like to share that story as an example for others of what is possible if you include testing as part of your Infrastructure as Code strategy.

How I was saved yesterday:

(some details removed for security and simplicity reasons).

Execute the Playbook...
ansible-playbook -i Inventory/CASPAR/staging/ CASPAR_setup_jenkinsservers.yml -u ubuntu --ask-vault-pass --ask-sudo-pass
The result is a new AWS instance with Java 8 loaded, appropriate users and groups set up, and a base Jenkins image loaded. The key here is SETUP (the minimum needed to get a server into MAINTAIN status). The server is in "Staging".

Then, the repetitive playbook is executed (this one runs on a regular basis to keep servers up-to-date in all environments). In my case, I execute it for Dev, Staging and Production (the same playbook is used and applied to all 3 environments).

ansible-playbook -i Inventory/CASPAR/staging/ CASPAR_maintain_jenkinsservers.yml -u ubuntu --ask-vault-pass --ask-sudo-pass

Note: The only difference in the name is the word "maintain". 

Note: To run the same playbook in Prod or Dev, I simply run the same playbook against /CASPAR/prod/  ( Ansible's dynamic Inventory auto-finds the appropriate machines based on tags )

Then, while the machine is still in Staging, the following test playbook is executed....
ansible-playbook -i Inventory/CASPAR/staging/ CASPAR_test_jenkinsservers.yml -u ubuntu --ask-vault-pass --ask-sudo-pass

The "test" playbook executes all predefined tests to ensure the server is in good shape. 
If there are no errors, all that needs to happen is for the machine to be re-tagged in AWS from Staging to Prod. The next time the maintain playbook is executed, it will apply any appropriate changes (different Gateway addresses, different database connection string, etc.) using the SAME playbook as before.

Then of course, the TEST playbook is executed again (one final test).

The Test playbook can now also serve as a Governance check playbook and could be executed by the same team or externally where needed. It provides a means for safer, more comfortable changes, while also providing a built-in governance component if needed.

Yesterday, when I ran the tests in Staging I received an error about missing packages.

ansible-playbook -i Inventory/CASPAR/staging/ CASPAR_test_jenkinsservers.yml -u ubuntu --ask-vault-pass --ask-sudo-pass > test.log
grep "TEST_PASSED" test.log
grep "TEST_FAILED" test.log

I received the message...

 "msg": "TEST_FAILED: package xxxxxx expected present "

(xxxxxx stands in for the real package name, which is hidden in this post for security reasons).
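
A check that produces a message of this shape could be sketched roughly as follows. This is only a minimal illustration and is not taken from the published framework: the play layout, the required_package variable and the placeholder package name are all assumptions made for this example. Only the TEST_PASSED / TEST_FAILED message convention comes from this post.

```yaml
# Hypothetical test play. "some-package" is a placeholder, not the real package.
- hosts: all
  vars:
    required_package: some-package
  tasks:
    - name: Query dpkg for the package status (read-only, never reports a change)
      command: "dpkg-query -W -f='${Status}' {{ required_package }}"
      register: pkg_status
      changed_when: false
      failed_when: false   # a missing package is handled by the next task

    - name: Report a failed test when the package is not installed
      fail:
        msg: "TEST_FAILED: package {{ required_package }} expected present"
      when: "'install ok installed' not in pkg_status.stdout"

    - name: Report a passed test
      debug:
        msg: "TEST_PASSED: package {{ required_package }} is present"
```

In this sketch a failed check stops the remaining tasks for that host. If you would rather collect every failure in a single run, adding ignore_errors: yes to the fail task is one option.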

If I had converted the host to Production, it would have caused big problems in my production environment.

After doing some research, I found that I had previously requested a newer version of an AMI (an Amazon Machine Image).

Although the entire "setup" and "maintain" playbooks ran flawlessly (with no errors), what I did not know was the newer AMI was missing a critical Operating System package that my environment needed.

I modified the "maintain" playbook to include the missing package, re-ran the "maintain" playbook and then re-ran the Test Playbook.  Everything passed. Now, I know the Staging and Prod machines will always be up-to-date with this package when the "maintain" playbook runs its continuous loop.

The new Jenkins Server was tagged as "Prod" and then the previous server was deleted from AWS. The transition was painless.

By taking this approach and adding new checks to my tests as problems become evident, I ensure that I will not deploy to production anything that has already been identified as a potential problem.

I will no longer have this issue, or one related to this missing package, again.  If an image already contains the package, no problem... It will simply pass. Ansible does not re-install packages if they already exist (unless "latest" is specified as the state).
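
For readers who have not used Ansible's package handling, the kind of task added to the "maintain" playbook might look something like this. It is a sketch only; the task name and the placeholder package name are assumptions, since the real playbook is not shown in this post.

```yaml
# Hypothetical addition to the "maintain" playbook. "some-package" is a placeholder.
- name: Ensure the required OS package is installed
  apt:
    name: some-package
    state: present      # installs only if missing; no change when already installed
  become: yes

# For contrast: state "latest" also upgrades an already-installed package
# whenever a newer version is available.
# - name: Keep the package at the newest available version
#   apt:
#     name: some-package
#     state: latest
#   become: yes
```

Because "present" is satisfied by any installed version, the task is safe to run on every pass of the loop, whether or not the AMI already ships the package.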

Brief History of the Test/Maintain/Govern Loop

The purpose of creating the Test/Maintain/Govern Loop for Playbooks was to show a Test-First approach to infrastructure delivery, to make the transition to Infrastructure as Code easier to get accustomed to.

The approach uses knowledge taken from years of insight from the software development world in delivery of complicated environments and applies it to the Infrastructure as Code domain.  

Technical Notes:

Jenkins CI server running in Production on AWS.  

Ansible Playbooks are used to Setup/Maintain and Test server(s) in both Staging and Production (how my environment works for build servers... TODAY).

In AWS, tags are used to determine if a machine is "in production" or "in staging". They are both live in AWS in the same VPC (A VPC is like a private IP range within AWS for my hosts to reach each other).
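
As an illustration of tag-based targeting: the ec2.py dynamic inventory script from the Ansible project groups instances by tag into groups named tag_KEY_VALUE. Assuming, purely for this example, tags named Role and Environment (the real tag names in my setup are not shown here), a play could select the staging Jenkins hosts like this:

```yaml
# Assumed tags: Role=jenkins and Environment=staging (hypothetical names).
# ":&" is the intersection operator: a host must be in both groups.
- hosts: "tag_Role_jenkins:&tag_Environment_staging"
  tasks:
    - name: Show which hosts the tag-based groups resolved to
      debug:
        msg: "{{ inventory_hostname }} is a staging Jenkins server"
```

With this kind of grouping, re-tagging an instance from staging to prod moves it into the matching prod group the next time the inventory is refreshed, with no change to the playbook itself.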

Playbooks are written in YAML (a human-readable data format), and Dev/Staging/Prod are all handled in the same playbook.

A unique matching approach allows the same Playbook to run many times in Dev and Staging. This helps to ensure that by the time the Playbook runs on the Production machine, it has already executed many hundreds of times (and been confirmed correct).

An often-missed point with playbooks is that "If" statements (the "when" keyword in Ansible) can be used to determine whether parts of a playbook are executed.  For example, a playbook task can be set to run only IF it is executing against a certain environment.
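
A conditional of this kind might be sketched as below. The variable deploy_env, the template name and the destination path are all hypothetical, chosen only for illustration (such a variable could, for instance, be set differently in each environment's inventory).

```yaml
# deploy_env is a hypothetical variable, e.g. "dev", "staging" or "prod".
- name: Apply production-only settings
  template:
    src: jenkins_prod.conf.j2         # hypothetical template
    dest: /etc/jenkins/extra.conf     # hypothetical path
  become: yes
  when: deploy_env == "prod"

- name: Note that production-only steps were skipped
  debug:
    msg: "Running against {{ deploy_env }}, production-only steps skipped"
  when: deploy_env != "prod"
```

Tasks whose condition is false are reported as skipped, so the same playbook can be run against every environment.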

When I want to upgrade my Jenkins server or reconfigure a new one, I take an approach of... "Build a new one, run the setup/maintain and test on it, and IF everything is OK, move it into the Production Tag and then disable the older server." This allows me to ensure all is well before activating a new production change.

Think of the saying ....  

"All Servers Are Temporary"

If you feel that your organization could benefit from learning about a Test First approach to Infrastructure, please feel free to reach out to me.  I provide 1/2 day or full day sessions in the Toronto area or full-day sessions plus expenses anywhere else worldwide. 

A link to the original presentation can be found here.....

If you are so inclined, here's a link to the root repository... 

A sample "test" (also used for governance) playbook is located here...    

Monday, November 28, 2016

Change can be fun or exciting

Over the last few weeks, I have seen a repeating theme in my social media feeds.

That theme... "Change is Hard" 

In my lifetime, I have been part of change, both positive and negative. 

Some of that change was imposed on me from above, and some from market forces. In some cases, personal interactions create change. My reaction to these different situations is very different for sure.

If you are a person who helps others to embrace or live through change (whatever your interpretation of change is)....

... consider the damage you are causing by inspiring fear where it simply may not be appropriate or necessary.

I can say from both personal and professional experience...

Change does not have to be hard.

It can be fun or exciting!

Please stop giving the impression that hard change is mandatory.

Sunday, November 13, 2016

Follow Rachel Perry as she learns to use Scrum to deliver a product to improve her community

Today I want to share an inspiring story about a person who is learning to apply agile values and principles through experimentation in both personal and business aspects of her life.

My first sign of this was from a post Rachel Perry made about trying to deliver a newsletter for the company she works for with her team. 

She took her understanding of Scrum to experiment with an approach to delivering a weekly newsletter using "Sprints" and a focus on learning. She blogs about it here at agileadvice.com

For those of you that are interested in using Scrum in non-IT environments, it's worth following some of Rachel's learning as she progresses in her journey.

She is also developing a social service product to help communities. Read about it here.

If you would like to learn more (or help in your community), I'm sure she'd appreciate you reaching out to her.

Good luck Rachel as you progress on your journey.

Wednesday, November 2, 2016

An agile approach helps educators see better financial results to support their students

Someone pointed out today that I forgot to post a reference to an article for my usual blog followers...

Image (c) Blueprint Education, 2016
These two excerpts are from an article published in the Scrum Alliance Member articles section on Aug 24, 2016. It talks a bit about how a new approach can help the business side of charter schools.
"Central to this turnaround effort was the creation of an Agile culture."

"The discipline of focus produced great results. Relationships also grew deeper from working together through the discouraging times. The result is that the team has developed new competencies, along with an increased confidence." 

If you have an interest in some other posts from this blog about this topic, here are two searches that will give you some more reading material.

link - Agile in Education

Sunday, October 30, 2016

Workshop notes Co-located vs. Distributed Teams and the Agile Manifesto

I recently had the pleasure of presenting a new workshop about co-located versus distributed teams for the Halton Agile-Lean Network in Oakville in collaboration with Nick Norbeck.

We ended up doing this session as a result of a question about some of the principles of the agile manifesto seeming to be in contrast to what people see happening in companies today.

The session involved me acting as a business person who had a specific market segment in mind and a ton of money to bring a product to market (with limited patience ;->).

The attendees spent time working as co-located teams and then in an engineered situation where teams were asked to work on the same product but from different locations and time zones.

When we were done, we asked everyone to note Differences between the first and second round (co-located versus distributed).

As is my style, the question was rather vague. This allows people to bring their own interpretations as to what was important for them. 

Here are the resulting comments from the workshop (in no particular order). I used the exact same Case (capitals and smalls) to keep as close as possible to the originals.  

According to our two teams there were differences between co-located and distributed in the following ways....


  • Anxious
  • Energy Dropped with Distributed Team
  • Positive
  • Positive Energy
  • Increasingly Positive
  • Focus
  • R1: calm, R2: anxious
  • Smaller group -> more energy per person required


  • Communication
  • 1 - more relaxed, 2 less relaxed
  • distributed. uncertainty about integration -> communication
  • more personal
  • collaboration
  • First round felt safer - ALL IN
  • Communication
  • Collaboration
  • more pressure from PO when group is smaller


  • A Stronger Sense of Urgency in the 2nd Round
  • Dedication
  • Strengths & Skills
  • Less connection
  • Frantic/rushed when we got together
  • missing info/stuff
  • :-( other team mates didn't care about other team
  • Teams are much harder to collaborate
  • Skill Matrix is important


  • Clarity
  • Communication Breakdown
  • Quality focused
  • Better definition
  • Webex or Video to Run through sprints
  • just get on with it
  • pre-agreed process or agreement
  • a bit less thinking and or planning in 2nd round

Thank you to the awesome participants of this experiment. It was fun to see the excitement in the room.

Also, thank you Nick. It was great to work with you on this.

References and links

Agile Manifesto Principles

Halton Agile-Lean Network

Join the Halton Agile-Lean Network (anyone in the Halton region with interest is welcome)

Nick Norbeck

Workshop - Co-located vs Distributed Teams

If you think you might benefit from a customized workshop for teams, organizational change or technical practices, feel free to reach out at http://www.caspar.com/

Thursday, October 13, 2016

Impossible versus Improbable .. and trust.

A situation I observed recently reminded me of the power of words in building (or destroying) trust. 

- or -

Some examples...

It is impossible for the database to be hacked because...
- or -
It is improbable that the database can be hacked because...

It is impossible to sell (x) to (population) because...
- or -
It is improbable that we will sell (x) to (population) because...

It is impossible to help because...
- or -
It is improbable they will accept our help because...

It is impossible that no one will like this feature because...
- or -
It is improbable that this feature will fail because...

It is not possible to make this work because...
- or -
It is improbable that we will get this to work because...

Ask yourself...

Which version of these comments might build or destroy trust?

On a somewhat related thought path...

Which version of these comments might encourage or discourage...
  • dialogue
  • collaboration
  • teamwork
  • an open mind
  • a project schedule
  • excellence

Consider your own experiences and conversations lately, and ask yourself.. Did I use the correct word to describe the situation?

Mike Caspar

Tuesday, October 11, 2016

The We versus Them Assessment

A friend asked me to re-post this article here. It was originally posted on LinkedIn on June 20, 2016. Here goes...

. . . . . . . . . . .

This weekend I had a remarkable experience that really got me thinking..

I was talking with a fellow member of the agile community (we'll call her Alice).

Alice was telling me about why she liked the company she works for.

What truly stuck out for me was the fact that she continually referred to the company's groups as "We".  For whatever reason, I don't see this word used often when agile coaches, Scrum Masters or change agents talk about the companies where they are engaged (or work at). 

This got me thinking..

I wonder if you could assess the health of a company's teams or culture by simply listening to conversation and keeping track of the number of times people use the word We versus the word Them?

There is a possibility that this approach would also allow a coach to quickly notice where the language changes to Them to relate to other internal departments to discover where there might be room to help with interactions.

The simplicity of this makes it seem like it may be a totally bogus idea, but an idea isn't worth having unless it's shared (and possibly criticised).

Of course, I hope someone doesn't try and use this idea in some horrible, inappropriate way. You never know though... Could the approach remove countless survey questions and senseless sessions?  The question jumps directly to People and Interactions.

This whole topic encouraged me to ask that you run a personal experiment (for yourself).

If while you were reading this, you have been wondering about you and other departments in the company you work at..

Have you considered...

"Are your customers....  We or Them ?" 

Just a thought.

To Alice (you know who you are)...

Thanks for the inspirational chat. <smile>