TECH BLOG - Introduction Part 1: CI/CD Antipatterns

I'd like to start this blog with a post about CI/CD Patterns and Antipatterns. Most of them are generally well known, yet many of the Antipatterns still persist. If you are looking for a quick fix, the list is worth checking. And if you are new to CI/CD, it should at least help you understand the fundamental concepts behind CI/CD, which will be explained in detail in future posts.

Let us start with the Antipatterns: a list of things that will keep you from successfully applying CI/CD in your company or project. Most of these Antipatterns are a bad thing whether you go for CI/CD or not. So if one or another item on the list feels too familiar, try to change it!

Antipatterns

Slow and tedious builds and deployments

Slow and unnecessarily complex builds and deployments are a very bad thing. For one thing, they simply waste time that developers could use to implement features. For another, when a build or deployment is slow, people tend not to trigger it very often. So instead of deploying deliverables frequently, people avoid it, and the build/deployment turns into a complicated thing that no one can really handle anymore. Builds/Deployments can be slow because...
 
  • Manual steps are necessary during the build/deployment
  • Automated steps take a long time

 

Having manual steps is even worse than a slow automated process, because it leads to error-proneness and to being dependent on a few people with magical deployment skills. I have worked for companies where personal favors from some operations guys were necessary to get a deployment done. So instead of having a quick three-minute deployment, a developer first had to flatter one of them to get the deployment done. Bribing them with a cup of coffee and telling them how generous they are for deploying the software worked for me...

Having everything automated is a big achievement already. But if it takes too long, it is still a problem. Here is why:

  • Developers have longer waiting times - and if they are not waiting, they have to get back to the build later, which leads to context switching. In the worst case, they don't check the build result at all.
  • Long builds become a bottleneck exactly when it hurts most - for instance, when you have to fix an error in production.
  • If a build takes too long, people trigger it less often. Each build then contains more changes, which increases the failure rate and makes it harder to find errors.

 

There are many ways to speed up a deployment, and not every deployment at every stage of the deployment pipeline needs to be quick. For starters, try to achieve at least a quick build and a quick local deployment. There is no rule of thumb here - locate the actions that take a long time and speed them up, one step at a time... In general, watch out for steps that download artifacts from the internet (Maven artifacts, for instance) and try to run independent steps in parallel on multiple nodes. Also, make sure your build server runs on powerful hardware.
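To illustrate the parallelization idea, here is a minimal sketch in Python, assuming the long-running steps are independent shell commands; the Maven commands and module names are placeholders, not part of any real pipeline.

    # Minimal sketch: run independent build steps in parallel instead of one
    # after another. The commands and module names below are placeholders.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    STEPS = [
        ["mvn", "test", "-pl", "module-a"],   # unit tests, module A
        ["mvn", "test", "-pl", "module-b"],   # unit tests, module B
        ["mvn", "javadoc:javadoc"],           # docs don't depend on the tests
    ]

    def run(cmd):
        # check=True makes a failing step raise, so the build turns red.
        return subprocess.run(cmd, check=True)

    with ThreadPoolExecutor(max_workers=len(STEPS)) as pool:
        for future in [pool.submit(run, step) for step in STEPS]:
            future.result()

On a real build server you would usually use the parallel features of the CI tool itself, but the principle is the same: identify steps that don't depend on each other and stop running them sequentially.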

The most annoying thing, though, are deployments that sometimes work and sometimes don't. I know it can take quite some time to figure out what's wrong, but it is absolutely worth the struggle: a reliable deployment is the foundation for a reliable release and for well-working software. So such issues need to be fixed without any further delay.

No or inadequate configuration management

There is a golden rule: everything that is part of your software needs to be part of your version control system. Every single configuration file, document, test case and script needs to be checked in. Doing so gives you complete tags and versions of your application, including every single item that comes with it. It really is the foundation for every further step - a working, fully automated deployment without working configuration management is impossible. Jez Humble delivers a nice overview in his book and on the website that comes with it (https://continuousdelivery.com/foundations/configuration-management).
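As a small illustration of what "configuration under version control" can look like in practice, here is a hedged Python sketch; the config/<environment>.json layout and the APP_ENV variable are assumptions, not a prescribed structure.

    # Minimal sketch: one version-controlled configuration file per environment,
    # selected at startup. Every tag of the application then pins the exact
    # configuration it was released with.
    import json
    import os
    from pathlib import Path

    def load_config(environment: str) -> dict:
        config_file = Path("config") / f"{environment}.json"   # lives in the repo
        with config_file.open() as f:
            return json.load(f)

    config = load_config(os.environ.get("APP_ENV", "dev"))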

When it comes to configuration management, we are not only talking about the configuration of the application - that is the first and most important step. Other parts of the development chain also benefit from good configuration management, which leads us to our next Antipattern:

Non-reproducible environments

Recently I worked for a customer, helping him transform his IT towards Continuous Integration. To do so, it was necessary to create a new environment to run system tests and verify the quality of the release. Implementing the test automation, including service virtualization, was a doable challenge and worked as expected. But creating a working new environment for the applications in scope was almost impossible: the knowledge of how to set up a suitable infrastructure and the application itself had been lost, and complicated, slow operations processes made it even worse. The same issue occurred when adding new developers to the teams - setting up the local development environment, including local tests and builds, was a time-consuming and error-prone task as well.

As part of Continuous Delivery, it is important to create reproducible infrastructure environments. The development environment, the CI/CD tool chain and the application infrastructure setup should be well documented and easy to reproduce. When setting up new environments, it is advisable to use an "Infrastructure as Code" approach: by using configuration files and scripts to set up and configure environments, the setup can be automated. This is a pretty easy task when using cloud services like AWS, where you can define your whole infrastructure using CloudFormation templates (https://aws.amazon.com/de/cloudformation/). If you host your own infrastructure, tools like Chef or Puppet come in handy. Using Docker containers is also recommendable, since it supports the IaC approach well.
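As a sketch of the Infrastructure-as-Code idea with CloudFormation, the following Python snippet creates an environment from a template that lives in version control; the template path, stack name and parameter are hypothetical.

    # Minimal sketch: create a whole environment from a version-controlled
    # CloudFormation template instead of setting it up by hand.
    import boto3

    cloudformation = boto3.client("cloudformation")

    with open("infrastructure/test-environment.yaml") as f:
        template_body = f.read()

    cloudformation.create_stack(
        StackName="system-test-environment",
        TemplateBody=template_body,
        # Parameters let one template produce several, slightly different environments.
        Parameters=[{"ParameterKey": "EnvironmentName",
                     "ParameterValue": "system-test"}],
    )

Tearing an environment down and re-creating it then becomes a routine operation instead of lost knowledge.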

Continuous Integration tools can also be used as a service, which makes it easy to get a working and easy-to-configure environment. CloudBees (https://www.cloudbees.com/products/jenkins-cloud) offers Jenkins as a service, and AWS has its own Continuous Integration platform (https://aws.amazon.com/de/codepipeline/?nc2=h_m1). If that's not an option, you can, for instance, run Jenkins in Docker and deploy it wherever you want - but of course that comes with some additional effort to set it up right. We already created a full CI/CD pipeline with Jenkins and Blue Ocean, Nexus and SonarQube using containers deployed in AWS. It works well once it has been set up, but it was quite a challenge to get the configuration right. It can also be quite expensive, because Jenkins needs a pretty powerful machine to run smoothly.
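If you go the "Jenkins in Docker" route, starting the master is the easy part. Here is a rough sketch using the Docker SDK for Python; image, ports and volume name follow the common defaults, but treat them as assumptions for your own setup.

    # Rough sketch: start a Jenkins master in a container with a persistent home.
    import docker

    client = docker.from_env()
    client.containers.run(
        "jenkins/jenkins:lts",
        name="jenkins",
        detach=True,
        ports={"8080/tcp": 8080, "50000/tcp": 50000},   # web UI and agent port
        volumes={"jenkins_home": {"bind": "/var/jenkins_home", "mode": "rw"}},
    )

The real effort lies in everything around it: plugins, credentials, agents and the pipeline configuration itself.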

When working within the Java ecosystem, having an easy-to-set-up development environment still seems to be a hard thing. If you use Maven or Gradle to build your project, at least the choice of IDE and its setup aren't that important anymore. But you still need to install the right tools on your development machine and get the configuration right so that everything works. There are various projects working on cloud-based IDEs (https://codenvy.com/product/) where you can develop directly in your browser, but I haven't used one in an enterprise project yet. Working with centralized, pre-configured development environments is an option as well, but it has some drawbacks when it comes to customization and usability. I guess the best thing to do here is to provide a clean setup with scripts and a good manual on how to set up the local development environment and the necessary tools. Although this is an old-school approach, I haven't come across a better one for Java enterprise projects. JavaScript developers love to use Vagrant (https://www.vagrantup.com/) to create their development environment, and it works great - but they have different requirements. This discussion is quite useful when thinking about using Vagrant for Java software development (https://stackoverflow.com/questions/17625421/vagrant-for-a-java-project-should-you-compile-in-the-vm-or-on-the-host).
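The "scripts plus a good manual" approach can start very small - for example, a check script that tells a new developer what is missing before the first build attempt. The tool list below is an assumption; adjust it to your project.

    # Minimal sketch: verify that the required tools are installed before the
    # first local build. The list of tools is an example, not a standard.
    import shutil
    import sys

    REQUIRED_TOOLS = ["java", "mvn", "git", "docker"]

    missing = [tool for tool in REQUIRED_TOOLS if shutil.which(tool) is None]
    if missing:
        sys.exit("Missing tools: " + ", ".join(missing) + " - see the setup manual.")
    print("All required tools found.")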

Mixed-up environments

Each environment (Development, Test, Integration, ...) you use during your development process or as part of your deployment pipeline should have a well-defined purpose. When companies start to use environments for mixed purposes or have no clear rules here, chaos starts to take over. The worst thing you can do is to use one environment for both manual and automated tests: it will lead either to failing automated tests due to issues with the test data, or to inefficient manual tests, because an environment can be set up to run automated tests or to provide the right setup for manual testing - not both at once. This blog entry gives a nice, quick overview of the different stages of testing as we move along the delivery pipeline: https://www.testingexcellence.com/delivery-pipeline-agile-project/. In general, the agile testing pyramid can be a good indication of the environments necessary.

 

(Figure: the agile testing pyramid)

 

So here is the right way: for each environment, define its purpose, the test stages you want to cover and the requirements towards the setup. It makes a huge difference whether you need an environment for component tests or for end-to-end tests, because the latter is far more complex to maintain. The better the use of an environment is defined, the easier and cheaper it is to operate - the value it provides increases and the hassle decreases.
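Such a definition doesn't need to be a large document. Even a small, version-controlled structure like the following sketch makes the purpose of each environment explicit and reviewable; the environment names, purposes and stages are examples only.

    # Minimal sketch: an explicit, version-controlled purpose for each environment.
    # Names, purposes and stages are examples only.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Environment:
        name: str
        purpose: str
        test_stages: List[str] = field(default_factory=list)

    ENVIRONMENTS = [
        Environment("dev", "local builds and component tests", ["unit", "component"]),
        Environment("test", "automated system tests, reset before every run", ["system"]),
        Environment("integration", "manual and exploratory testing", ["acceptance"]),
    ]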
