This week we take a look at Continuous Integration tooling, with a particular focus on the Concourse open source project. We will review the key design principles behind this software that make it so attractive to developers, including our Pivotal Cloud Foundry developers around the globe.
- Feedback: firstname.lastname@example.org
Welcome to the Pivotal Perspectives podcast, the broadcast at the intersection of Agile, Cloud and Big Data. Stay tuned for regular updates, technical deep dives, architecture discussions and interviews. Now let’s join Pivotal’s Australia and New Zealand CTO, Simon Elisha, for the Pivotal Perspectives podcast.
Hello everyone and welcome back to the podcast, fantastic to have you back. Simon Elisha here, as always, just returned from some vacation. Switched off for a couple of weeks altogether, that’s no email, no phone; I can highly recommend it if you have not done it recently. There is a whole other world out there, other than our take. But, right back into it, and for today’s episode I thought it would be really interesting to explore a little bit about continuous integration pipelines and some of the tooling around that, in particular an open source solution called Concourse.
So Concourse is an open source project, which of course we love here, at Pivotal. You can find it at http://concourse.ci. That’s C O N C O U R S E dot C I. So think of the airline or airport metaphor of concourses and flying and stuff like that and that’s kind of the thing you will get to. And this is a project specifically focused around continuous integration and this is something we have practiced intensely in the projects that we’re working on within Pivotal obviously.
Now let me step back for a minute and just define some terms and then we will get into the details of Concourse itself. So Continuous Integration is of course the process of bringing change together within your software development team on a regular basis. So think about what happened in the past, or might still happen on some of your own projects depending on how fast you are with your software development mechanisms: a developer would go off and happily develop their code in isolation and then try and jam it into the other code in some way. This would inevitably break things, cause incompatibilities, cause testing problems, etc.
In a world where we are trying to move much more quickly, we are trying to be agile, we are trying to deliver change very quickly into production, that approach doesn’t work. So one of the fundamental tenets, if you like, of being agile is frequent small changes. So very small batch sizes, doing little things on a regular basis. But to do that you need to have a degree of automation and tooling that a lot of waterfall based software environments don’t have. So this means you need a mechanism by which the local developer can firstly develop work on their laptop, run some basic tests, etc. Then commit those changes into some sort of source control, so Git, Subversion or whatever your religion is, and then have those changes tested appropriately to make sure they integrate, and also make any adjustments or modifications required for integration, so you’re not breaking anything.
Then you make sure that whole process goes through a series of tests before it’s available for release either into pre-production or production. So typically this is called a “continuous integration pipeline,” because you think of it as a pipeline of tasks that have to take place. What happens over time is that this becomes a burden, a technical burden in and of itself, and what you end up having is people who are kind of experts on the CI pipeline itself rather than experts on the application they’re building. This is not necessarily a good thing, because again we want to get out of the way of the things that are boring and repeatable and get on with the things that are interesting, which are great new features and functionality.
So first things first, you have to have some sort of continuous integration technology. You might have heard of things like Jenkins, etc. There is a whole bunch of tools out there, some are hosted, some are not. Each of them has its own benefits and downsides, and it’s really up to you what you use. But Concourse has grabbed my attention recently and it’s grabbed my attention for a couple of reasons. Firstly it is a completely open source solution, which means you can deploy it as and where you like. It’s very simple and has some very simple tenets that it’s built upon that make it really suitable for not taking over your environment. It’s designed to be relatively lightweight I would say, very meaningful in terms of the information it shows and very quick to get up and running.
So let me maybe talk about some of the principles that Concourse has been built with, to see whether it aligns with the way your thinking might be. So the first concept is it should be simple. It really is the anti-complexity solution. The focus is not around adding lots of features and check boxes. There are just three primitives that it uses, and those three primitives are what drive the entire process. And we’ll talk about those primitives later on. But if you’re used to an environment where you sort of bring up a screen and there are twenty-seven different check boxes you have to choose and a million different hooks, etc., this ain’t that environment.
It’s meant to be usable, so it should be very simple for you to see the information you need, to get the logs you need and to visualize the work that is going on. One of the key aspects of any agile software development environment is visual communication; you need those information radiators there. So one thing I always look for when I walk into a development team’s office is what’s up on the wall from a visual perspective. Often the Kanban board, etc., is up there. There are post-it notes flying everywhere. But what I’m looking for would be LCD screens up on the wall showing me what’s happening in real time. How’s the build going? Is it broken? How many builds are going simultaneously? etc, etc. One of the things that Concourse does very well is have a very neat and simple graphing or visualization technique to show you what is going on in the build at any one time or what the relationships are between different components, which is really nice.
Another really important thing is the working components. With any continuous integration solution there are these worker nodes that typically will be run on a cloud of some sort, and these can be the bane of your existence if you use certain technologies. There are probably a few people nodding their heads, or maybe banging their heads against the desk, thinking about the solutions I’m talking about. In Concourse, the workers are stateless, and if you’ve been listening to any of my podcasts for years now, you’ll know I’m always banging on about the importance of statelessness and that it makes your life easier, etc. Essentially every task that executes within Concourse executes in a container defined by its own configuration. So you are not worrying about what other teams are doing with the same Concourse deployment, you’re not going to bump into each other, and it also means that nothing is a dependency.
This comes into the next part of the principles, which is scalable, reproducible deployment. Concourse deployments are not meant to be snowflakes, they are meant to be very repeatable. This is the cattle vs. pets metaphor that I’m sure you’re sick of hearing about now, but it is important. Essentially Concourse is configured statically and you can always recreate it from scratch with some magic, and the magic technology that underpins Concourse is BOSH. Now you may have heard me talking about BOSH in the past, because BOSH is also some of the magic that makes Pivotal Cloud Foundry possible and a wonder to maintain. BOSH also underpins Concourse, which means it’s very, very effective in managing your fleet in a super simple way across multiple platforms.
Another concept is that of flexibility. What it does is make sure all the features of Concourse are implemented in user-land rather than in the core of the product. So this means that you can essentially be monitoring, maintaining, and adjusting the resources that you use within the system. So you’ve got things like time resources, Git resources, Pivotal Tracker resources, semver resources, S3 resources, Docker image resources, a whole bunch, and you can create your own. This means that it is really easy to integrate into the pipeline, and it also really forces you to have a stateless system because everything is externalized as resources, which is very clever.
There’s also the concept of having local iteration, so instead of having to iterate in the general workflow, you can set up your own CI, configure your work locally and kind of get the pipeline working before you then send it across to be your public or commonly used pipeline. Very, very nice. Also everything is bootstrapped, so what is provided with the open source distribution is its own pipeline, so you can see how it’s building itself and you can use that as an example of what to do.
So essentially to get going with this you download the repo. It’s a BOSH release, it has everything you need to set it up, and you can set it up on your laptop, you can set it up on infrastructure, etc. It provides examples for things like Vagrant, but you can do it on Amazon, you can do it on vCloud Air, you can do it on OpenStack, you can do it on a bunch of different components. And it’s primarily command line driven, so you’re gonna want to use the command line pretty much, and the command line name is cool, it’s the Fly CLI, F L Y. So you can see the theme that we are going off of here, and this makes it really easy to use. And again you can tell that the people who worked on this have a strong Linux background because they are trying to build very simple expressive systems using very basic primitives. And the primitives are three: there are tasks, there are resources, and there are jobs.
A task is the execution of a script with dependent resources available to it. So for example, a test that you may be running. This makes it very simple and encapsulated, and you can execute a task manually or you can have a job that executes it, and we’ll talk about that shortly. Both of these work exactly the same, so you can test a task locally on your laptop and then send it to a job stream and have that work appropriately. Resources are any entity that can be checked for revisions, or pulled to fetch a specific version, or pushed up and out to create a new version. So a very common example would be things like a Git repository; you can also use things like time, the concept of time, as well.
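Editor's illustration: the three resource operations just described (check for revisions, pull a specific version, push a new version) and the idea of a task as a script run against its inputs can be sketched in Python. This is only a sketch under stated assumptions, not Concourse's actual implementation or API; the names `Resource`, `InMemoryRepo` and `run_task` are hypothetical and exist only for this example.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class Resource(ABC):
    """Hypothetical stand-in for a versioned external entity (not a real Concourse API)."""

    @abstractmethod
    def check(self, since: Optional[str]) -> List[str]:
        """Return the versions newer than `since`, oldest first."""

    @abstractmethod
    def get(self, version: str) -> str:
        """Pull the content stored at a specific version."""

    @abstractmethod
    def put(self, content: str) -> str:
        """Push new content and return the version it created."""


class InMemoryRepo(Resource):
    """Toy resource that keeps its versions in a list, standing in for a Git repository."""

    def __init__(self) -> None:
        self._contents: List[str] = []

    def check(self, since: Optional[str] = None) -> List[str]:
        # Versions are list indexes rendered as strings: "0", "1", ...
        start = 0 if since is None else int(since) + 1
        return [str(i) for i in range(start, len(self._contents))]

    def get(self, version: str) -> str:
        return self._contents[int(version)]

    def put(self, content: str) -> str:
        self._contents.append(content)
        return str(len(self._contents) - 1)


def run_task(script: Callable[[Dict[str, str]], str], inputs: Dict[str, str]) -> str:
    """A task here is just a script executed with its dependent inputs made available."""
    return script(inputs)


if __name__ == "__main__":
    repo = InMemoryRepo()
    first = repo.put("source code, revision one")
    print(repo.check(None))
    print(run_task(lambda inputs: inputs["source"].upper(), {"source": repo.get(first)}))
```

Because the task only sees what it is handed in `inputs`, the same function call behaves the same whether it is run by hand or by a job, which is the property described above.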
Now because we have this nice generic resource interface, we can extend it to do pretty much anything you want it to do. Essentially there are a lot of resources already created for you, and the community can continue to build more, so you can be very flexible in how you manage your resources, etc. And then the jobs, the third component, really describe the actions to perform when dependent resources change or there is some sort of manual trigger that you want to do. You can think of a job as a function with an input and an output that will run when its inputs are available, and a job can depend on the output of upstream jobs, so you can hook things together. Putting this all together is done by a build plan, which uses a really nice domain specific language that can express anything from some simple unit tests to a matrix of tasks and generate a result. It really is very, very powerful.
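Editor's illustration: the idea of a job as a function that runs when its inputs are available, and that can depend on upstream jobs, can be sketched as follows. This is a simplified model written for this transcript, not Concourse's scheduler or its build plan language; `Job`, `schedule` and the `passed` field are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Job:
    """Hypothetical job: an action to run for each new version of its input."""
    name: str
    action: Callable[[str], bool]                     # takes a version, returns True on success
    passed: List[str] = field(default_factory=list)   # names of upstream jobs that must succeed first
    attempted: Set[str] = field(default_factory=set)
    succeeded: Set[str] = field(default_factory=set)


def schedule(jobs: List[Job], versions: List[str]) -> List[str]:
    """Run each job once per eligible version and return a log of what ran."""
    by_name = {job.name: job for job in jobs}
    log: List[str] = []
    progress = True
    while progress:  # keep sweeping until no job has anything new to do
        progress = False
        for job in jobs:
            for version in versions:
                if version in job.attempted:
                    continue
                # A version is only eligible once every upstream job has succeeded with it.
                if not all(version in by_name[up].succeeded for up in job.passed):
                    continue
                job.attempted.add(version)
                ok = job.action(version)
                if ok:
                    job.succeeded.add(version)
                log.append(f"{job.name}:{version}:{'ok' if ok else 'failed'}")
                progress = True
    return log


if __name__ == "__main__":
    unit = Job("unit", lambda version: version != "v2")            # pretend v2 breaks the unit tests
    integration = Job("integration", lambda version: True, passed=["unit"])
    for line in schedule([unit, integration], ["v1", "v2"]):
        print(line)
```

In this model a version that fails upstream never reaches the downstream job, which is what hooking jobs together into a pipeline buys you.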
Now some of the last nice things you can do under the covers: you can be using, of course, as I mentioned, BOSH to manage your environment, which makes things really, really easy. Again you want to minimize the care and feeding of what’s going on. So if you want to manage your fleet of Concourse workers, and they’re stateless, which is a beautiful thing, you simply configure BOSH to say how many instances you want, and the magic happens and you don’t have to worry about it.
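Editor's illustration: "say how many instances you want and the magic happens" is a desired-state model, and a toy version of it can be sketched in Python. This is not BOSH's algorithm, manifest format or CLI; the `reconcile` function and the `worker-N` naming are assumptions made for this example, and it assumes a stateless worker can simply be deleted and recreated.

```python
from typing import Dict, List, Tuple


def reconcile(desired_count: int, workers: Dict[str, bool]) -> Tuple[Dict[str, bool], List[str]]:
    """Return the converged worker set and the actions taken to reach it.

    `workers` maps a worker name to True if that worker is currently healthy.
    """
    actions: List[str] = []
    result: Dict[str, bool] = {}

    # Stateless workers are disposable: drop any unhealthy one rather than repair it.
    for name in sorted(workers):
        if workers[name]:
            result[name] = True
        else:
            actions.append(f"delete {name}")

    # Scale down if there are more healthy workers than requested.
    surplus = max(0, len(result) - desired_count)
    for name in sorted(result, reverse=True)[:surplus]:
        del result[name]
        actions.append(f"delete {name}")

    # Scale up by filling the lowest free worker-N names.
    index = 0
    while len(result) < desired_count:
        name = f"worker-{index}"
        if name not in result:
            result[name] = True
            actions.append(f"create {name}")
        index += 1

    return result, actions


if __name__ == "__main__":
    fleet, steps = reconcile(3, {"worker-0": True, "worker-1": False})
    print(sorted(fleet))
    print(steps)
```

Running `reconcile` again on its own output produces no further actions, which is the point of describing the fleet by its desired state rather than by a sequence of manual steps.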
The other nice thing is that Concourse looks after, through the good agencies of BOSH, the health of the infrastructure. So it can make sure that your cluster is always in the desired state that you want: it can do heart beating, it can do maintenance, it will do retroactive fixing, it will do recreation of instances. Essentially it looks after the whole thing for you. The nice thing is that you can completely blow things away if you want to and recreate it from scratch, because again we are talking about a stateless infrastructure. So that’s a little bit about Concourse. Again, look it up, concourse.ci. If you’re not doing some sort of fully automated continuous integration for your software development process, stop listening and go get that set up.
You’ve got to have that. Often I visit customers who are trying to move into an agile workflow and this is one of the bits that tends to be missing. Now one of the nice things about this kind of workflow is that it of course feeds beautifully into deploying into a structured cloud platform, and what better structured cloud platform to deploy into than Pivotal Cloud Foundry. So of course Concourse is really useful for deploying onto Pivotal Cloud Foundry and getting applications up and running and operational. What Concourse can help take care of is all the other bits that happen beforehand: the integration tests, the stress tests, the compatibility tests, etc. Right to the point where you want to ship it, there is also a “ship it” task, so you can push that code into production, and that’s where the magic happens.
So I hope that’s been interesting, thought provoking and useful. Again, we love to get your feedback and your ideas; email@example.com is our email address for that. Until next time, keep on building.
Thanks for listening to the Pivotal Perspectives podcast with Simon Elisha. We trust you’ve enjoyed it and we ask that you share it with other people who may also be interested. And we would love to hear your feedback, so please send any comments or suggestions to podcast@pivotal.io. We look forward to having you join us next time on the Pivotal Perspectives Podcast.
About the Author
Simon Elisha is CTO & Senior Manager of Field Engineering for Australia & New Zealand at Pivotal. With over 24 years of industry experience in everything from Mainframes to the latest Cloud architectures, Simon brings a refreshing and insightful view of the business value of IT. Passionate about technology, he is a pragmatist who looks for the best solution to the task at hand. He has held roles at EDS, PricewaterhouseCoopers, VERITAS Software, Hitachi Data Systems, Cisco Systems and Amazon Web Services.