Pivotal People—Onsi Fakhouri, The Science of Diego

September 10, 2014 Stacey Schneider

The New Stack posted commentary on the Docker on Diego demo we first shared at VMworld, showing off the Go (open source programming language) rewrite of Cloud Foundry’s new container scheduler.

Onsi Fakhouri, former Berkeley physicist and product manager on the engineering team behind the Diego rewrite, explained the motivating factors and philosophies at the core of Diego at the CF Summit. He has also previously shared his views on this blog about Diego in the new multi-platform world and about distributed health management.

Now we are treated to a short Q&A. We cover Fakhouri’s background, talk about Go and refactoring, explore the Cloud Foundry Dojo, and discuss the challenges of developing distributed systems.

Tell us a little bit about yourself and how you came to work on Cloud Foundry.
After graduate school in Berkeley (I studied astrophysics), I came to the conclusion that I was more of an engineer than a scientist. I happened upon a booth at Cal’s “last-minute career fair” and had a very enthusiastic and intriguing conversation with Edward Hieatt and Davis Frank. So, I interviewed with Pivotal Labs and joined as a consultant in 2010—everything I really know about writing software I learned from pairing with pivots. I joined the Cloud Foundry team just over a year ago, at first as a labs consultant, but now as a dedicated engineer. The problems we’re solving are complex and deeply interesting—more like open research problems than straightforward engineering issues. In some ways, I feel like I’ve come full circle and am doing the sort of research that I was doing in grad school—but now actually shipping working code instead of arcane papers.

You are focused on the Go refactoring of the Cloud Foundry elastic runtime. You had a great presentation about Diego, the ‘what’ and ‘why’. Maybe you can now tell us ‘how’?
To call Diego a refactor is something of an understatement. We’re *rewriting* Cloud Foundry’s elastic runtime.

“How” is an interesting question. To paint with broad brush strokes: distributed systems are hard. In particular, ensuring that a suite of disparate, disconnected components can come together and perform something *useful* in the face of arbitrary failure (network partitions, latency, dropped messages, etc.) is quite challenging.

Traditionally, such problems are solved by expending enormous energy laying out a detailed architecture ahead of time. The culture at Pivotal (Labs in particular) is deeply suspicious of that sort of approach, favoring collaborative incrementalism and short feedback loops over overarching, divorced-from-reality, up-front design.

Our approach with Diego has tried to strike a balance between the two. We spent a few weeks upfront understanding the scope of the domain and the problem at hand and drew up a simple vision for the broad outline of Diego’s architecture. We then filled in details incrementally as we put Diego together—pairing and test driving as we went. We also invested heavily in building a comprehensive test suite to validate that our components work together, and in organizing our code around a centralized “script” that defines the coordination between the various components.

Most importantly, we applied the same red-green-refactor cycle commonly associated with TDD to the architecture itself. We’ve invested time in stepping back and looking at the architecture as it has continued to emerge. Think of it as “architectural-incrementalism”—we’ve listened to our code (writ-large) as it’s told us which way it’s headed and our architecture has been updated in response.

Just recently, for example, we were delighted to discover that our relentless push towards separating concerns has left us with a clean delineation between where Diego’s CF-specific concerns live and where Diego’s more generic scheduling-and-running components live. There was just one gnarly codepath that broke the clean separation. Once we were able to step back and see it, we refactored it away and ended up with a very nice separation. Diego now has many self-similar “Cells” that can be scaled horizontally to run and manage generic Tasks and Processes. CF support is then implemented by launching “CC_Bridge” VMs that run processes that connect the CF-specific world of “apps” to the more generic world of Diego’s Tasks and Processes. The beauty of this separation is that you can run Diego without these “CC_Bridge” VMs and, in other domains, still get all the scheduling, containerization, and health management that Diego provides for CF.
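To make the separation concrete, here is a minimal Go sketch of the idea of a bridge that translates a CF-specific “app” into a generic unit of work. The types `App` and `Process` and the function `Bridge` are hypothetical names invented for this illustration; they are not Diego’s actual types or API.

```go
package main

// App is a hypothetical CF-level description of an application.
// Only the bridge knows about this type.
type App struct {
	Name      string
	Command   string
	Instances int
}

// Process is a hypothetical generic long-running unit of work.
// It carries nothing CF-specific, so a scheduler that only
// understands Process values could serve other domains too.
type Process struct {
	ID       string
	Command  string
	Replicas int
}

// Bridge translates the CF-specific world of apps into the generic
// world of processes. All CF knowledge (here, just the naming
// scheme, which is an assumption) stays on this side of the boundary.
func Bridge(app App) Process {
	return Process{
		ID:       "cf-app-" + app.Name,
		Command:  app.Command,
		Replicas: app.Instances,
	}
}
```

In a design like this, dropping the bridge leaves the generic side untouched: anything that can produce a `Process` value can use the scheduler directly.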

Does working on an Open Source project with a lot of visibility change anything for you?
Not particularly. I personally prefer underpromising and overdelivering—and that’s a little hard to do when there’s a lot of scrutiny.

We can get better at facilitating contribution. Diego spans many repositories and it can be hard for the uninitiated to contribute back. So far, we’ve had the opportunity to spend a few weeks upfront pairing with our main contributors. I’m hopeful that github.com/cloudfoundry-incubator/diego-design-notes is a step in the right direction in terms of engineering-oriented documentation, but I think we have plenty of room to grow here.

If someone is really interested in getting involved, we have the Cloud Foundry Dojo. The Dojo is an opportunity for 6-12 week immersions inside one of the Cloud Foundry engineering teams. While you are in the Dojo, you’re literally part of our team. Participants should expect 40-hour weeks of pair programming, TDD, standups, and retrospectives. You’ll be working on Cloud Foundry, but you’ll also get a crash course in the Pivotal Labs engineering practices.

If you are serious about becoming a full-time contributor to any Cloud Foundry project, the Dojo might be the quickest path.

There’s a lot of interest in the Cloud Foundry community about the features Diego will enable. The demos are already starting to be quite impressive. What surprises have you had working on the project? Are there things you thought would be easy or hard that aren’t?
I continue to be surprised at how hard it is to build a robust distributed system. We’ve had these comprehensive integration test suites from the start and keeping them green and not-flaky has been challenging. This is, of course, a good thing. If our tests are red, then our code doesn’t work, and we need to know. I’m just surprised at how easy it is to introduce a benign-looking change in one component that modifies the intended behavior of the system as a whole. Without this sort of test coverage, we can’t imagine doing this type of engineering with any degree of confidence.

There are many places where we’ve set the bar high and have had a fair degree of success. This has been encouraging, but also (pleasantly!) surprising. Two things in particular come to mind: platform-independence and the distributed auction. To support a non-Unix-container-based platform in Diego today you literally just need to provide two things: a new Garden (the Go rewrite of the Warden container manager) backend that implements containerization on your target platform, and a small suite of binaries that get injected into your container. Seeing these come together and actually work has been quite affirming of our approach, and we have an amazing, open community helping to enable containerization on other platforms.

Similarly, the problem of distributing application instances across our cluster of cells is an interesting research problem. We considered, at first, taking a “brain”-style approach where we have a single centralized coordinator (the “brain”) that knows about the state of the system as a whole and can allocate instances as they come in. This would more closely mirror the existing Cloud Foundry architecture. We know there are problems with this approach and while they aren’t insurmountable, we decided to try something very different: to have the individual Diego Cells each be capable of allocating and distributing instances across Diego cells as they come in. We’ve managed to accomplish this with Diego’s distributed auction. While the auction still has room to grow and remains an open research project, I’m encouraged by our ability to modify it and improve upon it as we gather more and more real-life experience with it.
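The interview does not describe how the auction works internally, so the following Go sketch is only one plausible shape for the idea, not Diego’s algorithm. It assumes each cell computes a bid from its own local state and the lowest bid wins. The types `Cell` and `Instance`, the functions `Bid`, `Award`, and `RunAuction`, and the scoring rule (prefer cells with more free memory, penalize cells already running the same app) are all hypothetical.

```go
package main

import "errors"

// colocationPenalty is an arbitrary illustrative weight that discourages
// stacking instances of the same app on one cell.
const colocationPenalty = 4096

// Cell is a hypothetical stand-in for a Diego Cell. The fields are
// assumptions for illustration, not Diego's data model.
type Cell struct {
	ID        string
	FreeMemMB int
	Running   map[string]int // app name -> instances already on this cell
}

// Instance is a hypothetical request to place one application instance.
type Instance struct {
	App   string
	MemMB int
}

// ErrNoBids is returned when no cell has room for the instance.
var ErrNoBids = errors.New("auction: no cell can host the instance")

// Bid is computed by the cell from its own local state only.
// A lower score is a better bid; ok is false if the cell lacks capacity.
func (c *Cell) Bid(inst Instance) (score int, ok bool) {
	if c.FreeMemMB < inst.MemMB {
		return 0, false
	}
	leftover := c.FreeMemMB - inst.MemMB
	return c.Running[inst.App]*colocationPenalty - leftover, true
}

// Award records the placement on the winning cell.
func (c *Cell) Award(inst Instance) {
	if c.Running == nil {
		c.Running = make(map[string]int)
	}
	c.Running[inst.App]++
	c.FreeMemMB -= inst.MemMB
}

// RunAuction collects one bid per cell and awards the instance to the
// lowest bidder. Ties go to the cell that appears first in the slice.
func RunAuction(cells []*Cell, inst Instance) (*Cell, error) {
	var winner *Cell
	best := 0
	for _, c := range cells {
		score, ok := c.Bid(inst)
		if !ok {
			continue
		}
		if winner == nil || score < best {
			winner, best = c, score
		}
	}
	if winner == nil {
		return nil, ErrNoBids
	}
	winner.Award(inst)
	return winner, nil
}
```

Under this scoring, given two cells with 1024 MB and 2048 MB free, two 512 MB instances of the same app land on different cells: the first goes to the larger cell, and the colocation penalty then steers the second to the smaller one. The function that runs the auction keeps no global state of its own, which is the property that separates this style from a centralized “brain”.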

There is a lot of movement in the next generation of tools built for managing distributed systems. For example, I see a lot of similarities between the work happening in the Mesos and Kubernetes ecosystems and some of the work happening in Cloud Foundry. What’s your understanding of that landscape and how do you see this all evolving?
There is plenty of room for many “winners” in this space, and it’s exciting to see a variety of competing options emerge with various approaches and levels of maturity. Diego, in particular, is emerging as a generic platform but was definitely envisioned (at least initially) to cater to its primary use case, namely Cloud Foundry.

Apache Mesos is simply a communication infrastructure and protocol for enabling scheduling across distributed resources. Diego does that (admittedly with more specificity) and also does quite a lot more: containerization, log aggregation, routing, health-management, etc. Frameworks built on top of Mesos such as Marathon are more directly comparable to Diego.

Kubernetes is also comparable to Diego and brings some interesting innovation to the space (e.g., pods). Kubernetes started solving problems of generically coordinating containers. Our focus has been specifically supporting Cloud Foundry app deployment, but Diego is, I believe, headed towards becoming a generic solution as well—without sacrificing its ability and mandate to cater to Cloud Foundry.

The common theme in this new generation of tools is recognizing the uncertainty inherent in solving these distributed system problems at scale. The only thing that seems certain is that there are going to be more and more applications that need to operate reliably at scale.

What would you say about Diego to someone who is evaluating all the options for an organization’s next generation of projects?
The technology landscape provides a lot of options to take in and make sense of right now. All the options are making different assumptions and taking different philosophical approaches to solving similar problems. Cloud Foundry is committed to building the most enabling and operable deployment framework. It balances a set of assumptions and philosophies catered to enabling enterprise environments. Diego gives Cloud Foundry a more performant and extensible core going forward, but we have also abstracted that core. Diego can also solve problems for use cases outside of Cloud Foundry.

Of course, I’d want people to consider Diego’s driving principles of high availability and fault tolerance. Diego will come batteries-included vis-a-vis routing, log aggregation, and (soon!) application metrics and monitoring, in a platform-independent, modular framework developed transparently as open source software. The fact that Diego is written in Go, with Go’s momentum and industry support (especially for systems programming and management), might also be a consideration. Diego development also has the full backing of Pivotal and the Cloud Foundry ecosystem. I know this is early, but I think there is reason to believe Diego has a bright future helping organizations deploy and manage containers on a variety of platforms.
