The Forklifted Application

July 31, 2015 Josh Long

 

This content is part of the upcoming book on Cloud-Native Java that I am writing with Kenny Bastani on building Cloud-Native applications with Spring and Cloud Foundry. - Josh
 
 
 
 

The Contract

Community and customers alike are moving as many of their workloads as possible to platforms like Cloud Foundry. Cloud Foundry aims to improve velocity by reducing, or at least making consistent, the operational concerns associated with deploying and managing applications. Cloud Foundry is an ideal place to run online web-based services and applications, service integrations and back-office type processing.

Cloud Foundry optimizes for the continuous delivery of web applications and services by making assumptions about the shape of the applications it runs. The inputs into Cloud Foundry are applications—Java .jars, Ruby on Rails applications, Node.js applications, etc. Cloud Foundry provides well-known operational benefits (log aggregation, routing, self-healing, dynamic scale-up and scale-down, etc.) to the applications it runs. There is an implied contract between the platform and the applications that it runs. This contract allows the platform to keep promises about the services it provides.

Some applications may never be able to meet that contract. Other applications might be able to, albeit with some soft-touch adjustments. We will focus on the various strategies and soft-touch adjustments for coercing legacy applications to run on Cloud Foundry.

The goal isn’t, in this case, to build an application that’s native to the cloud. It’s to move existing workloads to the cloud to reduce the operational surface area, increase uniformity and improve velocity. Once an application is deployed on Cloud Foundry it is at least as well off as it was before and now you have one less snowflake environment.

I distinguish this type of workload migration, application forklifting, from building a Cloud-Native application. Much of what we talk about these days is about building Cloud-Native applications: applications that live and breathe in the cloud (they inhale and exhale as demand dictates) and that fully exploit the platform. That journey, while ideal and worth taking assuming the return on investment is tangible, is the subject of a different, much longer discussion.

Applications, broadly speaking, are the sum of their environment and their code. In this post, we’ll look at strategies for moving a legacy Java application from some of the environments that legacy Java applications typically live in. Many of these strategies can be generalized to apply to other languages and runtimes.

Migrating Application Environments

There are some qualities that are common to all applications, and those qualities—like RAM and DNS routing—are configurable directly through the cf tool, various dashboards, or in an application’s manifest.yml file. If your application is a compliant application that just needs more RAM, then you should be all set.

Support For Different Languages And Runtimes With Out-of-the-Box Buildpacks

Things sometimes just aren’t that simple, though. Your application may run in any number of snowflake environments, whereas Cloud Foundry makes very explicit assumptions about the environments its applications run in. These assumptions are encoded to some extent in the platform itself and in buildpacks. Buildpacks were adopted from Heroku. Cloud Foundry and Heroku don’t really care what kind of application they are running; instead they care about Linux containers that are ultimately operating system processes. Buildpacks tell Cloud Foundry what to do given a Java .jar, a Rails application, etc. A buildpack is really a set of callbacks (shell scripts that respond to well-known lifecycle hooks) that the runtime will use to ultimately create a Linux container to be run. This process is called staging.

Cloud Foundry provides many out-of-the-box system buildpacks. Those buildpacks can be customized or even completely replaced. Indeed, if you want to run an application for which there is no existing buildpack provided out of the box (by the Cloud Foundry community, Heroku or Pivotal) then at least it’s easy enough to develop and deploy your own. There are buildpacks for all manner of environments and applications out there, including one called Sourcey that simply compiles native code for you!

Customizing Buildpacks

These buildpacks are meant to be sensible and adaptable. As an example, the Java buildpack supports .wars (which it’ll run inside an up-to-date version of Apache Tomcat), Spring Boot-style executable .jars, Play applications, Grails applications, and much more.

If the system buildpacks don’t work for you and you want to use something different, you only need to tell Cloud Foundry where to find the code for the buildpack using the -b argument to cf push:

cf push -b https://github.com/a/custom-buildpack.git#my-branch custom-app

Often an existing buildpack can be made to do what you want to do, but with some minor tweaks. In the worst case, you can always just fork the code and make the tweak and specify the URI for the forked buildpack on cf push. Some buildpacks lend themselves to customization. The Java buildpack—which was originally developed by the folks at Heroku and which people working on Cloud Foundry have since greatly expanded—supports configuration through environment variables. The Java buildpack provides default configuration in the config directory for various aspects of the buildpack’s behavior. You can override the behavior by providing an environment variable prefixed with JBP_CONFIG_ of the same name as the configuration file, sans the .yml extension. Thus, borrowing an example from the excellent documentation, if I wanted to override the JRE version and the memory configuration (which lives in the config/open_jdk_jre.yml file in the buildpack), I might do the following:

cf set-env custom-app JBP_CONFIG_OPEN_JDK_JRE '[jre: {version: 1.7.0_+}, memory_calculator: {memory_heuristics: {heap: 85, stack: 10}}]'
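The naming rule described above is mechanical enough to sketch in a few lines of Java. This is only an illustration, not part of the buildpack: the class and method names are hypothetical, and the upper-casing is inferred from the JBP_CONFIG_OPEN_JDK_JRE example rather than from any stated rule.

```java
import java.util.Locale;

class JbpConfigName {

    /** Maps a buildpack config file name such as "open_jdk_jre.yml" to its override variable name. */
    static String forConfigFile(String fileName) {
        // Drop the .yml extension if present; the variable name never carries it.
        String base = fileName.endsWith(".yml")
                ? fileName.substring(0, fileName.length() - ".yml".length())
                : fileName;
        // Assumption: the name is upper-cased, as in the JBP_CONFIG_OPEN_JDK_JRE example.
        return "JBP_CONFIG_" + base.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(forConfigFile("open_jdk_jre.yml"));
    }
}
```

A helper like this is mostly useful in deployment scripts that need to set several overrides consistently; for a single override it is simpler to type the variable name by hand.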

Containerized Applications

Applications in the Java world that were developed for a J2EE / Java EE application server tend to be very sticky and hostile to migration outside of that application server. Java EE application servers, for all their vaunted portability, use class loaders that behave inconsistently, offer different subsystems that themselves often require proprietary configuration files and, to fill in the many gaps, often offer application server-specific APIs. If your application is completely intractable and these various knobs and levers don’t afford enough wiggle room to make the jump, there may be hope yet! Be sure to look through all the community buildpacks. There are buildpacks that stand up IBM’s WebSphere (with contributions from IBM, since they have a PaaS based on Cloud Foundry!) and Red Hat’s WildFly, as well.

Cloud Foundry “Diego” also supports running containerized (Docker, with other containers to come) applications. This might be an alternative if you’ve already got an application containerized and just want to deploy and manage it with the same toolchain as any other application. We’ve extracted some of the interesting scheduling and container-aware features of the forthcoming Cloud Foundry into a separate technology called Lattice. Lattice is Cloud Foundry by subtraction. If nothing else, you can use it to containerize and validate your existing application. We’ve even put together some nice guides on containerizing your Spring applications and then running them on Lattice!

We’ve run the gamut from commonplace configuration, to application- and runtime-specific buildpack overrides, to opaque containerized applications. I start any attempt to forklift an application in this order, with simpler tweaks first. The goal is to do as little as possible and let Cloud Foundry do as much as possible.

Soft-Touch Refactoring To Get Your Application Into The Cloud

In the last section we looked at things that you can do to wholesale move an application from its existing environment into a new one without modifying the code. We looked at techniques for moving applications, from simple ones that have fairly common requirements all the way to those with very exotic requirements. We saw that there are ways to all but virtualize applications and move them to Cloud Foundry, but we didn’t look at how to point applications to the backing services (databases, message queues, etc.) that they consume. We also ignored, for simplicity, that there are some classes of applications that could be made to work cleanly on Cloud Foundry with some minor, tedious, but feasible changes.

It always pays off to have a comprehensive test suite in place to act as a harness against regressions when refactoring code. I understand that—due to their very nature—some legacy applications won’t have such a test suite in place.

We’ll look mostly at soft-touch adjustments that you could make to get your application working, hopefully with a minimum of risk. It goes without saying, however, that—absent a test suite—more modular code will isolate and absorb change more readily. It’s a bitter irony then that the applications most in need of a comprehensive test-suite are the ones that probably don’t have it: large, monolithic, legacy applications. If you do have a test suite in place, you may not have smoke tests that validate connectivity and deployment of the application and its associated services. Such a suite of tests is necessarily harder to write but would be helpful precisely when undertaking something like forklifting a legacy application into a new environment.

Talking to Backing Services

A backing service is a service (databases, message queues, email services, etc.) that an application consumes. Cloud Foundry applications consume backing services by looking for their locators and credentials in an environment variable called VCAP_SERVICES. The simplicity of this approach is a feature: any language can pluck the environment variable out of the environment and parse the embedded JSON to extract things like service hosts, ports, and credentials.
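To make the "pluck and parse" step concrete, here is a minimal sketch in plain Java. It assumes the usual VCAP_SERVICES layout: a JSON object keyed by service label, where each value is an array of bound instances carrying a "name" and a "credentials" object. The class name, the instance name my-mongo, and the credential keys are hypothetical. Because the JDK ships no JSON parser, the sketch includes a deliberately tiny and lenient one; a real application would more likely lean on an established JSON library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class VcapServices {

    private final String json;
    private int pos;

    private VcapServices(String json) { this.json = json; }

    /** Returns the credentials of the bound instance with this logical name, or null if absent. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> credentialsFor(String vcapJson, String instanceName) {
        Map<String, Object> byLabel = (Map<String, Object>) new VcapServices(vcapJson).readValue();
        for (Object instances : byLabel.values()) {
            for (Object instance : (List<Object>) instances) {
                Map<String, Object> service = (Map<String, Object>) instance;
                if (instanceName.equals(service.get("name"))) {
                    return (Map<String, Object>) service.get("credentials");
                }
            }
        }
        return null;
    }

    // Tiny, lenient JSON reader: objects, arrays, strings, numbers, true/false/null.
    private Object readValue() {
        char c = peek();
        if (c == '{') return readObject();
        if (c == '[') return readArray();
        if (c == '"') return readString();
        if (json.startsWith("true", pos)) { pos += 4; return Boolean.TRUE; }
        if (json.startsWith("false", pos)) { pos += 5; return Boolean.FALSE; }
        if (json.startsWith("null", pos)) { pos += 4; return null; }
        int start = pos;
        while (pos < json.length() && "+-.eE0123456789".indexOf(json.charAt(pos)) >= 0) pos++;
        return Double.valueOf(json.substring(start, pos)); // throws if this is not a number
    }

    private Map<String, Object> readObject() {
        Map<String, Object> map = new LinkedHashMap<>();
        expect('{');
        while (peek() != '}') {
            String key = readString();
            expect(':');
            map.put(key, readValue());
            if (peek() == ',') pos++;
        }
        pos++; // consume '}'
        return map;
    }

    private List<Object> readArray() {
        List<Object> list = new ArrayList<>();
        expect('[');
        while (peek() != ']') {
            list.add(readValue());
            if (peek() == ',') pos++;
        }
        pos++; // consume ']'
        return list;
    }

    private String readString() {
        expect('"');
        StringBuilder sb = new StringBuilder();
        while (json.charAt(pos) != '"') {
            char c = json.charAt(pos++);
            if (c != '\\') { sb.append(c); continue; }
            char e = json.charAt(pos++);
            if (e == 'u') {
                sb.append((char) Integer.parseInt(json.substring(pos, pos + 4), 16));
                pos += 4;
            } else {
                sb.append("\"\\/\b\f\n\r\t".charAt("\"\\/bfnrt".indexOf(e))); // throws on an unknown escape
            }
        }
        pos++; // consume the closing quote
        return sb.toString();
    }

    /** Skips whitespace and returns the next character without consuming it. */
    private char peek() {
        while (Character.isWhitespace(json.charAt(pos))) pos++;
        return json.charAt(pos);
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        pos++;
    }

    public static void main(String[] args) {
        String vcap = System.getenv().getOrDefault("VCAP_SERVICES", "{}");
        System.out.println(credentialsFor(vcap, "my-mongo")); // null unless such an instance is bound
    }
}
```

Looking services up by their logical instance name, rather than by label or position, keeps the application indifferent to which broker or plan happens to sit behind that name.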

Applications that depend on Cloud Foundry-managed backing services can tell Cloud Foundry to create that service on-demand. Service creation could also be called provisioning. Its exact meaning varies depending on context; for an email service it might mean provisioning a new email username and password. For a MongoDB backing service it might mean creating a new Mongo database and assigning access to that MongoDB instance. The backing service’s lifecycle is modeled by a Cloud Foundry service broker instance. Cloud Foundry service brokers are REST APIs that Cloud Foundry cooperates with to manage backing services.

Once the broker is registered with Cloud Foundry, it is available through the cf marketplace command and can be provisioned on demand using the cf create-service command. This service is ready to be consumed by one or more applications. At this point the service is a logical construct with a logical name that can be used to refer to it.

Here’s a hypothetical service creation example. The first parameter, mongo, is the name of the service. I’m using something generic here but it could as easily have been New Relic, or MongoHub, or ElephantSQL, or SendGrid, etc. The second parameter is the plan name—the level and quality of service expected from the service provider. Sometimes higher levels of service imply higher prices. The third parameter is the aforementioned logical name.

cf create-service mongo free my-mongo

It’s not hard to create a service broker (we’ll look at that in a subsequent blog post!), but it might be more work than you need. If your application wants to talk to an existing, static service that isn’t likely to move and you just want to point your application to it, then you can use user-provided services. A user-provided service is a fancy way of saying “take this connection information and assign a logical name to it and make it something I can treat like any other managed backing service.”

A backing service, whether created using the cf create-service command or as a user-provided service, is invisible to any consuming applications until it is bound to an application; binding adds the relevant connectivity information to that application’s VCAP_SERVICES.

If Cloud Foundry supports the backing service that you need—like MySQL or MongoDB—and if your code has been written in such a way that it centralizes the initialization or acquisition of these backing services—ideally using something like dependency injection (which Spring makes dead simple!)—then switching is a matter of rewiring that isolated dependency. If your application has been written to support 12 Factor-style configuration where things like credentials, hosts, and ports are maintained in the environment or at least external to the application build then you may be able to readily point your application to its new services without even so much as a rebuild. For a deeper look at this topic, check out this blog on 12 Factor app style service configuration.

Often, however, it’s not this simple. Classic J2EE / Java EE applications often resolve services by looking them up in a well-known context like JNDI. If your code was written to use dependency injection then it’ll be fairly simple to rewire the application to resolve its connection information from the Cloud Foundry environment. If not, then you’ll need to rework your code and, ideally, do so by introducing dependency injection to insulate your application from further code duplication.
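As a rough illustration of that rewiring, here is a sketch in plain Java, without Spring, of hiding the lookup behind one small interface. Every name in it (JdbcUrlSource, ReportRepository, the JDBC_URL variable, the JNDI name) is hypothetical, and it assumes the legacy code resolves a plain JDBC URL string; real applications often look up a DataSource instead, but the shape of the seam is the same.

```java
import java.util.Collections;
import java.util.Map;
import javax.naming.InitialContext;
import javax.naming.NamingException;

class WiringDemo {
    public static void main(String[] args) {
        // On Cloud Foundry the wiring would pick the environment-backed source.
        Map<String, String> env = Collections.singletonMap("JDBC_URL", "jdbc:mysql://db.example.internal/reports");
        ReportRepository repository = new ReportRepository(new EnvJdbcUrlSource(env, "JDBC_URL"));
        System.out.println(repository.describe());
        // In the legacy application server the same repository would have been wired as:
        // new ReportRepository(new JndiJdbcUrlSource("java:comp/env/jdbc/reportsUrl"));
    }
}

/** The single place where the application asks where its database lives. */
interface JdbcUrlSource {
    String jdbcUrl();
}

/** Legacy behavior: resolve the URL from the application server's JNDI context. */
class JndiJdbcUrlSource implements JdbcUrlSource {
    private final String jndiName;

    JndiJdbcUrlSource(String jndiName) { this.jndiName = jndiName; }

    @Override
    public String jdbcUrl() {
        try {
            return (String) new InitialContext().lookup(jndiName);
        } catch (NamingException e) {
            throw new IllegalStateException("JNDI lookup failed for " + jndiName, e);
        }
    }
}

/** Cloud behavior: resolve the URL from the process environment (passed in so it can be tested). */
class EnvJdbcUrlSource implements JdbcUrlSource {
    private final Map<String, String> env;
    private final String variable;

    EnvJdbcUrlSource(Map<String, String> env, String variable) {
        this.env = env;
        this.variable = variable;
    }

    @Override
    public String jdbcUrl() {
        String value = env.get(variable);
        if (value == null) throw new IllegalStateException(variable + " is not set");
        return value;
    }
}

/** Business code depends only on the interface, so the wiring can change without touching it. */
class ReportRepository {
    private final JdbcUrlSource source;

    ReportRepository(JdbcUrlSource source) { this.source = source; }

    String describe() { return "reports stored at " + source.jdbcUrl(); }
}
```

The point of the design is that only the wiring code knows which source is in play; a dependency injection container such as Spring does the same job with less ceremony.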

Achieving Service Parity With Spring

In this section, we’ll look at some things that people tend to struggle with when moving applications to lighter weight containers and—by extension—the cloud. This is by no means an exhaustive list.

Cloud Foundry (like the majority of clouds) is HTTP-first. It supports individually addressable nodes, and it now even has support for non-routable custom ports, but these features work against the grain and aren’t supported in every environment. If you’re doing RPC with RMI/EJB, for example, then you’ll need to tunnel it through HTTP. Ignoring for now the wisdom of using RPC, it’s easier if you do RPC through HTTP. There are many ways to do this including XML-RPC, SOAP (bleargh!), and even Spring’s HTTP Invoker service exporters and service clients, which funnel RMI-style (Java-serialized) payloads through HTTP. This last option is surprisingly convenient and usable.

Cloud Foundry (like most cloud environments in general) doesn’t do well with multicast networking. If your application relies on multicast networking then things will be easier if you can find some other way to solve the problem. One use case commonly associated with multicast networking is HTTP session replication. You can get HTTP session replication, forgoing multicast networking, by using Spring Session. Spring Session is a drop-in replacement for the Servlet HTTP Session API that relies on an SPI to handle synchronization. The default implementation of this SPI uses Redis for distribution, instead of multicast. You just install Spring Session; you don’t have to do anything else to your HTTP session code. It in turn writes session state through the SPI. Redis is readily available on Cloud Foundry. As multiple nodes spin up, they all talk to the same Redis cluster, and benefit from Redis’ world-class state replication. Spring Session gives you a few other features too, for free. Read this blog for more.

Cloud Foundry doesn’t provide a (durable) file system. If your application requires a file system for durable persistence, consider using something like a MongoDB GridFS-based solution or an Amazon Web Services S3-based solution. If your application’s use of the file system is ephemeral (staging file uploads or something) then you can use the Cloud Foundry application’s temporary directory, but keep in mind that Cloud Foundry doesn’t guarantee anything. You don’t need to worry about things like where your database lives and where the application logs live, though; Cloud Foundry will handle all of that for you.
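For the ephemeral case, the safest habit is to treat a staged file as something that lives only for the duration of one unit of work. The sketch below uses only the JDK; the class name is hypothetical, and measuring the file size merely stands in for whatever real processing (such as handing the bytes to S3 or GridFS) the application would do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class UploadStaging {

    /** Stages the upload in a temp file, measures it, and always deletes the file afterwards. */
    static long stageAndMeasure(InputStream upload) {
        Path staged = null;
        try {
            // Created under java.io.tmpdir, i.e. the container's local, non-durable disk.
            staged = Files.createTempFile("upload-", ".part");
            Files.copy(upload, staged, StandardCopyOption.REPLACE_EXISTING);
            return Files.size(staged); // stand-in for real work on the staged file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (staged != null) {
                try {
                    Files.deleteIfExists(staged);
                } catch (IOException ignored) {
                    // Best effort: the file disappears with the container anyway.
                }
            }
        }
    }

    public static void main(String[] args) {
        byte[] fakeUpload = "hello, cloud".getBytes(StandardCharsets.UTF_8);
        System.out.println(stageAndMeasure(new ByteArrayInputStream(fakeUpload)) + " bytes staged and cleaned up");
    }
}
```

Nothing here survives a restart or is visible to other instances of the application, which is exactly the property to design around.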

I don’t know of a good JMS solution for Cloud Foundry. It’s invasive, but straightforward, to rework most JMS code to use the AMQP protocol, which RabbitMQ speaks. If you’re using Spring, then the primitives for dealing with JMS or RabbitMQ (or indeed, Redis’ publish-subscribe support) look and work similarly. RabbitMQ and Redis are available on Cloud Foundry.

If your application requires distributed transactions, using the X/Open XA protocol and JTA, it’s possible to configure standalone XA providers using Spring and it’s downright easy to do so using Spring Boot. You don’t need a Java EE-container hosted XA transaction manager.

Cloud Foundry terminates HTTPS requests at the highly available proxy that guards all applications. Any call that you route to your application will respond to HTTPS as well. If you’re using on-premises Cloud Foundry, you can provide your own certificates centrally.

Does your application use SMTP/POP3 or IMAP? If you are using email from within a Java application, you’re likely using JavaMail. JavaMail is a client Java API to handle SMTP/POP3/IMAP based email communication. There are many email providers-as-a-service. SendGrid, which is supported out of the box with Spring Boot 1.3, is a cloud-powered email provider that you use with JavaMail.

Similarly, if you have a strong need for a centralized identity provider to handle questions about users, roles, and authority, you might find that Stormpath is a worthy hosted service that can act as a facade in front of other identity providers, or be the identity provider itself. There’s even a very simple Spring Boot and Spring Security integration!

Next Steps

Hopefully, there aren’t any! The goal of this post was to address common concerns typical of efforts to move existing legacy applications to the cloud. Usually that migration involves some combination of the advice in this post. Once you’ve made the migration, have a strong cup of water! You’ve earned it. That’s one less thing to manage and worry about.

If you have time and interest, there is quite a lot to do in making the move to Cloud-Native applications. This very blog is routinely full of such posts.


About the Author

Josh Long is a Spring Developer Advocate at Pivotal. Josh is a Java Champion, author of five books (including O'Reilly's upcoming "Cloud Native Java: Designing Resilient Systems with Spring Boot, Spring Cloud, and Cloud Foundry") and three best-selling video trainings (including "Building Microservices with Spring Boot Livelessons" w/ Phil Webb), and an open-source contributor (Spring Boot, Spring Integration, Spring Cloud, Activiti and Vaadin).
