Thursday, October 10, 2013

Thinking Inside the Sandbox

At The New York Times, the tech department has been evolving. I've only been here for about a year, but even in that time, I've witnessed some big changes. For a large - and very old - organization, I've found this company to be rather innovative and willing to take risks to stay on the cutting edge of technology. Internally, however, there are certain areas where we haven't pushed the envelope, and when it comes to the development environment we use to build our website, we have been behind for a while. When I started at The Times, our dev environment was a combination of mounted file systems from remote servers mixed with a poorly organized Subversion repository that was extremely difficult to work with. All of those systems were put in place for good reasons, but over time the reasons became less relevant and the solutions more outdated.

Attempts had been made to reform it, but nothing stuck because the timing was never quite right. But with the redesign of NYTimes.com, I had the perfect opportunity to change this fundamental aspect of how my fellow developers and I work.

What I Did

I considered several options when approaching the task of building a proper development environment:

  • Configure my computer to function as the web application server by using Apache, PHP, Node and whatever other packages my web app needs to run.
  • Spin up a cloud computing instance such as Amazon EC2, install my packages there and mount the instance as a local filesystem with a tool like Macfusion.
  • Use a virtual machine tool like VirtualBox, running on top of my normal desktop operating system, to simulate the production environment that my app will run in.

I chose to take the third route. When writing code for a large, enterprise-scale application, I want to be confident that the testing I'm doing during development will be a good indication of how my code changes will function in staging. And I want to be extremely confident that it will work as expected in the production environment. Using VirtualBox gave me this confidence.

Achieving Confidence

How can I build applications I trust will work in my production environment and avoid getting stuck searching for bugs I can't replicate? I eliminate variables (and I'm not talking about the programming kind). The variables I'm talking about result in inconsistencies between my development and production environments. They are the kind that cause my application to run smoothly in one environment and poorly in another. They are the reason the code I've written and tested on my machine mysteriously doesn't work on staging.

My code depends on various packages and software to run properly. When these dependencies are inconsistent across my development and production environments, I can't be certain that my application will function properly.

Case in Point

For example, suppose I built my application locally, where my computer had PHP 5.4 installed, and then deployed it to a staging server running PHP 5.3. I might be in for a surprise. Say I wrote an array with the short bracket syntax that is so familiar to JavaScript developers: [value1, value2]. It works just fine on my machine, but on staging it fails with a parse error.

That's because this syntax was introduced in PHP 5.4. Creating arrays using an earlier version of PHP required me to use the array keyword: array(value1, value2).

Now imagine this type of problem everywhere in the codebase - it's the stuff of nightmares. One way to avoid these variables is to create a development environment using a virtual machine that replicates production.
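Short of replicating production, a version check can at least surface a mismatch like the PHP one above before it reaches staging. The following is a minimal sketch, not something from our codebase: the helper name version_at_least is hypothetical, the version strings are hard-coded for illustration, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_at_least() {
  local have="$1" need="$2"
  # After a version sort, the smaller version comes first. If that is the
  # required version, then the installed version is at least as new.
  [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

# Illustrative values only. On a real server the installed version could be
# read with: php -r 'echo PHP_VERSION;'
installed="5.3.27"
required="5.4.0"   # the short array syntax arrived in PHP 5.4

if version_at_least "$installed" "$required"; then
  echo "short array syntax is safe"
else
  echo "use array(...) instead"
fi
```

A check like this only reports the difference. It does nothing to remove it, which is why the virtual machine approach is the more thorough fix.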

How It Works

Using virtual machines to run an application is not a new idea. There are many tools out there to assist developers in making and managing a virtual machine. I used Vagrant, VirtualBox, Puppet and Git.

On production, we've been using CentOS as the operating system for our application. Naturally, the virtual machine should run the same operating system. As a first step, I put together a Vagrant box with that OS. Vagrant conveniently allows its users to package and share VM images with a few simple commands. Using Vagrant, I packaged a CentOS box running on top of VirtualBox and shared it with my teammates by hosting it on a file server.
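The packaging and sharing steps described above might look roughly like the sketch below. The box name, the file server URL and the two function names are hypothetical placeholders, not the ones we use. The --name option belongs to more recent Vagrant releases; I believe releases from around the time of this post took the name and URL as two positional arguments instead, so check vagrant box add --help for your version.

```shell
#!/usr/bin/env bash
# Hypothetical names: substitute your own box name and file server URL.
BOX_NAME="centos-base"
BOX_URL="http://files.example.com/boxes/centos-base.box"

# Run in the project directory of the VM you built:
# exports the VirtualBox machine into a shareable .box file.
package_box() {
  vagrant package --output "${BOX_NAME}.box"
}

# Run on a teammate's machine: download the box, write a Vagrantfile
# that references it, and boot the VM.
use_box() {
  vagrant box add --name "$BOX_NAME" "$BOX_URL" &&
    vagrant init "$BOX_NAME" &&
    vagrant up
}
```

The block only defines the two functions. Calling package_box once on the build machine, uploading the resulting file, and calling use_box on each teammate's machine would complete the hand-off.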

This was a crucial first step in ensuring we had consistent support for packages and configurations between environments. Now, when I installed software like Node or PHP on my virtual machine, I had much greater confidence that it would work exactly the same as it does on production. One big variable, eliminated.

But installing each individual package and manually setting each individual configuration on the VM is an unwieldy task. Not only that, but once I shared the VM with my team, how would I make configuration changes later on? Would I ask everyone to manually make the changes? Would I create a new image and share it with everyone, forcing my team to download a large VM image file each time I made a change? These are not very good options - luckily, there's a better way.

Using Puppet

Puppet is a tool that allows developers to declaratively configure Unix systems with files called Puppet manifests. With Puppet manifests, I can configure my VM however I see fit. Since Puppet manifests are just simple text files written in Puppet's own declarative language, I can track them in Git and share them with my teammates.

This means that if I've configured Puppet to install a Node module like Grunt at version 0.4.0 and later need to update it to 0.4.1, I simply change the version number in the Puppet manifest, commit and push the change, and ask the team to pull it and run Puppet on their sandboxes. Vagrant conveniently provides a simple command to run Puppet inside the VM: vagrant provision. It's a straightforward procedure that allows us to modify our sandboxes on demand as our needs evolve.

[Image: Vagrant provision]
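On a teammate's machine, picking up a manifest change comes down to two commands. Here is a small sketch that wraps them in a function. The name update_sandbox is hypothetical, and it assumes the Vagrantfile and the Puppet manifests live in the same Git checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the latest manifests, then re-run the
# provisioner inside the already-running VM.
# $1 is the sandbox checkout directory (defaults to the current directory).
update_sandbox() {
  local sandbox_dir="${1:-.}"
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$sandbox_dir" || exit 1
    # --ff-only refuses to create a merge commit if local history has diverged.
    git pull --ff-only && vagrant provision
  )
}
```

Running update_sandbox ~/sandbox (with whatever path holds the checkout) applies the pulled changes without rebuilding or re-downloading the VM image.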

This is extremely powerful and obliterates a ton of variables. I can now have confidence that the packages and configurations on my sandbox are the same as those on my teammates' sandboxes, the staging environment and production. Hence, I can expect consistent support for my application across environments.

Finally, I wrote a Bash script to download and install all the prerequisites needed to get up and running. The Bash script installs Git, VirtualBox and Vagrant, clones our core application repositories and performs an assortment of other tasks to streamline the process of getting developers going in our workflow.

[Image: Bash script]
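The script itself is specific to our setup, but its general shape can be sketched. The version below is a simplified illustration, not the real script: it only checks for the prerequisites instead of installing them, and the repository URL, the checkout directory and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Prints each requested tool that is not found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical repository URL and checkout directory: substitute your own.
REPO_URL="https://git.example.com/team/sandbox.git"
SANDBOX_DIR="$HOME/sandbox"

# Verifies the prerequisites, clones the repository if needed and boots the VM.
bootstrap() {
  local missing
  # VBoxManage is the command-line tool that ships with VirtualBox.
  missing="$(missing_tools git VBoxManage vagrant)"
  if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on a single line.
    echo "Install these first:" $missing >&2
    return 1
  fi
  [ -d "$SANDBOX_DIR" ] || git clone "$REPO_URL" "$SANDBOX_DIR"
  (cd "$SANDBOX_DIR" && vagrant up)
}
```

Nothing runs until bootstrap is called, so the file can be sourced safely. A fuller script would replace the error message with the actual installation steps for each missing tool.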

And that was it! By completing these steps, I created a fully automated sandbox that mimicked production. I can easily share it, change it and, most importantly, rely on it with confidence.

Further Reading

New solutions are constantly springing up to help alleviate the problems in software development and deployment. Among them, I think Vagrant stands out as a winner. There are other tools, however, which are also worth paying attention to. Two of these are Docker and Packer. They both aim to fundamentally change how we deploy software, opting to deploy applications as containers or machine images, packaged with all their dependencies intact. It's a different paradigm for software delivery and one that I hope I'll get the chance to explore in the future.

If you're interested in learning more about Docker, which I highly recommend that you do, check out this post. Docker is sure to change how many organizations handle development and deployments, and the post helps to explain why.

To learn more about how to build a sandbox like the one I described, Nettuts+ has a great step-by-step article describing the whole process. You might be surprised to find that it's not all that hard to get up and running.

Closing Thoughts

Software development isn't easy. We can make things a little easier by adopting smart standards, writing well-organized code, communicating effectively within our teams and using the right tools. One of the most critical tools - and a great place to start - is a thoughtfully built sandbox.


