Git on EC2: Spot On!

This is the first in a series of posts about running your own Git server on an EC2 image. The posts include:

Git on EC2: Spot On!

Git on EC2: Provision the Server

Git on EC2: Booting the Server

Git on EC2: Using Python to Automate Provisioning

In my previous post about moving into the cloud, I said that I wanted to host my own Git repositories in the cloud, on Amazon’s Elastic Compute Cloud (EC2). In this post, I’ll discuss some of the reasons I chose Amazon as my cloud provider.

I chose EC2 for a number of reasons:

Using Spot Instances, I could run the host quite cheaply.
I was not committed to any long-term contract.
I would be able to take advantage of other Amazon Web Services, such as S3 and Glacier.
I wanted to learn how to administer AWS-hosted systems.

There are a number of options for hosting a server. The cheapest is a shared hosted web server. But most vendors won’t let you run arbitrary services on your web server instance (it is supposed to just serve HTTP). The next cheapest is a Virtual Private Server (VPS). In most cases, a VPS will permit you to run just about any service on the server. The reality is that both options require long-term contracts (typically two years) to get the cheap prices. With a long-term contract you can expect to pay about 10 USD/month for a small (512 MB RAM) VPS; month-to-month pricing is closer to 20 USD/month.

Cloud hosting services do not typically require long-term contracts; they are priced by use. In most cases, this will be more expensive than a long-term contract with a VPS provider – but not more expensive than the month-to-month pricing of VPS vendors. Costs here (for compute and storage) are approximately 20 USD/month for a small virtual machine.

Cloud platforms such as Heroku and Azure are Platform as a Service (PaaS) offerings rather than Infrastructure as a Service (IaaS); they are designed to host applications but not generic servers. AWS and Rackspace are examples of cloud platforms that offer application hosting but also provide IaaS – meaning that I could host my own server within their cloud platform.

The feature that tipped the scale in Amazon’s favor was Spot Instances. The Spot Instance market is a kind of marketplace for unused capacity. You set a price threshold: the most you are willing to pay per hour to run the server. As long as capacity is available and the market price remains below your threshold, your server will run. If supply and demand conspire to push the price beyond your threshold, your server is terminated.
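To make the threshold rule concrete, here is a minimal sketch in Python. It is a toy model, not the AWS API: the function name spot_hours, the sample prices, and the 0.010 USD/hour threshold are all made up for illustration. It also assumes the server comes back as soon as the price drops again, which is a simplification.

```python
def spot_hours(market_prices, max_price):
    """Return one boolean per hour: True if the instance would be running.

    Toy model of the rule described above: the server runs while the
    market price is at or below the threshold and is down otherwise.
    """
    return [price <= max_price for price in market_prices]


# Hypothetical hourly market prices in USD/hour (not real AWS data).
prices = [0.003, 0.004, 0.004, 0.025, 0.004]

running = spot_hours(prices, max_price=0.010)
print(running)  # [True, True, True, False, True]
```

In this made-up series, the server is down only during the one hour in which the price spikes above the threshold.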

Today the price of a standard (non-spot) micro instance in the US East region is 0.020 USD/hour (14.40 USD/month), not including persistent storage costs. On the spot market a micro instance is running at less than 0.004 USD/hour (2.88 USD/month). I cannot find a cloud service anywhere else that can beat that price.
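The monthly figures above follow from the hourly rates if you assume a 30-day month (720 hours). A small Python sketch of the arithmetic, using only the prices quoted in this post; the helper name monthly_cost is hypothetical, and the 0.80 USD/month boot volume charge is the one from the footnote:

```python
HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month, i.e. 720 hours


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Convert an hourly rate in USD into a monthly cost, rounded to cents."""
    return round(hourly_rate * hours, 2)


on_demand = monthly_cost(0.020)  # standard micro instance
spot = monthly_cost(0.004)       # micro instance on the spot market
boot_volume = 0.80               # 8 GB EBS boot volume, per the footnote

print(f"on-demand:      {on_demand:.2f} USD/month")           # 14.40
print(f"spot:           {spot:.2f} USD/month")                # 2.88
print(f"spot + storage: {spot + boot_volume:.2f} USD/month")  # 3.68
```

Even with the boot volume included, the spot instance comes to roughly a quarter of the standard price.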

Spot instances are great for solutions that do not require high availability. They are great for solutions that need massive parallelism for short periods of time (like running a batch job once a day, or solving large one-time problems). And they are great for experimenting with AWS. I would not use them for a service that requires 100 percent uptime or that is used by many people.

There are some downsides to using Spot Instances. They require that you use an EBS boot volume instead of instance storage; this means you are charged for the boot volume storage¹. They can be terminated at any time. They may not be available in all regions or availability zones.

In practice, I’ve found that my spot instance is only rarely terminated. As I’m the only one using the services on the instance, and not using them very heavily, I’m unlikely to be critically impacted by any outage, and am more likely to not notice the outages at all.

The next post will discuss my OS choices and how I provision and manage the server. The actual serving of Git repos will be in a subsequent post.

¹ In my case, the boot volume is 8 GB, which translates to 0.80 USD/month.