Building Mobile Infrastructure with HBase

Recently, I had the opportunity to speak at HBaseCon 2012 about Urban Airship’s experience deploying a write-heavy environment on HBase. The talk covered mistakes to avoid, along with tips and open source utilities you can use to diagnose and debug performance bottlenecks.

The performance demands of mobile infrastructure are similar to those of any large web site, with the added complexities of serving multiple mobile OSes, maintaining open connections to hundreds of millions of devices, and running a frontend API that sustains thousands of requests per second. But with mobile, if your system is slow or inefficient, you waste end users’ battery life.

We needed a really big distributed database cluster that wouldn’t fall over under the load we were getting, including tremendous spikes like Christmas, when traffic increases 600% overnight. And, like others in the industry, we have found after some trial and error that HBase offers operational ease and a low-latency, high-throughput system with known scalability characteristics.

Check out the video of my talk, or catch Dave Revell and me at OSCON, where we’ll talk about HBase and other open source architectural components of Urban Airship’s massively scalable messaging infrastructure. We’ve also open sourced a number of tools that you may find useful, including:

  • StatsHTable is a tool for measuring the performance of HBase clusters in real time. It works by wrapping your HTable object with a StatsHTable object that measures the latency of every call to the underlying HTable.
  • Data Cube is a Java implementation of a data cube, an abstraction for counting things in complicated ways, backed by a pluggable database backend.
  • HBackup transfers large files between HDFS and S3 and keeps them up to date.
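
To make the wrapping idea behind StatsHTable concrete, here is a minimal Java sketch of the general pattern: a decorator that times every call to an underlying table object. It is not StatsHTable's actual code or API. The KeyValueTable interface, the InMemoryTable stand-in, and the TimedTable class are all hypothetical names invented for this illustration, and a real wrapper would sit in front of an HBase HTable rather than an in-memory map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class LatencyWrapperSketch {

    /** Hypothetical stand-in for the table operations we want to time. */
    interface KeyValueTable {
        void put(String row, String value);
        String get(String row);
    }

    /** Trivial in-memory table so the sketch runs without a cluster. */
    static class InMemoryTable implements KeyValueTable {
        private final Map<String, String> rows = new ConcurrentHashMap<>();
        public void put(String row, String value) { rows.put(row, value); }
        public String get(String row) { return rows.get(row); }
    }

    /** Decorator that records call counts and total latency per operation. */
    static class TimedTable implements KeyValueTable {
        private final KeyValueTable delegate;
        private final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
        private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

        TimedTable(KeyValueTable delegate) { this.delegate = delegate; }

        private void record(String op, long startNanos) {
            long elapsed = System.nanoTime() - startNanos;
            callCounts.computeIfAbsent(op, k -> new LongAdder()).increment();
            totalNanos.computeIfAbsent(op, k -> new LongAdder()).add(elapsed);
        }

        public void put(String row, String value) {
            long start = System.nanoTime();
            try {
                delegate.put(row, value);
            } finally {
                record("put", start); // recorded even if the call throws
            }
        }

        public String get(String row) {
            long start = System.nanoTime();
            try {
                return delegate.get(row);
            } finally {
                record("get", start);
            }
        }

        /** Number of calls seen for an operation, 0 if never called. */
        long count(String op) {
            LongAdder adder = callCounts.get(op);
            return adder == null ? 0 : adder.sum();
        }

        /** Mean latency in microseconds, 0.0 if the operation was never called. */
        double meanMicros(String op) {
            long calls = count(op);
            if (calls == 0) return 0.0;
            return totalNanos.get(op).sum() / (double) calls / 1_000.0;
        }
    }

    public static void main(String[] args) {
        TimedTable table = new TimedTable(new InMemoryTable());
        for (int i = 0; i < 1000; i++) {
            table.put("row-" + i, "value-" + i);
        }
        table.get("row-42");
        System.out.println("puts: " + table.count("put")
                + ", mean put latency (us): " + table.meanMicros("put"));
        System.out.println("gets: " + table.count("get")
                + ", mean get latency (us): " + table.meanMicros("get"));
    }
}
```

Because the wrapper implements the same interface as the object it wraps, calling code does not need to change, which is what makes this style of measurement cheap to drop into an existing client.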

Finally, if you’d like to join a team that is pushing the boundaries of scalability as we drive towards supporting billions of connected devices, please check out our jobs page.