re:Invent 2014 – Highlights from Las Vegas

Last week AWS held their third annual customer conference. In between the pub crawls and one killer party were a ton of great tech sessions and product announcements. Here are my thoughts on a few of the standouts.

Scale

Allow me to play Captain Obvious for a minute: AWS is on a tear and continuing to experience hyper-growth. Here are a few stats that hammer home just how massive AWS is:

  • They recently topped 1M active customers (defined as non-Amazon customers who used AWS during the month)
  • 137% year-over-year data transfer growth to/from S3 object storage
  • 99% year-over-year growth in number of EC2 instance hours used
  • The amount of compute power AWS adds every day would be enough to run all of the Amazon.com e-commerce business when it was a $7B per year business in 2004
  • AWS has 5x the compute capacity of the next 14 top cloud providers combined (Gartner)

Here’s an interesting segment in which Andy Jassy proclaims AWS the “fastest growing multi-billion dollar enterprise IT company in the world,” with revenue growth > 40% year over year. Note in his slide that many of the traditional “enterprise vendors” are near flat or declining. A sea change is underway in the Enterprise.

This may sound like I’m an AWS fan-boy, but that’s not the intent. I’ve run various workloads on Google Compute, Azure, Rackspace, Heroku, Digital Ocean, and Joyent. They all have some interesting aspects. Here’s the point: AWS is in the driver’s seat in public cloud and showing no sign of letting up (let’s just hope others can keep them in check – pricing pressure would be nice).

EMR: Hadoop + EMRFS + S3 = Elastic Clusters

Netflix, Cloudera, and AWS all gave some interesting talks on EMR (Elastic MapReduce – the AWS managed Hadoop service). While EMR has been around for several years, recent enhancements in EMRFS have made certain uses really compelling. EMRFS enables the data nodes in your Hadoop cluster to read/write directly from the S3 object store, decoupling compute from storage.
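
To make the decoupling concrete, here's a minimal PySpark sketch of what that looks like from a job's point of view: on an EMR cluster, s3:// paths are served by EMRFS, so the job reads from and writes to S3 with no HDFS copy step. The bucket, paths, and column names below are made up, and it uses the current PySpark API.

```python
# Minimal sketch: on EMR, s3:// paths go through EMRFS, so this job reads
# its input from S3 and writes results back to S3 -- no HDFS staging.
# Bucket, paths, and the "event_type" column are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("emrfs-example").getOrCreate()

# Read event data directly from the S3 "source of truth"
events = spark.read.parquet("s3://my-analytics-bucket/events/2014/11/")

# Do some work, then write the results back to S3
daily_counts = events.groupBy("event_type").count()
daily_counts.write.parquet("s3://my-analytics-bucket/reports/daily_counts/")
```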

This may sound a bit counter-intuitive to veterans of the Hadoop world. What about data locality? A basic premise of MapReduce is to parallelize compute by bringing the compute to the data. I'm sure this is a case of your mileage may vary, but Netflix claims only a 5-10% penalty in their Presto read workload when running data off of S3 versus local spinning disks. In Eva Tse's talk she also gave some interesting insights into their transition from sequence files to Parquet and their usage of Presto for SQL-based analysis.

Several companies were talking about using S3 as their “source of truth” for event and analytics data, and using EMRFS to interop with it. Here are a few scenarios in which EMRFS could really be compelling:

  • You can now resize the compute nodes in your cluster to better meet workload demands, so no need to have static clusters sized to “top line” compute and data storage needs.
  • If you have periodic heavy ETL jobs, spin them up on a separate cluster of spot market instances so you're only paying for that compute when needed (see the sketch after this list).
  • Want to use one of the newer SQL or machine learning frameworks on Hadoop without disturbing your primary cluster? Spin up Spark, Impala, Presto, etc. on another EMR cluster as needed.
  • During upgrade cycles, you can temporarily run old and new clusters side-by-side.
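
Here's a rough sketch of the spot-market ETL scenario using boto3 (the current Python SDK): a transient cluster that runs one step and terminates. The cluster name, bucket, job script, instance types, sizes, and bid price are all placeholders, and the roles assume the default EMR service/instance roles.

```python
# Sketch: spin up a transient EMR cluster on spot instances for a periodic
# ETL job that reads from and writes to S3 via EMRFS, then terminates.
# Names, sizes, and bid price are placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="nightly-etl",
    ReleaseLabel="emr-5.36.0",          # pick a release with the apps you need
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE", "Market": "SPOT",
             "BidPrice": "0.20", "InstanceType": "m5.2xlarge", "InstanceCount": 10},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,   # terminate when the ETL step finishes
    },
    Steps=[{
        "Name": "etl",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://my-analytics-bucket/jobs/nightly_etl.py"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Started cluster", response["JobFlowId"])
```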

Ian Meyers of AWS gave a great talk on EMR best practices, touching on file sizing, compression codecs (hint: use a streaming codec), and cluster resizing.
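
On the codec hint, here's one way that advice might translate into an EMR configuration classification (available on release 4.x and later): compress intermediate map output with Snappy, a streaming-friendly codec, rather than GZIP. The property names are standard Hadoop settings; treat the exact classification as something to verify against the EMR docs.

```python
# Sketch: configuration classification that compresses intermediate map output
# with Snappy. This list would be passed as the Configurations argument to
# run_job_flow on EMR release 4.x or later.
streaming_codec_config = [{
    "Classification": "mapred-site",
    "Properties": {
        "mapreduce.map.output.compress": "true",
        "mapreduce.map.output.compress.codec":
            "org.apache.hadoop.io.compress.SnappyCodec",
    },
}]
```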

Aurora: MySQL Re-engineered for the Cloud

AWS announced Aurora at re:Invent – it’s a MySQL-compatible relational database engine. During the announcement they threw out some pretty impressive performance and scale numbers:

  • 100K SQL INSERTs/second from four client machines running 1,000 threads each
  • 500K SQL SELECTs/second from one client machine running 1,000 threads

OK, they got my attention. But, being the skeptic on perf stats that I am, I wanted to dig in more. They’ve made several fundamental design changes to MySQL, breaking apart the monolithic process into three separate services for 1) SQL parsing/planning/transactions, 2) buffer cache, and 3) logging/storage. This session describes the architectural changes they made.

  • The Storage Layer is SSD-based and scale-out; it partitions the workload to avoid hot spots and supports data encryption at rest and continuous backup.
  • By breaking out the Caching Layer into a separate process, you can restart your database in a few seconds and return with a warm cache immediately.
  • Aurora runs in a VPC, enabling network isolation from other AWS customers (and other portions of your AWS infrastructure).

Replication and Crash Recovery Changes

A common strategy for scaling reads with MySQL is to set up replication and direct read requests to the replicas. But in systems with heavy write activity, replicas often fall behind. This creates tricky scenarios for data consistency within your application and also compromises your high availability strategy. (If the master fails, do you try to fix it to avoid data loss? Do you flip the replica to master to get back online immediately? Do you extend your downtime and wait for the replica to catch up?) Replicas also put additional load on the master. In Aurora the responsibility for logging and replication is pushed down to a shared storage layer. They’re claiming up to 15 replicas with little load on the master and ~ 10 ms slave lag. In one of their tests a slave experienced only a 7.2 ms slave lag while 13.8K updates/sec were being applied on the master. Contrast that to ~ 2 second slave lag with only 2K updates/sec on the master (using standard MySQL 5.6 on the same underlying hardware).
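
For readers who haven't run this pattern, here's a bare-bones sketch of the read-scaling strategy described above, using PyMySQL as a stock MySQL driver: writes go to the master/writer endpoint, reads go to a replica endpoint, and the application has to live with whatever lag the replica has. Endpoints, credentials, and the schema are placeholders.

```python
# Sketch of classic MySQL read scaling: writes to the master, reads to a
# replica. Endpoints, credentials, and table names are placeholders.
import pymysql

writer = pymysql.connect(host="mydb-writer.cluster-xyz.us-east-1.rds.amazonaws.com",
                         user="app", password="secret", database="orders")
reader = pymysql.connect(host="mydb-replica-1.xyz.us-east-1.rds.amazonaws.com",
                         user="app", password="secret", database="orders")

with writer.cursor() as cur:
    cur.execute("INSERT INTO orders (customer_id, total) VALUES (%s, %s)", (42, 19.99))
writer.commit()

# Reads hit the replica -- the application must tolerate replication lag here.
with reader.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM orders WHERE customer_id = %s", (42,))
    print(cur.fetchone())
```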

They claim “instant” crash recovery due to the shared storage layer and the ability to apply redo logs to each storage segment on demand, in parallel, and asynchronously. Hmm … interesting approach. But when you’re running on public cloud without access to the underlying hypervisor and network, it’s really hard to simulate these scenarios and be assured that your recovery mechanisms will perform when needed. I was happy to see they exposed commands to enable you to simulate failures of a database node, network, and storage.
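
As a rough illustration of how you might drive those failure simulations from a test harness: the statement below is one of the Aurora fault-injection queries as I understand them, but verify the exact syntax and options against the current Aurora documentation before relying on it. The endpoint and credentials are placeholders.

```python
# Sketch: issue an Aurora fault-injection statement from a test harness and
# measure recovery. The statement shown is my understanding of the Aurora
# fault-injection queries -- check the docs for exact syntax. Endpoint and
# credentials are placeholders.
import pymysql

conn = pymysql.connect(host="mydb.cluster-xyz.us-east-1.rds.amazonaws.com",
                       user="admin", password="secret", database="mysql")
try:
    with conn.cursor() as cur:
        cur.execute("ALTER SYSTEM CRASH INSTANCE")  # simulate a DB instance crash
except pymysql.err.OperationalError:
    # The connection drops when the instance "crashes"; a real harness would
    # now time how long recovery/failover takes before reconnecting.
    pass
```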

Compatibility?

AWS says Aurora is wire compatible with MySQL and any client libraries should interop with it just fine. But what about SQL dialect? From talking with the product team, I get the impression they're not claiming 100% compatibility. I'm assuming SQL scripts in your admin bag of tricks that reference INFORMATION_SCHEMA or InnoDB-specific options will need changes due to the fundamental design differences in Aurora.
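
One way to get a feel for the gaps is a quick probe with a stock MySQL driver: see what version string the engine reports and which InnoDB-related variables it actually exposes before pointing your admin scripts at it. This is just a sketch; the endpoint and credentials are placeholders.

```python
# Sketch: compatibility probe -- connect with a stock MySQL driver and dump
# the version string and InnoDB-related variables the engine reports.
# Endpoint and credentials are placeholders.
import pymysql

conn = pymysql.connect(host="mydb.cluster-xyz.us-east-1.rds.amazonaws.com",
                       user="admin", password="secret")
with conn.cursor() as cur:
    cur.execute("SELECT @@version, @@version_comment")
    print(cur.fetchone())

    cur.execute("SHOW VARIABLES LIKE 'innodb%'")
    for name, value in cur.fetchall():
        print(name, "=", value)
```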

Putting it in Perspective

When RDS MySQL was first introduced five years ago it was interesting, but many intensive workloads simply couldn't be run on it due to the lack of IO throughput and virtualization overhead. With an on-premises solution, by contrast, you could run on bare metal for CPU and network performance and connect to tricked-out storage solutions for high IOPS. In recent years AWS has evolved, with Provisioned IOPS enabling SAN-level IO performance, and hypervisor/kernel improvements enabling EC2 instances that approach bare metal performance for CPU and network IO. So today you can provision RDS MySQL in configurations that begin to approach on-premises performance and scale. Now, fast forward to mid-2015 and assume Aurora gets baked and solid for production-level use. The interesting twist here is that Aurora could potentially outperform and outscale even the high-end, tricked-out on-premises MySQL instances you can build.

Containers: All in on Docker

Linux containers have generated a lot of buzz in recent years. Why all the fuss? And why would you want to containerize workloads on AWS? Here are a few reasons:

  • Isolation and Dependency Management
  • Task Startup Latency – EC2 instances start up in minutes, while containers can spin up in < 1 second
  • Resource Granularity – for some tasks/processes, dedicating an entire EC2 instance to them can be a waste. But “packing” multiple tasks onto an instance introduces dependency complexities. Containers offer a nice middle ground and help you drive higher utilization of your EC2 instances.

Several companies, such as Netflix and Airbnb, have been running large fleets of container-based production workloads on EC2 for a while, using Mesos as their cluster manager. But this can be a challenge. Both Netflix and DataDog had some interesting sessions speaking to the challenges of running containers at scale in production.

To help ease these challenges, AWS decided to add container support directly into their platform. The newly announced ECS (EC2 Container Service) provides APIs and cluster management for deploying a fleet of Docker containers on your EC2 instances. ECS ties into Docker repositories and allows you to define tasks (tasks specify containers and memory/CPU resource requirements for those containers). Then ECS handles allocating those tasks among your EC2 instances.
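
Here's a minimal sketch of that flow using boto3: register a task definition (the container image plus its CPU/memory requirements), then ask ECS to place some copies of it on the cluster. The image name, family, cluster name, and resource sizes are placeholders.

```python
# Sketch: register an ECS task definition and let the ECS scheduler place it.
# Image, family, cluster name, and resource sizes are placeholders.
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# A task definition names the container image plus its CPU/memory requirements
ecs.register_task_definition(
    family="web-api",
    containerDefinitions=[{
        "name": "web-api",
        "image": "myorg/web-api:1.4.2",
        "cpu": 256,          # CPU units (1024 = one vCPU)
        "memory": 512,       # MiB
        "portMappings": [{"containerPort": 8080, "hostPort": 0}],  # dynamic host port
    }],
)

# Let ECS choose which EC2 instances in the cluster run the containers
ecs.run_task(cluster="prod", taskDefinition="web-api", count=3)
```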

And while the ECS scheduler doesn't yet appear to have the sophistication of cluster frameworks such as Marathon or Chronos, the team has opened up APIs to enable plugging in your own resource scheduler. Smart move.
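
The hook, roughly, is that ECS exposes cluster state and a StartTask call that places a task on a container instance you choose (RunTask, by contrast, lets ECS's own scheduler decide). A toy sketch of a custom placement decision, with cluster and task names as placeholders:

```python
# Sketch: the hook a custom scheduler would use -- read cluster state, pick a
# container instance yourself, and place a task on it with StartTask.
# Cluster and task definition names are placeholders.
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

instance_arns = ecs.list_container_instances(cluster="prod")["containerInstanceArns"]
state = ecs.describe_container_instances(cluster="prod",
                                         containerInstances=instance_arns)

# Trivial placement policy: pick the instance with the most remaining memory
def remaining_memory(ci):
    return next(r["integerValue"] for r in ci["remainingResources"]
                if r["name"] == "MEMORY")

target = max(state["containerInstances"], key=remaining_memory)

ecs.start_task(cluster="prod", taskDefinition="web-api",
               containerInstances=[target["containerInstanceArn"]])
```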

ECS already ties into VPCs, Security Groups, and network ACLs to help secure your container-based deployments. In the near term they also plan to integrate with more AWS services, including ELB, CloudWatch, and CloudFormation, as well as with CoreOS AMIs.

Performance

Several sessions also talked about how to reduce the virtualization tax on EC2, and squeeze more performance from your instances.

With C3, R3, and I2 instances, you can enable Enhanced Networking. It uses SR-IOV (single root I/O virtualization), in which each guest VM gets its own hardware-virtualized NIC. This can dramatically improve throughput and latency, and produces less jitter (note the logarithmic scale on the latency chart in that session).
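
For what it's worth, checking and flipping the switch is a one-liner each with boto3: inspect the sriovNetSupport attribute and set it. The instance has to be stopped and its AMI needs the Intel ixgbevf driver; the instance ID here is a placeholder.

```python
# Sketch: check whether an instance has SR-IOV enhanced networking enabled,
# and enable it (on a stopped instance whose AMI includes the ixgbevf driver).
# Instance ID is a placeholder.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

attr = ec2.describe_instance_attribute(InstanceId="i-0123456789abcdef0",
                                       Attribute="sriovNetSupport")
print(attr.get("SriovNetSupport"))   # {'Value': 'simple'} when enabled

# Enable SR-IOV enhanced networking
ec2.modify_instance_attribute(InstanceId="i-0123456789abcdef0",
                              SriovNetSupport={"Value": "simple"})
```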

Brendan Gregg from Netflix gave a great session on choosing the right instance type and performance tuning your EC2 instances. He helped demystify the various Xen modes and explained how to make sure you’re using the right one. Hint: Use newer kernels that support PV-HVM like recent Amazon Linux AMIs. Oh, and be sure to check out Brendan’s special tools.

Then, for those most desperate times when you just can’t throw enough hardware at a problem, AWS has introduced a new C4 instance type, with 36 vCPUs powered by a custom Intel 2.9 GHz E5 Haswell processor. Combine that with 10 Gbps networking to 20,000 IOPS SSD-backed EBS volumes and that’ll be one heck of a screamer!

I’d hate to see the bill for a large fleet of C4 instances, but it’s nice to know it’s an option. And that’s one of the transformative things about AWS: the ability to experiment quickly and pay as you go. Want to take that fancy new C4 instance backed by 20,000 IOPS of storage for a spin?  It’s a relatively cheap and quick effort to get “real” answers to just how your workload will perform on that type of infrastructure.
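
And that experiment really is just a handful of lines. Here's a hedged sketch of launching a c4.8xlarge with an EBS-optimized Provisioned IOPS data volume; the AMI ID, key name, and volume size/IOPS are placeholders, and io1 volumes require an IOPS-to-size ratio within AWS's current limits.

```python
# Sketch: launch a c4.8xlarge with an EBS-optimized io1 data volume for a
# quick "how fast could this go" experiment. AMI, key name, and volume
# size/IOPS are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="c4.8xlarge",          # 36 vCPUs, 10 Gbps networking
    MinCount=1, MaxCount=1,
    KeyName="my-key",
    EbsOptimized=True,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvdb",
        "Ebs": {"VolumeType": "io1", "VolumeSize": 500,
                "Iops": 20000, "DeleteOnTermination": True},
    }],
)
```

Run the benchmark, look at the numbers, and terminate the instance; the experiment costs a few dollars rather than a capital purchase.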
