Bronto Hosts The Iron Yard

Last week, I was privileged to speak to the students of The Iron Yard Academy about my experience in front-end development. Bronto has a long-standing partnership with the expanding code school in the American Underground campus, and each time we host student tours of our space or sit in the audience for their Demo Day event, it’s exciting to see their drive and innovative spirit firsthand.

The demand for the intensive program that The Iron Yard offers is easy to understand. With the explosion of new trends in the world of programming, along with a job market that favors job seekers with in-demand skills, it’s important to have a curriculum and project work that reflect the current environment.

Things have changed dramatically since I started back in the dot-com years, especially in front-end development. I talked about learning JavaScript at a time when it was considered quite the opposite of the popular language it is today. The resources and tooling were sparse, and building interactive features was challenging. Still, because JavaScript is a standardized language, many of the core interfaces, such as MouseEvent, have retained the same basic support over the years. Below is an image drag-and-drop sample pulled from the archives of dynamicdrive.com. With just a few minor updates, I was able to run it in this pen. You can also check out the original version from way back in ’99.

[Animated GIF: the drag-and-drop sample running in CodePen]

The biggest difference in front-end development today is choice. Sure, you can still build powerful features with bare-metal JavaScript, HTML and CSS, but there are so many great solutions out there to aid your workflow. It’s hard to resist adding them to your toolkit, despite the tradeoff of managing the additional overhead.

At Bronto, we use a lot of Backbone and jQuery. Both are sufficient for implementing sophisticated features, but we also leverage other technologies in our front-end stack:

[Screenshot: additional technologies in Bronto’s front-end stack]

The real fun began after my presentation. The students had such a broad range of interesting questions regarding business strategy, company philosophy, architectural concerns and future trends. I was blown away by their intelligence and enthusiasm, and I’m even more excited to see their ideas at the next Demo Day!

Improving Automation in Systems Engineering

When I joined Bronto in 2013, I felt we had a reasonably modern procedure for provisioning new systems:

  1. Work with the requesting team to determine the resources needed.
  2. Define the system resources in Foreman and slather on some puppet classes.
  3. Push the shiny ‘Build’ button and wait.
  4. Tackle all of the fiddly little bits that Foreman wasn’t handling at the time.
  5. Complete peer review and system turnover.

This request might take a business day or two to process, longer if something languished in peer review, exposed some technical debt, or just led to a yak shave. It was a ‘good enough’ solution in an environment where these sorts of requests were infrequent, and we were well aware of the rough edges that would need to be filed off this process when the time was right.

With our developers pushing to transition to a more service-oriented architecture and break down the remaining pieces of the old, monolithic code base, we knew it was time to streamline this process before it became a pain point. Requests for new systems were going to be more frequent and more urgent, and we needed to get ahead of the problem by devoting the time to make things better.

After dredging up the relevant improvement requests from our backlog, we tackled the task of filing off those rough edges by:

Writing custom Foreman hooks: This helped with the worst of the manual tasks and freed us from the pain of having to manually update Nagios, LDAP, and any number of additional integration points within the infrastructure.

Automating the peer review process: Borrowing from the idea of test-driven development, we’ve finally reached the point where we have test-driven infrastructure. By writing a set of system tests and launching them from another Foreman hook, we were able to replace manual peer review with automation. Results are then announced in a chat room for cross-team visibility (a rough sketch of such a hook appears after this list).

Writing an ad hoc API for RackTables: RackTables was a great early solution, but we’re approaching the point where it’s no longer a good fit. Although we’re not quite ready for a new datacenter asset management solution, being able to programmatically twiddle the information in RackTables was a win.

Creating Architect to batch-provision virtual machines: Since VMs tend to be requested in groups to create a resilient service, we wrote a tool to automate away the repetitive tasks. Architect gathers the system requirements, creates a configuration file, and then selects suitable hypervisors and builds out each of the systems. This has been a huge win when it comes to fulfilling requests for 10 or more systems at once.
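
To make the hook-driven pieces above more concrete, below is a minimal sketch, not our production code, of a Foreman hook that runs system tests against a freshly built host and announces the result in chat. It assumes the foreman_hooks plugin, which runs executables dropped into a per-object, per-event directory, passing the event name and object label as arguments and the object itself as JSON on stdin; the hook path, test runner, and webhook URL are all placeholders.

#!/usr/bin/env python3
# Hypothetical Foreman hook, e.g. saved as
# /usr/share/foreman/config/hooks/host/managed/before_provision/50_system_tests
# The test runner and chat webhook below are placeholders, not Bronto's actual tooling.

import json
import subprocess
import sys
import urllib.request

CHAT_WEBHOOK = 'https://chat.example.com/hooks/systems'  # placeholder endpoint

def main():
    # foreman_hooks passes the event name and object label as arguments
    # and the object's JSON representation on stdin.
    event, hostname = sys.argv[1], sys.argv[2]
    payload = json.load(sys.stdin)
    host = payload.get('host', payload)  # payload may or may not be wrapped in a 'host' key
    ip = host.get('ip', 'unknown')

    # Run the post-build system tests against the new host.
    rc = subprocess.call(['/usr/local/bin/system-tests', hostname])  # placeholder test suite
    status = 'passed' if rc == 0 else 'FAILED'

    # Announce the outcome in a chat room for cross-team visibility.
    text = '{0} ({1}) {2}: system tests {3}'.format(hostname, ip, event, status)
    body = json.dumps({'text': text}).encode('utf-8')
    request = urllib.request.Request(CHAT_WEBHOOK, body, {'Content-Type': 'application/json'})
    urllib.request.urlopen(request)

    sys.exit(rc)

if __name__ == '__main__':
    main()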

While there are additional automation improvements we’d like to make, these efforts have allowed us to more rapidly respond to the needs of our development teams and generate systems in minutes instead of days. System provisioning is a core function of our team, and we are always looking for ways to improve our abilities in that area.

February TriHUG: Hive on Spark

Bronto hosted February’s Triangle Hadoop User Group (TriHUG), featuring Szehon Ho from Cloudera talking about Hive on Spark.

Hive was the first tool to enable SQL on Hadoop and, up until this point, has primarily used MapReduce as its execution engine. Szehon explained the motivation behind the Hive on Spark effort and the benefits the contributing team is seeing from Spark.

While there are a variety of newer SQL-on-Hadoop engines (Impala, SparkSQL, Presto, Drill) that offer improved performance, many organizations have large investments in Hive. Hive on Spark is an effort to modernize the execution engine underneath Hive while retaining full HiveQL and metastore compatibility. The goal is to improve Hive’s execution speed while preserving a smooth upgrade path for existing Hive users.

Hive typically executes its queries on top of Hadoop’s MapReduce framework, but a single SQL statement often translates into multiple Map and Reduce stages. At the end of each stage, the reducers “spill” their output to disk (HDFS) so that later map stages can reload it, which adds significant latency. With Spark, by contrast, the in-memory DAG execution model allows multiple transformations on the data without spilling to disk between stages.

Another benefit is that the Hive query planner now has a more expressive execution engine on which it can run queries. At the core of Spark is the Resilient Distributed Dataset (RDD). RDDs support a much broader set of transformations than just Map and Reduce.
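
As a rough illustration, not taken from the talk, here is what a chained set of RDD transformations looks like in PySpark. Spark records the whole chain lazily as a DAG and only executes it when an action runs, rather than writing intermediate results out to HDFS between stages the way a chain of MapReduce jobs would; the input path is a placeholder.

from pyspark import SparkContext

sc = SparkContext(appName='rdd-dag-example')

# Each transformation is recorded lazily as a node in the DAG; intermediate
# results are not persisted to HDFS between stages.
counts = (sc.textFile('hdfs:///tmp/events.log')      # placeholder input path
            .flatMap(lambda line: line.split())
            .filter(lambda word: len(word) > 3)
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))

# The action triggers execution of the entire pipeline.
print(counts.take(10))

sc.stop()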

Check out the slides for more details.

Many thanks to Szehon for coming out to Bronto to speak with TriHUG!

An Introduction to React With Nate Hunzaker

Bronto was excited to host the January 2015 Triangle JavaScript meetup, An Introduction to React. Nate Hunzaker of Viget Labs spoke to a full crowd on the merits of the virtual DOM, immutability, and the hype around Facebook’s new application architecture, Flux.

Nate knocked it out of the park. He kept the crowd engaged with an appropriate mix of code, real-world examples, and a bit of fun. The follow-up Q&A was as informative as the main presentation. Find the slide deck here.


The Lazy Engineer’s Guide to Using Mock

In my previous post, I laid out four rules to yield superior RPMs and reduce headaches in their deployment. In this post, I’ll go into more detail around the infrastructure we’ve put in place here at Bronto to make following the rules easy, especially rule three: building safely using Mock.

Larry Wall advocates that laziness is a primary virtue of good technologists. As a result, we want to do as little work as possible, taking advantage of the work of others whenever we can. At Bronto, we rely on the CentOS Project for the operating system on our production servers and on Puppet Labs for the software that manages those servers. For building packages, the path of least resistance would be a puppet module that manages building RPMs using Mock on RHEL-like machines, with bonus points if we don’t have to write the module ourselves. Fortunately for us, a puppet module already exists that does pretty much what we want. It provides a huge head start to building RPMs, and we use it as a base for our build servers with only minimal modifications.

Out of the box, Jeff McCune’s mockbuild puppet module sets up the mockbuild user and group, installs the needed packages, and sets up the usual RPM-building directories (BUILD, RPMS, SRPMS, SOURCES, and SPECS) in mockbuild’s home directory of /var/lib/mockbuild. With normal usage, you’d copy your spec file into SPECS, your code into SOURCES, and then run mock to actually build the RPMs, doing this all by hand.

Obviously, we want to put some automation around this process to increase efficiency and decrease error rates. We also want to make sure that our spec files and local patches are checked into version control and that we have a known place for downloaded source code. Since we have a well-defined code review and deployment process for our puppet code, we decided to check our code in there and deploy to the build servers using puppet. We also have a puppet file server set up to hold the original sources since we don’t want to check tarballs into our version control system. We accomplished this by adding the following puppetry:

  file { "${home}/sources":
    ensure  => directory,
    owner   => 'mockbuild',
    group   => 'mockbuild',
    mode    => '0644',
    recurse => true,
    purge   => true,
    force   => true,
    source  => 'puppet:///software/mockbuild/sources/',
  }

  file { "${home}/files":
    ensure  => directory,
    recurse => true,
    purge   => true,
    force   => true,
    source  => "puppet:///modules/mockbuild/${::lsbmajdistrelease}",
  }

This takes care of getting the files onto the build servers, but we still need to copy the spec files and the sources into the right places to actually build the RPMs. We do this via a small Makefile in /var/lib/mockbuild:

##
## Automation for building RPMs.
##
## Synopsis:
##
##  sudo -Hu mockbuild make <project_name>
##  sudo -Hu mockbuild make clean
##
## To create i386 RPMs on x86_64 hosts, you must specify an i386 build environment
##
##  sudo -Hu mockbuild make <project_name> ENV=epel-5-i386
##
## Author:
##
##  Kevin Kreamer <kevin.kreamer@bronto.com>
##

MOCKBUILD_HOME=/var/lib/mockbuild
ENV=default

RPMS=$(shell find $(MOCKBUILD_HOME)/files -mindepth 1 -maxdepth 1 -type d -exec basename {} \;)
.PHONY: clean $(RPMS)

$(RPMS):
        /usr/bin/find $(MOCKBUILD_HOME)/sources/$@ $(MOCKBUILD_HOME)/files/$@ -maxdepth 1 -type f -not -name \*.spec -exec cp {} $(MOCKBUILD_HOME)/rpmbuild/SOURCES \;
        cp $(MOCKBUILD_HOME)/files/$@/$@.spec $(MOCKBUILD_HOME)/rpmbuild/SPECS
        sudo /usr/sbin/mock -r $(ENV) --resultdir $(MOCKBUILD_HOME)/rpmbuild/SRPMS/ --buildsrpm --spec $(MOCKBUILD_HOME)/rpmbuild/SPECS/$@.spec --sources $(MOCKBUILD_HOME)/rpmbuild/SOURCES/
        sudo /usr/sbin/mock -r $(ENV) --resultdir $(MOCKBUILD_HOME)/rpmbuild/RPMS --rebuild $(MOCKBUILD_HOME)/rpmbuild/SRPMS/$@*.src.rpm

clean:
        /bin/rm -Rf $(MOCKBUILD_HOME)/rpmbuild/SOURCES/* $(MOCKBUILD_HOME)/rpmbuild/SPECS/* $(MOCKBUILD_HOME)/rpmbuild/SRPMS/* $(MOCKBUILD_HOME)/rpmbuild/BUILD/*
        /bin/rm -f $(MOCKBUILD_HOME)/rpmbuild/RPMS/*.src.rpm $(MOCKBUILD_HOME)/rpmbuild/RPMS/*.log

with its associated puppetry:

  file { "${home}/Makefile":
    ensure => 'present',
    owner  => 'mockbuild',
    group  => 'mockbuild',
    mode   => '0644',
    source => 'puppet:///modules/mockbuild/Makefile',
  }

The Makefile assumes that each spec file lives in a subdirectory named after its project, alongside any associated sources. It copies them into SPECS and SOURCES as appropriate and then kicks off mock to build the RPMs. make clean is provided to clean out the rpmbuild directory for building other RPMs. Because the target list is generated from the subdirectories under files, additional projects with their own spec files are handled automatically.

With this infrastructure in place, building an RPM with all of Mock’s goodness is just a simple make command away.

TattleTail, the Event Sourcing Service

TattleTail is a service we built to record the events flowing through Bronto’s platform, so that any team can analyze our event flow without impacting customer-facing systems.

Events at Bronto

At Bronto, our services use an event-driven architecture to respond to application state changes. For example, our workflow system subscribes to all “Contact Added” events that are produced from the contact service. Our customers can then create “Welcome Series” workflows that send each new contact a series of time-delayed messages.
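
Spew’s client API isn’t shown here, but the shape of the interaction looks roughly like the following toy sketch: a downstream service subscribes to a channel and reacts to each event without knowing anything about the producer. Everything in the sketch, including the broker class and the channel name, is purely illustrative.

from collections import defaultdict

# A toy, in-process stand-in for a message broker. Spew itself is a distributed
# system, so treat this only as an illustration of the subscribe-and-react shape.
class ToyBroker:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, handler):
        self._subscribers[channel].append(handler)

    def publish(self, channel, event):
        for handler in self._subscribers[channel]:
            handler(event)

broker = ToyBroker()

# The workflow service reacts to "Contact Added" events without knowing
# anything about the contact service that produced them.
def start_welcome_series(event):
    print('scheduling welcome series for', event['email'])

broker.subscribe('contact.added', start_welcome_series)

# The contact service publishes an event whenever a new contact is created.
broker.publish('contact.added', {'email': 'new.subscriber@example.com'})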

In reality, we have dozens of event types that a growing number of services generate and subscribe to. Organizing the system around events encourages decoupled services and allows us to quickly add new services as required. This event stream is managed by our internal message broker called Spew, which is a subject for another post.

More Events, More Problems

Event-based architectures, and more generally event-based services, don’t come without a cost. The most obvious risk is the loss of events. This can happen because of a message queue failure or a bug in application code, and it is guaranteed to happen at some point.

Tracing the path from an event source to downstream state changes is difficult. Events can also be received out of order, which can lead to surprising issues with application state.

While adding new services is straightforward (just subscribe to the events you want), we still have to contend with the “cold start” problem when we launch a service. When a service comes online, it generally isn’t useful if it has no data, and no one wants to wait months to accumulate events before they deploy.

The Solution

One of the simplest ways to deal with the challenges introduced by events is to keep a copy of everything that happens in the system. This isn’t a new idea. Martin Fowler called it Event Sourcing, and the Lambda Architecture uses an immutable record of all events as the primary data store.

Having a record of everything in your application can be great for debugging since events can be replayed through a development environment to identify a bug. It can also make fixing bugs easier by allowing state to be recomputed and repaired after the fact. (This really only works if your application uses immutable data; trying to repair a global counter would not be enjoyable.)

Additionally, an event record helps with the cold start issue. A service can just process old events in batch to jump-start its data store. Existing services can use the record to add features, such as indexing new fields or tracking metrics on different dimensions.

Our first implementation of event sourcing relied on Apache HBase as a durable queue. Events we wanted to record were written to a table, and a nightly MapReduce job wrote the data to the Hadoop Distributed File System (HDFS) as SequenceFiles. This solution was adequate, but the output format wasn’t easy to consume, so using it to repair data wasn’t feasible.

Enter TattleTail

Finally, we arrived at a solution we named TattleTail, which combines our message broker, Spew, and Apache Flume. TattleTail is really quite simple. A custom Flume source (producer) subscribes to Spew events, and an HDFS sink (consumer) efficiently writes batches of events to HDFS.

The most complicated part of logging the events involves partitioning them in HDFS for efficient retrieval. We don’t want to scan the entire repository when we need the last 30 days’ worth of data. Every event in Spew includes a header that describes the message channel it came from, and Flume’s HDFS sink supports using this event metadata to write to different directories. All the Flume source does is extract the headers from the Spew event and copy them to Flume’s event format; the HDFS sink handles the rest.
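
The exact directory layout isn’t spelled out above, but assuming a hypothetical channel-plus-date layout such as /tattletail/<channel>/yyyy/mm/dd, finding the last 30 days of one channel becomes a matter of enumerating partition directories rather than scanning the whole repository:

from datetime import date, timedelta

def partition_paths(channel, days, root='/tattletail'):
    """Build the HDFS directories for the last `days` days of one channel.

    Assumes a hypothetical <root>/<channel>/yyyy/mm/dd layout produced by the
    HDFS sink's header and timestamp escape sequences.
    """
    today = date.today()
    return ['{0}/{1}/{2:%Y/%m/%d}'.format(root, channel, today - timedelta(days=offset))
            for offset in range(days)]

# These paths can then be handed to a MapReduce or Spark job as its input,
# instead of pointing the job at everything TattleTail has ever recorded.
for path in partition_paths('contact.added', 30):
    print(path)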

TattleTail makes several improvements over our previous HBase-backed system. First, it receives every event, rather than a hard-to-expand subset. This gives us a complete record of events, which should make it usable by a wider variety of services.

More importantly, TattleTail uses the same serialization as all the messages flowing through Spew. (Our previous solution used a custom versioned TSV format.) This means that application code that already receives Spew events should be able to consume an HDFS-backed stream with little difficulty.

Looking Forward

TattleTail offers several benefits to Bronto. Teams will be able to access production data for debugging, data repair, and ad hoc analysis. From a business perspective, we can search for patterns in customer behavior and find opportunities for new features.

There is still work to be done to realize these benefits. Right now, TattleTail’s data is only accessible via MapReduce jobs. We hope to add Pig support soon, and we are exploring the use of Spark for data analysis. Another area for expansion is the use of TattleTail to keep a copy of our events offsite as part of a disaster recovery mechanism with shorter mean time to recovery.

Meetups @ Bronto

What is a meetup, you ask? The general idea is one of community. It facilitates a way for people with common interests to come together, share ideas, network, and have fun doing it. These principles represent core values here at Bronto. With continued growth in our space at American Tobacco Campus, we have welcomed the greater community to share in it.

This post highlights many of the groups working to make a difference in the Triangle through education.
