Speeding up with Voldemort and Protobuf

In order to supply some of the analytics behind kaChing we have a some nice number crunching processes working over piles of financial data. The end results are displayed on the site (e.g. the find investors page).

The post is about one of such services which for the sake of this post we’ll refer to as the SuperCruncher. In the first iteration we grabbed the data straight from the DB into the SuperCruncher. Needless to say, relational databases are not handling stress nicely and a pattern of fast iteration over all of the DB kills it very fast. Results of first iteration: 30 hours of computing.

Since we couldn’t manage to squeeze more hours into the day we run a second iteration.

Voldemort & Protobuf for the rescue!

Though we have piles of data to compute, some of the data does not change though we do need to read it each time. We shoved the data into a Protobuf data structure which made the binary size considerably smaller, and pushed the protobuf into Voldemort. We used the Voldamort ProtoBufSerializer which provides an extremely simple way to store and use protobufs.

So essentially Voldemort in this case is used as a persistent cache to store normalized data. When reading the date the SuperCruncher first check it it exist in the naturally sharded Voldemort cluster and gets the delta from the DB.

Using Voldemort and Protobuf the I/O problem vanished, the SuperCruncher became four times faster (!) and bounced the performance ball back to the CPU. Cutting down CPU time is usually easier then cutting on I/O and indeed we managed to make the running time considerably faster in later iterations (hint: Joda Time & GC).